2013-03-27
1

Grouping the categorical values of one column by another column

I'm an R beginner, so please forgive my ignorance. My data looks like this:

                                                                         JOB_ROLE EXP_IT_NETW
1 Software engineering-related (developer, tester, project manager, architecture)        5<10
3                                                                       See below        None
4                                                                         Student          <1
5 Software engineering-related (developer, tester, project manager, architecture)         1<5
6                                                                         Blogger         10+

For each value in column 1, I want to count the occurrences of each value in column 2, so that the grouped counts look like this:

JOB_ROLE           None  <1  1<5  5<10  10+
Software engineer     3   5   10    15    3
Student              10   7    5     1    0
... 

Any ideas on how to do this? The dput output of my data is below. Thanks in advance!

structure(list(JOB_ROLE = c("Software engineering-related (developer, tester, project manager, architecture)", 
"See below", "Student", "Software engineering-related (developer, tester, project manager, architecture)", 
"Blogger", "Systems Support", "Student", "IT/Network Administrator", 
"Software engineering-related (developer, tester, project manager, architecture)", 
"Student", "Student", "Software engineering-related (developer, tester, project manager, architecture)", 
"IT hobbyist", "Student", "Software engineering-related (developer, tester, project manager, architecture)", 
"Software engineering-related (developer, tester, project manager, architecture)", 
"IT Manager", "Software engineering-related (developer, tester, project manager, architecture)", 
"Software engineering-related (developer, tester, project manager, architecture)", 
"Software engineering-related (developer, tester, project manager, architecture)", 
"Software engineering-related (developer, tester, project manager, architecture)", 
"IT/Network Administrator", "IT/Network Administrator", "Software engineering-related (developer, tester, project manager, architecture)", 
"Software engineering-related (developer, tester, project manager, architecture)", 
"Student", "Software engineering-related (developer, tester, project manager, architecture)", 
"Researcher in CompSci or related field", "Researcher in CompSci or related field", 
"IT/Network Administrator", "Student", "Software engineering-related (developer, tester, project manager, architecture)", 
"Software engineering-related (developer, tester, project manager, architecture)", 
"Software engineering-related (developer, tester, project manager, architecture)", 
"Software engineering-related (developer, tester, project manager, architecture)", 
"Education", "Software engineering-related (developer, tester, project manager, architecture)", 
"Software engineering-related (developer, tester, project manager, architecture)", 
"IT/Network Administrator", "Software engineering-related (developer, tester, project manager, architecture)", 
"IT/Network Administrator", "Student", "IT/Network Administrator", 
"Software engineering-related (developer, tester, project manager, architecture)", 
"Student", "IT/Network Administrator", "just a layperson who has used computers for over 30 years", 
"IT/Network Administrator", "Unemployed", "Student", "IT/Network Administrator" 
), EXP_IT_NETW = c("5<10", "None", "<1", "1<5", "10+", "None", 
"1<5", "10+", "<1", "None", "1<5", "1<5", "None", "None", "10+", 
"None", "1<5", "10+", "None", "1<5", "None", "1<5", "10+", "1<5", 
"1<5", "1<5", "None", "None", "1<5", "5<10", "None", "5<10", 
"<1", "None", "1<5", "None", "1<5", "1<5", "10+", "1<5", "10+", 
"None", "1<5", "5<10", "None", "1<5", "None", "1<5", "None", 
"None", "10+")), .Names = c("JOB_ROLE", "EXP_IT_NETW"), class = "data.frame", row.names = c(1L, 
3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 16L, 17L, 18L, 
19L, 20L, 21L, 22L, 23L, 25L, 26L, 27L, 28L, 29L, 30L, 32L, 33L, 
34L, 35L, 36L, 37L, 39L, 40L, 41L, 42L, 43L, 44L, 47L, 48L, 49L, 
50L, 51L, 52L, 53L, 55L, 56L, 57L, 59L, 61L, 62L)) 

Answers

5

Use table (here d is your data frame):

> table(d)
                                                                                  EXP_IT_NETW
JOB_ROLE                                                                          <1 1<5 10+ 5<10 None
  Blogger                                                                          0   0   1    0    0
  Education                                                                        0   0   0    0    1
  IT hobbyist                                                                      0   0   0    0    1
  IT Manager                                                                       0   1   0    0    0
  IT/Network Administrator                                                         0   4   5    1    0
  just a layperson who has used computers for over 30 years                        0   0   0    0    1
  Researcher in CompSci or related field                                           0   1   0    0    1
  See below                                                                        0   0   0    0    1
  Software engineering-related (developer, tester, project manager, architecture)  2   9   2    3    5
  Student                                                                          1   3   0    0    6
  Systems Support                                                                  0   0   0    0    1
  Unemployed                                                                       0   0   0    0    1
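
By default, table orders the columns by the sorted levels of the variable, which is why 10+ appears before 5<10 above. A minimal sketch of one way to get the column order asked for in the question, assuming a small made-up data frame (d_demo and exp_levels are hypothetical names, not part of the original data): convert the column to a factor with explicit levels before tabulating.

```r
# Toy data for illustration only; d_demo is a hypothetical stand-in for d.
d_demo <- data.frame(
  JOB_ROLE    = c("Student", "Student", "Blogger", "Student"),
  EXP_IT_NETW = c("None", "<1", "10+", "None"),
  stringsAsFactors = FALSE
)

# Fix the level order explicitly so the columns are not sorted alphabetically.
exp_levels <- c("None", "<1", "1<5", "5<10", "10+")
d_demo$EXP_IT_NETW <- factor(d_demo$EXP_IT_NETW, levels = exp_levels)

# Levels with no observations still get a column, filled with zeros.
counts <- table(d_demo)
counts
```

Because the levels are declared up front, categories that never occur in the data (1<5 and 5<10 in this toy example) still show up as all-zero columns.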
+0

The answer was right in front of my face all along, and it's so simple. Thank you. – user2145843

+2

@Arun, use this: 'as.data.frame(unclass(table(d)))' –
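
To illustrate why the unclass step in the comment above matters, here is a small sketch on made-up data (d_demo, tab, long and wide are hypothetical names). Calling as.data.frame directly on a table gives a long data frame with a Freq column, whereas unclassing first leaves a plain matrix, which converts to the wide layout the question asks for.

```r
# Toy data for illustration only; not the asker's data.
d_demo <- data.frame(
  JOB_ROLE    = c("Student", "Student", "Blogger"),
  EXP_IT_NETW = c("None", "<1", "10+"),
  stringsAsFactors = FALSE
)

tab <- table(d_demo)

# Long format: one row per JOB_ROLE / EXP_IT_NETW combination plus Freq.
long <- as.data.frame(tab)

# Wide format: one row per JOB_ROLE, one column per EXP_IT_NETW value.
wide <- as.data.frame(unclass(tab))

long
wide
```

In the wide result the job roles end up as row names rather than as a column.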

3

You can also use data.table, though getting the same format you expect takes a slightly different approach.

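library(data.table)
dt <- data.table(df)      # here, I assume df is your data.frame

setkey(dt, "JOB_ROLE")    # set the key for fast access/grouping

# For each JOB_ROLE, tabulate EXP_IT_NETW against the full set of values
# so that every group returns the same columns (zeros included).
dt[, {tt <- table(factor(EXP_IT_NETW,
                         levels = unique(dt$EXP_IT_NETW)));
      setattr(as.list(tt), 'names', names(tt))
     }, by = key(dt)]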

This gives me:

#                                   JOB_ROLE None 10+ 1<5 5<10 <1
#  1:                   >30_years_experience    1   0   0    0  0
#  2:                                Blogger    0   1   0    0  0
#  3:                              Education    1   0   0    0  0
#  4:                             IT Manager    0   0   1    0  0
#  5:                            IT hobbyist    1   0   0    0  0
#  6:               IT/Network Administrator    0   5   4    1  0
#  7: Researcher in CompSci or related field    1   0   1    0  0
#  8:                              See below    1   0   0    0  0
#  9:                   Software_enginnering    5   2   9    3  2
# 10:                                Student    6   0   3    0  1
# 11:                        Systems Support    1   0   0    0  0
# 12:                             Unemployed    1   0   0    0  0
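
As a possible alternative that is not part of the answer above, reasonably recent versions of data.table provide a dcast method for this kind of reshaping. A minimal sketch on made-up data (dt_demo and wide are hypothetical names), assuming data.table is installed:

```r
library(data.table)

# Toy data for illustration only; not the asker's data.
dt_demo <- data.table(
  JOB_ROLE    = c("Student", "Student", "Blogger", "Student"),
  EXP_IT_NETW = c("None", "<1", "10+", "None")
)

# One row per JOB_ROLE, one column per EXP_IT_NETW value; each cell counts
# the matching rows, and combinations that never occur come out as 0.
wide <- dcast(dt_demo, JOB_ROLE ~ EXP_IT_NETW,
              fun.aggregate = length, value.var = "EXP_IT_NETW")
wide
```

Unlike the table-based answers, the result here is a data.table with JOB_ROLE as a regular column, which may be more convenient for further joins or filtering.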