Should the term names look like this? This is not really specific to tm.
Here is the code:
text<- c("Since I love to travel, this is what I rely on every time.",
"I got the rewards card for the no international transaction fee",
"I got the rewards card mainly for the flight perks",
"Very good card, easy application process, and no international transaction fee",
"The customer service is outstanding!",
"My wife got the rewards card for the gift cards and international transaction fee.She loves it")
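library(tm)
# Recent tm versions require doc_id and text columns for DataframeSource,
# so the corpus is built from the character vector directly. VCorpus is used
# because a SimpleCorpus (what Corpus() returns here) ignores custom tokenizers.
corpus<- VCorpus(VectorSource(text))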
corpus<- tm_map(corpus, content_transformer(tolower))
corpus<- tm_map(corpus, removePunctuation)
corpus<- tm_map(corpus, removeWords, stopwords("english"))
corpus<- tm_map(corpus, stripWhitespace)
BigramTokenizer<- function(x)
  unlist(lapply(ngrams(words(x),2),paste,collapse=" "),use.names=FALSE)
dtm<- DocumentTermMatrix(corpus, control= list(tokenize= BigramTokenizer))
sparse<- removeSparseTerms(dtm,.80)
dtm2<- as.matrix(sparse)
dtm2
That is roughly what the output looks like, I guess. Anyway, you can either set
collapse="_"
in your code, or fix the column names after the fact like this:
colnames(dtm2) <- gsub(" ", "_", colnames(dtm2), fixed = TRUE)
dtm2
    Terms
Docs got_rewards international_transaction rewards_card transaction_fee
   1           0                         0            0               0
   2           1                         1            1               1
   3           1                         0            1               0
   4           0                         1            0               1
   5           0                         0            0               0
   6           1                         1            1               0
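
For the first option, here is a minimal sketch of what collapse="_" does inside the tokenizer. It assumes the NLP package (which tm attaches) and uses a made-up word vector w rather than a tm document, so it only illustrates the idea and is not the exact tokenizer from the question:

```r
library(NLP)  # provides ngrams(); attached along with tm

# Hypothetical input: the words of one cleaned-up document
w <- c("got", "rewards", "card")

# Build all 2-grams, then join each pair with "_" instead of " "
bigrams <- vapply(ngrams(w, 2L), paste, "", collapse = "_")
bigrams
```

With the separator set this way, the terms already carry underscores when the document-term matrix is built, so the column names need no fixing afterwards.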
Change collapse=" " to collapse="_"? – lukeA
That's it.. thanks! – djacobs1216