I have a list of several texts produced by ngram, and I want to add them to the original data.table as columns. How can I get each ngram text to appear as a separate column in R?
> prep_test
prep_test
1: Women Athletic,Athletic Apparel,Apparel Pants,Pants Tights,Tights Leggings
2: Beauty Makeup,Makeup Face
3: Beauty Makeup,Makeup Face
4: Electronics Cell,Cell Phones,Phones Accessories,Accessories Cases,Cases Covers,Covers Skins
5: Women Shoes,Shoes Boots
6: Men Men,Men s,s Accessories,Accessories Belts
7: Electronics Cell,Cell Phones,Phones Accessories,Accessories Cell,Cell Phones,Phones Smartphones
8: Women Tops,Tops Blouses,Blouses Other
9: Women Athletic,Athletic Apparel,Apparel Pants,Pants Tights,Tights Leggings
10: Home Home,Home DÃ,DÃ cor,cor Home,Home Fragrance
> str(prep_test)
Classes ‘data.table’ and 'data.frame': 10 obs. of 1 variable:
$ prep_test:List of 10
..$ : chr "Women Athletic" "Athletic Apparel" "Apparel Pants" "Pants Tights" ...
..$ : chr "Beauty Makeup" "Makeup Face"
..$ : chr "Beauty Makeup" "Makeup Face"
..$ : chr "Electronics Cell" "Cell Phones" "Phones Accessories" "Accessories Cases" ...
..$ : chr "Women Shoes" "Shoes Boots"
..$ : chr "Men Men" "Men s" "s Accessories" "Accessories Belts"
..$ : chr "Electronics Cell" "Cell Phones" "Phones Accessories" "Accessories Cell" ...
..$ : chr "Women Tops" "Tops Blouses" "Blouses Other"
..$ : chr "Women Athletic" "Athletic Apparel" "Apparel Pants" "Pants Tights" ...
..$ : chr "Home Home" "Home DÃ" "DÃ cor" "cor Home" ...
- attr(*, ".internal.selfref")=<externalptr>
Here is my current code:
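library(data.table)
library(ngram)

# Build the bigrams for one category string
bigram_fun <- function(y) {
  # Collapse runs of punctuation and blanks into a single space
  y <- gsub("[[:punct:][:blank:]]+", " ", y)
  y <- ngram_asweka(y, min = 2, max = 2)
  #y <- str_split_fixed(y, ",", n=Inf)
  #y <- unlist(y)
  return(y)
}

prep_test <- all[1:10, 9]                      # first 10 rows, column 9 of `all`
prep_test <- apply(prep_test, 1, bigram_fun)   # one character vector of bigrams per row
prep_test <- data.table(prep_test)             # ends up as a single list column
prep_test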
dput output:
> dput(prep_test)
list(c("Women Athletic", "Athletic Apparel", "Apparel Pants",
"Pants Tights", "Tights Leggings"), c("Beauty Makeup", "Makeup Face"
), c("Beauty Makeup", "Makeup Face"), c("Electronics Cell", "Cell Phones",
"Phones Accessories", "Accessories Cases", "Cases Covers", "Covers Skins"
), c("Women Shoes", "Shoes Boots"), c("Men Men", "Men s", "s Accessories",
"Accessories Belts"), c("Electronics Cell", "Cell Phones", "Phones Accessories",
"Accessories Cell", "Cell Phones", "Phones Smartphones"), c("Women Tops",
"Tops Blouses", "Blouses Other"), c("Women Athletic", "Athletic Apparel",
"Apparel Pants", "Pants Tights", "Tights Leggings"), c("Home Home",
"Home DÃ", "DÃ cor", "cor Home", "Home Fragrance"))
Desired result: the n-grams spread across columns, one bigram per column
Bigram 1            Bigram 2            Bigram 3              Bigram 4             ...
"Women Athletic"    "Athletic Apparel"  "Apparel Pants"       "Pants Tights"       ...
"Beauty Makeup"     "Makeup Face"       NA                    NA                   ...
"Beauty Makeup"     "Makeup Face"       NA                    NA                   ...
"Electronics Cell"  "Cell Phones"       "Phones Accessories"  "Accessories Cases"
"Women Shoes"       "Shoes Boots"       NA                    NA
How can I get this to work here?
Please upload a `dput` of the data so that the code is reproducible. – Chris
In your question, `prep_test` is a data.table object, but the `dput` output contains a list, not a data.table. Am I missing something? – jazzurro
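One possible way to get from the list shown in the `dput` output to the desired wide layout is to pad every bigram vector with NA up to the longest length and then bind the vectors row-wise. The following is only a minimal sketch in base R, not a confirmed solution for the full data: the names `bigrams`, `n_max`, `padded`, and `wide` are made up for illustration, and only three of the ten rows are reproduced to keep it short.

```r
# Assumption: `bigrams` stands in for the list of bigram vectors from the
# dput output above (three of the ten rows are copied here).
bigrams <- list(
  c("Women Athletic", "Athletic Apparel", "Apparel Pants",
    "Pants Tights", "Tights Leggings"),
  c("Beauty Makeup", "Makeup Face"),
  c("Women Shoes", "Shoes Boots")
)

# Length of the longest bigram vector decides the number of columns
n_max <- max(lengths(bigrams))

# Assigning a larger length to a vector pads it with NA
padded <- lapply(bigrams, function(x) {
  length(x) <- n_max
  x
})

# Stack the padded vectors as rows, then name the columns
wide <- as.data.frame(do.call(rbind, padded), stringsAsFactors = FALSE)
names(wide) <- paste("Bigram", seq_len(n_max))

# Optional: convert to a data.table by reference if the package is installed
if (requireNamespace("data.table", quietly = TRUE)) {
  data.table::setDT(wide)
}

print(wide)
```

If the bigrams are already stored in a list column of a data.table, the same padding step could be applied to that column, and the resulting columns could then be attached to the original table with `cbind`.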