2017-10-27

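Delete rows when a column does not contain a keyword

I am extracting Facebook posts with Rfacebook based on a time criterion (code: see below), and I want to delete every result (i.e. every row of the data frame) whose "message" column does not contain one of my keywords.

My only solution so far, grep, leaves me with just the contents of that column instead of the whole row. Can someone help me?

Code: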

# RETRIEVING DATA 
BBCpage <- getPage(page="bbcnews", token=fb_oauth, n=20, since="2017-05-03", feed=FALSE, reactions=TRUE, verbose=TRUE) 

BBCpage$message 
#Now I only want to keep the rows where the field "message" contains one of my keywords "Brexit" or "European Union" 

# possibility 1: not working, since I end up with ONLY the content of 'message', not the entire row
pattern <- "Brexit|European Union"
grep(pattern, BBCpage, ignore.case=TRUE, perl = FALSE, value = TRUE, fixed = FALSE, useBytes = FALSE, invert = FALSE)



# possibility 2: not working, no filter applied
matches <- c("Brexit", "European Union")
BBCfiltered <- BBCpage[!(BBCpage$message %in% matches), ]

Can someone help me work out how to apply the filter?

Many thanks in advance,

Ivo

- Edit: as requested, here is the dput output after running the following code:

BBCpage <- getPage(page="bbcnews", token=fb_oauth, n=20, since="2017-05-03", feed=FALSE, reactions=TRUE, verbose=TRUE) 

> dput(BBCpage) 
structure(list(id = c("228735667216_10155253874762217", "228735667216_10155253984962217", 
"228735667216_10155254016922217", "228735667216_1510422315708643", 
"228735667216_10155254242117217", "228735667216_10155254357457217", 
"228735667216_10155254531807217", "228735667216_10155254645177217", 
"228735667216_10155254739207217", "228735667216_10155254848077217", 
"228735667216_10155255021777217", "228735667216_10155255187982217", 
"228735667216_10155255303912217", "228735667216_10155255312537217", 
"228735667216_10155255092167217", "228735667216_10155256112042217", 
"228735667216_10155256182962217", "228735667216_10155256278057217", 
"228735667216_1993087934041388", "228735667216_10155256481732217" 
), likes_count = c(24996, 1385, 1280, 8870, 2104, 5906, 5813, 
15842, 9313, 3315, 944, 6485, 1638, 1638, 2045, 4356, 2098, 1305, 
237, 741), from_id = c("228735667216", "228735667216", "228735667216", 
"228735667216", "228735667216", "228735667216", "228735667216", 
"228735667216", "228735667216", "228735667216", "228735667216", 
"228735667216", "228735667216", "228735667216", "228735667216", 
"228735667216", "228735667216", "228735667216", "228735667216", 
"228735667216"), from_name = c("BBC News", "BBC News", "BBC News", 
"BBC News", "BBC News", "BBC News", "BBC News", "BBC News", "BBC News", 
"BBC News", "BBC News", "BBC News", "BBC News", "BBC News", "BBC News", 
"BBC News", "BBC News", "BBC News", "BBC News", "BBC News"), 
    message = c("The Catalan parliament votes to declare independence from Spain - as Madrid looks set to impose direct rule.", 
    "As Halloween approaches, we are revisiting a spooky American classic. Goosebumps books were a scary children's book series that have been around for 25 years. We were #LIVE with Tim Jacobus, the artist behind the creepy cover art.", 
    "Do hotel comparison sites really give you the best deal?", 
    "The first official exhibition about the late pop icon Prince has opened in London - with the help of his little sister. <ed><U+00A0><U+00BC><ed><U+00BE><U+00B8><U+2728> #MyNameisPrince\n\n(via BBC Entertainment News)", 
    "British-born novelist Christina Baker Kline says the ex-president \"squeezed my butt\" as she posed for a photo.", 
    "Ecstatic scenes in Barcelona as Catalonia’s parliament votes to declare independence from Spain - but Madrid has approved direct rule over the region.\n\nbbc.in/2zbEyCn", 
    "Her husband dropped her at a doctor's appointment in 1975 - and that was the last he ever heard of her.", 
    "<ed><U+00A0><U+00BC><ed><U+00BE><U+0083><ed><U+00A0><U+00BD><ed><U+00B0><U+00BE> No tricks, just treats for these animals at Halloween. <ed><U+00A0><U+00BD><ed><U+00B0><U+00BE><ed><U+00A0><U+00BC><ed><U+00BE><U+0083>", 
    "“We are pure. We are strong. We are brave. And we will fight.”\n\nRose McGowan's message to women in her first public remarks since accusing Harvey Weinstein of rape.", 
    "Downing Street said the declaration was based on an illegal vote. But The Scottish Government said it respected Catalonia's position.", 
    "Surely this should have been: \"Eleven things you need to know about Stranger Things\"... <ed><U+00A0><U+00BE><ed><U+00B4><U+00A6><U+200D><U+2640><U+FE0F>", 
    "\"You have no weight problems, that's the good news.\"\n\nPresident Donald J. Trump handed out Halloween treats and the odd trick to journalists' children on their trip to the Oval Office.", 
    "The actresses are the latest women to make allegations against film director James Toback.", 
    "\"Why are you asking me what I wore? It should not happen, no means no.\"", 
    "A pair of US speed climbers have cracked an \"unbeatable\" record for scaling one of the world's best known rock faces - El Capitan.", 
    "Cambridge University say the online repository has \"never seen numbers like this before\".", 
    "Spain's Deputy PM Soraya Saenz de Santamaria is put in charge of Catalonia after its government was dismissed.", 
    "Did you get enough sleep last night?", "\"Sometimes, I think coming into the studio with you John is a bit like going into Harvey Weinstein's bedroom.\"\n\nUK environment secretary Michael Gove apologises for what he says was his \"clumsy attempt at humour\" on a special edition of BBC Radio 4's Today programme. bbc.in/2idoZPk\n\n(Via BBC Politics)", 
    "Rescuers save caimans from a sticky situation in Brazil." 
    ), created_time = c("2017-10-27T13:36:37+0000", "2017-10-27T14:32:50+0000", 
    "2017-10-27T14:34:09+0000", "2017-10-27T15:20:00+0000", "2017-10-27T16:13:54+0000", 
    "2017-10-27T17:04:07+0000", "2017-10-27T17:53:05+0000", "2017-10-27T18:44:23+0000", 
    "2017-10-27T19:29:38+0000", "2017-10-27T20:21:24+0000", "2017-10-27T21:09:17+0000", 
    "2017-10-27T22:11:04+0000", "2017-10-27T22:45:09+0000", "2017-10-27T22:50:13+0000", 
    "2017-10-27T23:44:00+0000", "2017-10-28T07:15:39+0000", "2017-10-28T08:17:01+0000", 
    "2017-10-28T09:18:02+0000", "2017-10-28T10:28:12+0000", "2017-10-28T11:14:21+0000" 
    ), type = c("link", "video", "link", "video", "link", "video", 
    "link", "video", "video", "link", "link", "video", "link", 
    "link", "video", "link", "link", "link", "video", "video" 
    ), link = c("http://bbc.in/2zTuomQ", "https://www.facebook.com/bbcnews/videos/10155253984962217/", 
    "http://bbc.in/2y9oCAc", "https://www.facebook.com/bbcnews/videos/1510422315708643/", 
    "http://bbc.in/2ia2Q4M", "https://www.facebook.com/bbcnews/videos/10155254357457217/", 
    "http://bbc.in/2iaQ3if", "https://www.facebook.com/bbcnews/videos/10155254645177217/", 
    "https://www.facebook.com/bbcnews/videos/10155254739207217/", 
    "http://bbc.in/2zW9sLZ", "http://bbc.in/2z9SHQr", "https://www.facebook.com/bbcnews/videos/10155255187982217/", 
    "http://bbc.in/2zcSkVm", "http://bbc.in/2zUQc1E", "https://www.facebook.com/bbcnews/videos/10155255092167217/", 
    "http://bbc.in/2zelIu3", "http://bbc.in/2zfgXQY", "http://bbc.in/2ybP2S4", 
    "https://www.facebook.com/bbcnews/videos/1993087934041388/", 
    "https://www.facebook.com/bbcnews/videos/10155256481732217/" 
    ), story = c(NA, "BBC News was live.", NA, NA, NA, NA, NA, 
    NA, NA, NA, "BBC News shared BBC Entertainment News's post.", 
    NA, NA, NA, NA, NA, NA, NA, NA, NA), comments_count = c(1982, 
    412, 164, 2778, 1069, 963, 246, 727, 707, 896, 97, 3111, 
    198, 167, 232, 100, 385, 158, 147, 18), shares_count = c(10001, 
    198, 235, 2756, 262, 1677, 567, 4358, 1634, 602, 2, 1850, 
    75, 188, 363, 296, 231, 283, 33, 81), love_count = c(2294, 
    203, 23, 2224, 36, 625, NA, 2744, NA, 249, 83, NA, 55, 49, 
    94, NA, NA, NA, 8, NA), haha_count = c(549, 19, 67, 11, 697, 
    148, NA, 605, NA, 224, 26, NA, 24, 9, 4, NA, NA, NA, 73, 
    NA), wow_count = c(6987, 31, 66, 256, 169, 898, NA, 76, NA, 
    136, 7, NA, 101, 30, 249, NA, NA, NA, 13, NA), sad_count = c(392, 
    2, 1, 26, 85, 134, NA, 5, NA, 83, 1, NA, 218, 183, 1, NA, 
    NA, NA, 3, NA), angry_count = c(398, 17, 10, 6, 305, 183, 
    NA, 2, NA, 865, 0, NA, 32, 248, 2, NA, NA, NA, 61, NA)), .Names = c("id", 
"likes_count", "from_id", "from_name", "message", "created_time", 
"type", "link", "story", "comments_count", "shares_count", "love_count", 
"haha_count", "wow_count", "sad_count", "angry_count"), row.names = c(1L, 
2L, 3L, 19L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 12L, 13L, 14L, 11L, 
15L, 16L, 17L, 20L, 18L), class = "data.frame") 
> 

- Edit 2: one of the comments worked (see the answer below). Thanks, r2evans.

`grepl` returns a logical vector, while `grep(..., value = FALSE)` returns the indices of the matching elements. Both are handy for row/column indexing. –
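
To make the difference in that comment concrete, here is a minimal sketch on a toy data frame. The `posts` object and its values are made up for illustration and only stand in for `BBCpage`; the pattern is the one from the question.

```r
# Toy data frame standing in for BBCpage (hypothetical values, not real posts)
posts <- data.frame(
  id = c("a", "b", "c"),
  message = c("Brexit talks resume",
              "Did you get enough sleep last night?",
              "European Union leaders meet"),
  stringsAsFactors = FALSE
)
pattern <- "Brexit|European Union"

# grepl(): one TRUE/FALSE per element of the message column
hits_lgl <- grepl(pattern, posts$message, ignore.case = TRUE)

# grep() with the default value = FALSE: positions of the matching elements
hits_idx <- grep(pattern, posts$message, ignore.case = TRUE)

# Either result can go in the row slot of [ , ] to keep whole rows
by_lgl <- posts[hits_lgl, ]
by_idx <- posts[hits_idx, ]

# To drop matching rows instead, negate the logical vector.
# Negative grep() indices are risky here: with no match at all,
# posts[-integer(0), ] returns zero rows rather than every row.
non_matching <- posts[!hits_lgl, ]
```

With `value = TRUE`, as in the question's first attempt, `grep` returns the matching strings themselves, which is why only text came back instead of rows.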

Can you provide the data so we don't have to fetch it from the page? Copy the output of `dput(BBCpage)` and paste it into the question. – useR

`BBCpage[grepl(pattern, BBCpage$message, ...), ]`? – r2evans

Answer

The suggestion from r2evans seems to work. I modified the code slightly:

  BBC_page_relevant <- BBC_page[grepl(pattern, BBC_page$message, ...),] 

This stores the relevant posts in the data.frame BBC_page_relevant and seems to work.

Thanks for the quick and helpful replies. Best, Ivo
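
For readers who want something they can run without a Facebook token, the following is a rough sketch of the same idea wrapped in a small helper. The function name `filter_by_keywords`, the toy `posts` data frame, and its values are assumptions made for illustration; only `grepl`, the row indexing, and the two keywords come from the thread.

```r
# Hypothetical helper: keep the rows whose text column matches any keyword.
# Keywords are treated as regular expressions, so special characters
# (".", "(", "+", ...) would need escaping before use.
filter_by_keywords <- function(df, keywords, column = "message") {
  pattern <- paste(keywords, collapse = "|")   # "Brexit|European Union"
  hits <- grepl(pattern, df[[column]], ignore.case = TRUE)
  df[hits, , drop = FALSE]                     # logical index in the row slot
}

# Toy stand-in for the getPage() result (made-up values)
posts <- data.frame(
  id = c("p1", "p2", "p3", "p4"),
  message = c("Brexit talks resume in Brussels",
              NA,
              "Rescuers save caimans from a sticky situation in Brazil.",
              "The european union responds"),
  likes_count = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)

relevant <- filter_by_keywords(posts, c("Brexit", "European Union"))
print(relevant)

# For comparison: %in% tests whole-string equality, so no message matches
exact <- posts[posts$message %in% c("Brexit", "European Union"), ]
print(nrow(exact))
```

Two details are worth noting. `grepl` returns `FALSE` for an `NA` message, so posts without text are dropped rather than turning into `NA` rows. And the second attempt in the question could not work as a keyword filter because `%in%` compares entire strings, and the leading `!` then kept every row that was not exactly equal to a keyword, which is all of them.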
