2016-11-14 3 views

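Filtering columns using multiple conditions in Pig

I need to write a Pig script that computes the average of several columns and then keeps only the rows in which every column value is greater than the corresponding average. Here is my script: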

i2 = GROUP i1 all; 
i3 = FOREACH i2 GENERATE
        AVG(i1.user_followers_count) AS avg_user_followers_count,
        AVG(i1.avl_user_follower_following_ratio) AS avg_avl_user_follower_following_ratio,
        AVG(i1.user_total_liked) AS avg_user_total_liked,
        AVG(i1.user_total_posts) AS avg_user_total_posts,
        AVG(i1.user_total_public_lists) AS avg_user_total_public_lists,
        AVG(i1.avl_user_total_retweets) AS avg_avl_user_total_retweets,
        AVG(i1.avl_user_total_likes) AS avl_user_total_likes,
        AVG(i1.avl_user_total_replies) AS avg_avl_user_total_replies,
        AVG(i1.avl_user_engagements) AS avl_avl_user_engagements,
        AVG(i1.user_reply_to_reply_count) AS avg_user_reply_to_reply_count;

top_inf = FILTER i1 BY (i1.user_followers_count > i3.avg_user_followers_count, i1.avl_user_total_retweets > i3. avg_avl_user_total_retweets, i1.avl_user_total_likes > i3.avg_avl_user_total_retweets); 

However, the script fails with this error:

ERROR 1200: <file user.pig, line 70, column 103> mismatched input '>' expecting RIGHT_PAREN 

What is the correct way to filter rows on multiple conditions?

Answer


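FILTER takes a single boolean expression, so a comma-separated list of comparisons does not parse. Combine the conditions with AND: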

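-- Refer to i1's own fields without a prefix; i3 has exactly one row, so i3.<field> is read as a scalar.
top_inf = FILTER i1 BY (user_followers_count > i3.avg_user_followers_count)
        AND (avl_user_total_retweets > i3.avg_avl_user_total_retweets)
        AND (avl_user_total_likes > i3.avl_user_total_likes); -- likes vs. average likes (alias as defined in i3)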
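
For reference, the whole pattern (group everything, compute the averages, then filter against them) can be sketched end to end as below. This is only a minimal illustration under stated assumptions: the file name `users.csv`, its comma-delimited layout, and the short field names are hypothetical and not taken from the question.

```pig
-- Hypothetical input file and schema, shortened to three numeric columns.
users = LOAD 'users.csv' USING PigStorage(',')
        AS (user_id:chararray, followers:double, retweets:double, likes:double);

-- Collapse the relation into a single group so AVG sees every row.
all_users = GROUP users ALL;
avgs = FOREACH all_users GENERATE
        AVG(users.followers) AS avg_followers,
        AVG(users.retweets)  AS avg_retweets,
        AVG(users.likes)     AS avg_likes;

-- avgs holds exactly one row, so avgs.<field> can be used as a scalar.
-- Fields of the relation being filtered are referenced without a prefix.
above_avg = FILTER users BY (followers > avgs.avg_followers)
        AND (retweets > avgs.avg_retweets)
        AND (likes > avgs.avg_likes);

DUMP above_avg;
```

The scalar reference only works because `GROUP ... ALL` yields a single row; if the averages were computed per group instead, a JOIN on the group key would be needed before filtering.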