2017-02-02 1 views
-1

How do I INSERT OVERWRITE with a struct in Hive?

I have a Hive table, tweets, stored as text, that I am trying to write to another table, tweetsORC, which is stored as ORC. Both have the same structure:

col_name data_type comment 
racist     boolean     from deserializer 
contributors   string     from deserializer 
coordinates    string     from deserializer 
created_at    string     from deserializer 
entities    struct<hashtags:array<string>,symbols:array<string>,urls:array<struct<display_url:string,expanded_url:string,indices:array<tinyint>,url:string>>,user_mentions:array<string>> from deserializer 
favorite_count   tinyint     from deserializer 
favorited    boolean     from deserializer 
filter_level   string     from deserializer 
geo      string     from deserializer 
id      bigint     from deserializer 
id_str     string     from deserializer 
in_reply_to_screen_name string     from deserializer 
in_reply_to_status_id string     from deserializer 
in_reply_to_status_id_str string     from deserializer 
in_reply_to_user_id  string     from deserializer 
in_reply_to_user_id_str string     from deserializer 
is_quote_status   boolean     from deserializer 
lang     string     from deserializer 
place     string     from deserializer 
possibly_sensitive  boolean     from deserializer 
retweet_count   tinyint     from deserializer 
retweeted    boolean     from deserializer 
source     string     from deserializer 
text     string     from deserializer 
timestamp_ms   string     from deserializer 
truncated    boolean     from deserializer 
user     struct<contributors_enabled:boolean,created_at:string,default_profile:boolean,default_profile_image:boolean,description:string,favourites_count:tinyint,follow_request_sent:string,followers_count:tinyint,following:string,friends_count:tinyint,geo_enabled:boolean,id:bigint,id_str:string,is_translator:boolean,lang:string,listed_count:tinyint,location:string,name:string,notifications:string,profile_background_color:string,profile_background_image_url:string,profile_background_image_url_https:string,profile_background_tile:boolean,profile_image_url:string,profile_image_url_https:string,profile_link_color:string,profile_sidebar_border_color:string,profile_sidebar_fill_color:string,profile_text_color:string,profile_use_background_image:boolean,protected:boolean,screen_name:string,statuses_count:smallint,time_zone:string,url:string,utc_offset:string,verified:boolean> from deserializer 

When I try to insert from tweets into tweetsORC I get:

INSERT OVERWRITE TABLE tweetsORC SELECT * FROM tweets; 
FAILED: NoMatchingMethodException No matching method for class org.apache.hadoop.hive.ql.udf.UDFToString with (struct<hashtags:array<string>,symbols:array<string>,urls:array<struct<display_url:string,expanded_url:string,indices:array<tinyint>,url:string>>,user_mentions:array<string>>). Possible choices: _FUNC_(bigint) _FUNC_(binary) _FUNC_(boolean) _FUNC_(date) _FUNC_(decimal(38,18)) _FUNC_(double) _FUNC_(float) _FUNC_(int) _FUNC_(smallint) _FUNC_(string) _FUNC_(timestamp) _FUNC_(tinyint) _FUNC_(void) 

The only help I have found on this kind of problem says to make the UDF use primitive types, but I am not using a UDF! Any help is much appreciated!

FYI, Hive version:

Hive 1.2.1000.2.4.2.0-258 Subversion git://u12-slave-5708dfcd-10/grid/0/jenkins/workspace/HDP-build-ubuntu12/bigtop/output/hive/hive-1.2.1000.2.4.2.0 -r 240760457150036e13035cbb82bcda0c65362f3a

Edit: table creation and sample data:

create table tweets (
    contributors string, 
    coordinates string, 
    created_at string, 
    entities struct < 
    hashtags: array <string>, 
    symbols: array <string>, 
    urls: array <struct < 
     display_url: string, 
     expanded_url: string, 
     indices: array <tinyint>, 
     url: string>>, 
    user_mentions: array <string>>, 
    favorite_count tinyint, 
    favorited boolean, 
    filter_level string, 
    geo string, 
    id bigint, 
    id_str string, 
    in_reply_to_screen_name string, 
    in_reply_to_status_id string, 
    in_reply_to_status_id_str string, 
    in_reply_to_user_id string, 
    in_reply_to_user_id_str string, 
    is_quote_status boolean, 
    lang string, 
    place string, 
    possibly_sensitive boolean, 
    retweet_count tinyint, 
    retweeted boolean, 
    source string, 
    text string, 
    timestamp_ms string, 
    truncated boolean, 
    `user` struct < 
    contributors_enabled: boolean, 
    created_at: string, 
    default_profile: boolean, 
    default_profile_image: boolean, 
    description: string, 
    favourites_count: tinyint, 
    follow_request_sent: string, 
    followers_count: tinyint, 
    `following`: string, 
    friends_count: tinyint, 
    geo_enabled: boolean, 
    id: bigint, 
    id_str: string, 
    is_translator: boolean, 
    lang: string, 
    listed_count: tinyint, 
    location: string, 
    name: string, 
    notifications: string, 
    profile_background_color: string, 
    profile_background_image_url: string, 
    profile_background_image_url_https: string, 
    profile_background_tile: boolean, 
    profile_image_url: string, 
    profile_image_url_https: string, 
    profile_link_color: string, 
    profile_sidebar_border_color: string, 
    profile_sidebar_fill_color: string, 
    profile_text_color: string, 
    profile_use_background_image: boolean, 
    protected: boolean, 
    screen_name: string, 
    statuses_count: smallint, 
    time_zone: string, 
    url: string, 
    utc_offset: string, 
    verified: boolean> 
) 
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe' 
STORED AS TEXTFILE; 
LOAD DATA LOCAL INPATH '/home/ed/Downloads/hive-json-master/1abbo.txt' OVERWRITE INTO TABLE tweets; 

create table tweetsORC (
racist boolean, 
    contributors string, 
    coordinates string, 
    created_at string, 
    entities struct < 
    hashtags: array <string>, 
    symbols: array <string>, 
    urls: array <struct < 
     display_url: string, 
     expanded_url: string, 
     indices: array <tinyint>, 
     url: string>>, 
    user_mentions: array <string>>, 
    favorite_count tinyint, 
    favorited boolean, 
    filter_level string, 
    geo string, 
    id bigint, 
    id_str string, 
    in_reply_to_screen_name string, 
    in_reply_to_status_id string, 
    in_reply_to_status_id_str string, 
    in_reply_to_user_id string, 
    in_reply_to_user_id_str string, 
    is_quote_status boolean, 
    lang string, 
    place string, 
    possibly_sensitive boolean, 
    retweet_count tinyint, 
    retweeted boolean, 
    source string, 
    text string, 
    timestamp_ms string, 
    truncated boolean, 
    `user` struct < 
    contributors_enabled: boolean, 
    created_at: string, 
    default_profile: boolean, 
    default_profile_image: boolean, 
    description: string, 
    favourites_count: tinyint, 
    follow_request_sent: string, 
    followers_count: tinyint, 
    `following`: string, 
    friends_count: tinyint, 
    geo_enabled: boolean, 
    id: bigint, 
    id_str: string, 
    is_translator: boolean, 
    lang: string, 
    listed_count: tinyint, 
    location: string, 
    name: string, 
    notifications: string, 
    profile_background_color: string, 
    profile_background_image_url: string, 
    profile_background_image_url_https: string, 
    profile_background_tile: boolean, 
    profile_image_url: string, 
    profile_image_url_https: string, 
    profile_link_color: string, 
    profile_sidebar_border_color: string, 
    profile_sidebar_fill_color: string, 
    profile_text_color: string, 
    profile_use_background_image: boolean, 
    protected: boolean, 
    screen_name: string, 
    statuses_count: smallint, 
    time_zone: string, 
    url: string, 
    utc_offset: string, 
    verified: boolean> 
) 
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe' 
STORED AS ORC tblproperties ("orc.compress"="ZLIB"); 

Data here.

+0

Can you provide the create table statements and a data sample so the problem can be reproduced? – hlagos

+0

Hi hlagos. I am away right now, but I will try it tonight and edit the question. – schoon

+0

Done! And thanks. – schoon

Answers

0

Instead of using SELECT *, list the fields by name; that resolves the error.
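A minimal sketch of what listing the fields could look like, based on the CREATE TABLE statements in the question. Those statements show that tweetsORC starts with a `racist` column that tweets does not have, so SELECT * shifts every column by one position and the `entities` struct lines up with the string column `created_at`, which would explain why Hive tries to apply UDFToString to a struct. Hive matches the columns of INSERT ... SELECT by position, not by name, so the select list below follows the column order of tweetsORC. Filling `racist` with a typed NULL is an assumption; substitute whatever value that column is supposed to hold.

```sql
-- Columns are listed in the order of the target table (tweetsORC),
-- because Hive maps INSERT ... SELECT columns by position.
INSERT OVERWRITE TABLE tweetsORC
SELECT
    CAST(NULL AS BOOLEAN) AS racist,  -- assumption: no source value exists in tweets
    contributors,
    coordinates,
    created_at,
    entities,                         -- struct now lines up with the struct column
    favorite_count,
    favorited,
    filter_level,
    geo,
    id,
    id_str,
    in_reply_to_screen_name,
    in_reply_to_status_id,
    in_reply_to_status_id_str,
    in_reply_to_user_id,
    in_reply_to_user_id_str,
    is_quote_status,
    lang,
    place,
    possibly_sensitive,
    retweet_count,
    retweeted,
    source,
    text,
    timestamp_ms,
    truncated,
    `user`                            -- backticks as in the CREATE TABLE statements
FROM tweets;
```

Running DESCRIBE on both tables and comparing the column lists side by side is a quick way to confirm that each selected column has the same type as the target column at the same position.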
