2017-01-02 4 views
-3
SELECT 
    STRFTIME_UTC_USEC(TimeStamp,"%Y-%m-%d %H:%M:%S") AS TimeStamp, 
    Value.provided, 
    __key__.app AS ProjectID, 
    REGEXP_EXTRACT(__key__.path, r'"hostname"[, ]*"(.*?)"') AS hostname, 
    REGEXP_EXTRACT(__key__.path, r'"machine"[, ]*"(.*?)"') AS machine, 
    REGEXP_EXTRACT(__key__.path, r'"variable"[, ]*"(.*?)"') AS variable, 
    IF(value.provided = 'integer', CAST(value.integer AS STRING),    
    CAST(value.boolean AS STRING)) AS value 
FROM 
    [spark-test-project-152415:spark_machine_learning.spark_12272016] 
ORDER BY 
    TimeStamp 
LIMIT 100000 

위 쿼리는 첨부 된 그림과 같이 데이터 집합을 추출합니다. 가변 열을 값이있는 여러 열로 분할해야합니다. 하위 쿼리를 사용하여 수행해야한다고 생각합니다. 어떻게 시작할 수 있습니까?열을 여러 열로 나누기

enter image description here

는 출력 예상 :

SELECT * FROM (SELECT #Timestamp, STRFTIME_UTC_USEC(TimeStamp,"%Y-%m-%d %H:%M:%S") AS [TimeStamp], Value.provided, __key__.app AS ProjectID, REGEXP_EXTRACT(__key__.path, r'"hostname"[, ]*"(.*?)"') AS [hostname], REGEXP_EXTRACT(__key__.path, r'"machine"[, ]*"(.*?)"') AS [machine], REGEXP_EXTRACT(__key__.path, r'"variable"[, ]*"(.*?)"') AS [variable], IF(value.provided = 'integer', CAST(value.integer AS STRING), CAST(value.boolean AS STRING)) AS [value] FROM [spark-test-project-152415:spark_machine_learning.spark_12272016] ORDER BY TimeStamp) AS SourceTable PIVOT ([value] FOR [variable] IN ([Counter_Strokes_No_Reset], [Press_State_Code], [Press_Operator_1], [Press_Stop_Time_Limit], [Counter_Good_Parts_No_Reset], [Press_Error_Reason_Code], [Counter_Scrap_No_Reset], [Production_Tool_Number], [Press_Stop_Time_Actual], [Production_Good_Parts_Preset], [Press_Shaft_Speed], [Production_Part_Number], [Press_Total_Tonnage], [Production_Job_Number])) AS PivotTable

+0

을 줄 수도, 아래에보십시오. 당신은 기대되는 결과의 예를 제공 할 수 있습니까! –

+0

https://i.stack.imgur.com/Xa1hg.png –

+0

@Mikhail, 위의 링크 –

답변

1

가 어떻게이 얻을 수 PIVOT과

expected output 쿼리가 시작?

는 내가 이미지가 현재의 출력을 보여줍니다 이해하는 것과 당신에게 아이디어

SELECT 
    [TimeStamp], 
    Value_Provided, 
    ProjectID, 
    hostname, 
    machine, 
    SUM(CASE WHEN variable = 'Counter_Strokes_No_Reset' THEN value.integer END) AS Counter_Strokes_No_Reset, 
    SUM(CASE WHEN variable = 'Press_State_Code' THEN value.integer END) AS Press_State_Code, 
    SUM(CASE WHEN variable = 'Press_Operator_1' THEN value.integer END) AS Press_Operator_1, 
    SUM(CASE WHEN variable = 'Press_Stop_Time_Limit' THEN value.integer END) AS Press_Stop_Time_Limit, 
    SUM(CASE WHEN variable = 'Counter_Good_Parts_No_Reset' THEN value.integer END) AS Counter_Good_Parts_No_Reset, 
    SUM(CASE WHEN variable = 'Press_Error_Reason_Code' THEN value.integer END) AS Press_Error_Reason_Code, 
    SUM(CASE WHEN variable = 'Counter_Scrap_No_Reset' THEN value.integer END) AS Counter_Scrap_No_Reset, 
    SUM(CASE WHEN variable = 'Production_Tool_Number' THEN value.integer END) AS Production_Tool_Number, 
    SUM(CASE WHEN variable = 'Press_Stop_Time_Actual' THEN value.integer END) AS Press_Stop_Time_Actual, 
    SUM(CASE WHEN variable = 'Production_Good_Parts_Preset' THEN value.integer END) AS Production_Good_Parts_Preset, 
    SUM(CASE WHEN variable = 'Press_Shaft_Speed' THEN value.integer END) AS Press_Shaft_Speed, 
    SUM(CASE WHEN variable = 'Production_Part_Number' THEN value.integer END) AS Production_Part_Number, 
    SUM(CASE WHEN variable = 'Press_Total_Tonnage' THEN value.integer END) AS Press_Total_Tonnage, 
    SUM(CASE WHEN variable = 'Production_Job_Number' THEN value.integer END) AS Production_Job_Number 
FROM (
    SELECT 
    STRFTIME_UTC_USEC(TIMESTAMP,"%Y-%m-%d %H:%M:%S") AS [TimeStamp], 
    Value.provided AS Value_Provided, 
    __key__.app AS ProjectID, 
    REGEXP_EXTRACT(__key__.path, r'"hostname"[, ]*"(.*?)"') AS [hostname], 
    REGEXP_EXTRACT(__key__.path, r'"machine"[, ]*"(.*?)"') AS [machine], 
    REGEXP_EXTRACT(__key__.path, r'"variable"[, ]*"(.*?)"') AS [variable], 
    IF(value.provided = 'integer', CAST(value.integer AS INTEGER), CAST(value.boolean AS INTEGER)) AS [value] 
    FROM [spark-test-project-152415:spark_machine_learning.spark_12272016] 
) 
GROUP BY [TimeStamp], Value_Provided, ProjectID, hostname, machine 
ORDER BY [TimeStamp] 
+0

감사합니다. @Mikhail이 원하는 출력을 얻을 수있었습니다. –