How to use Count as a criterion in PostgreSQL

I have an existing table1 with "account", "tax_year" and other fields. I want to create table2 from the records of table1 for which the frequency of CONCAT(account, tax_year) is 1 and which meet the WHERE clause.

For example, if table1 looks like this:

account year 
aaa 2014 
bbb 2016 
bbb 2016 
ddd 2014 
ddd 2014 
ddd 2015

table2 should be:

account year 
aaa 2014 
ddd 2015

Here is my script:

DROP TABLE IF EXISTS table1; 
CREATE table2 AS 
SELECT 
    account::text, 
    tax_year::text, 
    building_number, 
    imprv_type, 
    building_style_code, 
    quality, 
    quality_description, 
    date_erected, 
    yr_remodel, 
    actual_area, 
    heat_area, 
    gross_area, 
    CONCAT(account, tax_year) AS unq 
FROM table1 
WHERE imprv_type=1001 and date_erected>0 and date_erected IS NOT NULL and quality IS NOT NULL and quality_description IS NOT NULL and yr_remodel>0 and yr_remodel IS NOT NULL and heat_area>0 and heat_area IS NOT NULL 
GROUP BY account, 
    tax_year, 
    building_number, 
    imprv_type, 
    building_style_code, 
    quality, 
    quality_description, 
    date_erected, 
    yr_remodel, 
    actual_area, 
    heat_area, 
    gross_area, 
    unq 
HAVING COUNT(unq)=1;

I have spent two days on this but still cannot figure out the right way. Thanks for your help!

Source

2016-06-08 12B01

The proper way to use the count of (account, tax_year) pairs in table1:

select account, tax_year 
from table1 
where imprv_type=1001 -- and many more... 
group by account, tax_year 
having count(*) = 1;

So you should try:

create table table2 as 
select * 
from table1 
where (account, tax_year) in (
    select account, tax_year 
    from table1 
    where imprv_type=1001 -- and many more... 
    group by account, tax_year 
    having count(*) = 1 
    );
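
As a quick check, the statement above can be run against the sample rows from the question. This is only a minimal sketch: the temporary table names table1_demo and table2_demo are made up for illustration, only the two key columns are included, and the imprv_type filter is left out because the sample data has no such column.

```sql
-- Hypothetical demo table holding only the sample rows from the question.
CREATE TEMP TABLE table1_demo (account text, tax_year int);

INSERT INTO table1_demo (account, tax_year) VALUES
    ('aaa', 2014),
    ('bbb', 2016),
    ('bbb', 2016),
    ('ddd', 2014),
    ('ddd', 2014),
    ('ddd', 2015);

-- Keep only rows whose (account, tax_year) pair occurs exactly once.
CREATE TEMP TABLE table2_demo AS
SELECT *
FROM table1_demo
WHERE (account, tax_year) IN (
    SELECT account, tax_year
    FROM table1_demo
    GROUP BY account, tax_year
    HAVING count(*) = 1
);

SELECT account, tax_year
FROM table2_demo
ORDER BY account, tax_year;
-- Expected rows: (aaa, 2014) and (ddd, 2015)
```

One thing to keep in mind with the full statement: the filter sits only in the subquery, so the outer select also copies rows of a qualifying pair that do not pass the filter themselves. If that is not wanted, the same conditions would have to be repeated in the outer where clause.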

Source

2016-06-08 21:04:17 klin

Thank you! The original table has 11,755,200 rows and 71 columns. The query has been running for 20 hours and is still running. Is it normal for a dataset of this size to take this long to process? I am new to Postgres. – 12B01

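The query is indeed expensive. The server is probably short of RAM, which leads to memory swapping. Depending on the table size you may need a special approach, e.g. splitting the data into smaller logical parts with a where clause and running the query step by step. – klin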
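
One possible reading of this suggestion is to process one tax_year at a time. Because the grouping key includes tax_year, no (account, tax_year) pair can span two chunks, so the per-year results can simply be appended. The sketch below rests on assumptions: tax_year is taken to be an integer, the demo tables table1_chunks and table2_chunks and the index name are invented, and whether the index actually helps would depend on the real data.

```sql
-- Hypothetical stand-in for the real table1 (sample rows from the question).
CREATE TEMP TABLE table1_chunks (account text, tax_year int);

INSERT INTO table1_chunks (account, tax_year) VALUES
    ('aaa', 2014), ('bbb', 2016), ('bbb', 2016),
    ('ddd', 2014), ('ddd', 2014), ('ddd', 2015);

-- Assumption: an index on the key may speed up the per-chunk work.
CREATE INDEX table1_chunks_key_idx ON table1_chunks (account, tax_year);

-- Empty target table with the same columns as the source.
CREATE TEMP TABLE table2_chunks (LIKE table1_chunks);

-- Run the count-based selection once per tax_year.
DO $$
DECLARE
    yr int;
BEGIN
    FOR yr IN SELECT DISTINCT tax_year FROM table1_chunks ORDER BY 1 LOOP
        INSERT INTO table2_chunks
        SELECT t.*
        FROM table1_chunks t
        WHERE t.tax_year = yr
          AND (t.account, t.tax_year) IN (
              SELECT account, tax_year
              FROM table1_chunks
              WHERE tax_year = yr   -- the other filter conditions would go here
              GROUP BY account, tax_year
              HAVING count(*) = 1
          );
    END LOOP;
END
$$;

SELECT account, tax_year
FROM table2_chunks
ORDER BY account, tax_year;
```

The DO block runs as a single transaction. To commit after every chunk, the INSERT could instead be issued by hand once per year.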

COUNT() = 1 is equivalent to NOT EXISTS (another row with the same key fields):

SELECT 
    account, tax_year 
    -- ... maybe more fields ... 
FROM table1 t1 
WHERE NOT EXISTS (SELECT * 
    FROM table1 nx 
    WHERE nx.account = t1.account -- same key field(s) 
    AND nx.tax_year = t1.tax_year 
    AND nx.ctid <> t1.ctid   -- but a different row! 
    );

Note: I replaced the COUNT(CONCAT(account, tax_year)) concatenation of the key fields with a composite match on the key fields themselves.
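
To see what the NOT EXISTS form returns, it can be tried on the sample rows from the question. This is a sketch with an invented temporary table name, table1_nx. The ctid column is a PostgreSQL system column that identifies a row's physical location; it is used here only to tell "another row" apart from the row itself.

```sql
-- Hypothetical demo table holding only the sample rows from the question.
CREATE TEMP TABLE table1_nx (account text, tax_year int);

INSERT INTO table1_nx (account, tax_year) VALUES
    ('aaa', 2014),
    ('bbb', 2016),
    ('bbb', 2016),
    ('ddd', 2014),
    ('ddd', 2014),
    ('ddd', 2015);

-- A row is kept only if no other physical row shares its key.
SELECT t1.account, t1.tax_year
FROM table1_nx t1
WHERE NOT EXISTS (
    SELECT *
    FROM table1_nx nx
    WHERE nx.account = t1.account
      AND nx.tax_year = t1.tax_year
      AND nx.ctid <> t1.ctid      -- a different row with the same key
)
ORDER BY t1.account, t1.tax_year;
-- Expected rows: (aaa, 2014) and (ddd, 2015)
```

Each of the duplicated rows (bbb 2016, ddd 2014) finds its twin through the subquery and is dropped, so only the pairs that occur once remain.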

Source

2016-06-08 19:47:07 wildplasser

Thanks for the quick reply! I think your query will return all distinct records, not only the records where the frequency of ("account" & "tax_year") is 1. For example, with table1 from my question, NOT EXISTS would return aaa 2014, bbb 2016, ddd 2014 and ddd 2015, but what I actually need is aaa 2014 and ddd 2015. – 12B01

You can add the extra conditions to the where clause. (Note: this approach does not use an aggregate function, so no GROUP BY is needed.) – wildplasser
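
A sketch of what adding the conditions could look like, assuming the goal is "exactly one row per pair among the rows that pass the filter". Under that reading the conditions appear twice: once for the row being kept and once for the competing row in the subquery. Only imprv_type = 1001 from the question is shown; the demo table table1_filtered and its rows are invented for illustration.

```sql
-- Hypothetical demo rows with one of the filter columns from the question.
CREATE TEMP TABLE table1_filtered (account text, tax_year int, imprv_type int);

INSERT INTO table1_filtered (account, tax_year, imprv_type) VALUES
    ('aaa', 2014, 1001),
    ('bbb', 2016, 1001),
    ('bbb', 2016, 1001),
    ('ddd', 2014, 1001),
    ('ddd', 2014, 2000),
    ('ddd', 2015, 2000);

SELECT t1.account, t1.tax_year
FROM table1_filtered t1
WHERE t1.imprv_type = 1001            -- the row itself must pass the filter
  AND NOT EXISTS (
      SELECT *
      FROM table1_filtered nx
      WHERE nx.account = t1.account
        AND nx.tax_year = t1.tax_year
        AND nx.imprv_type = 1001      -- only filtered rows count as duplicates
        AND nx.ctid <> t1.ctid
  )
ORDER BY t1.account, t1.tax_year;
-- Expected rows: (aaa, 2014) and (ddd, 2014)
```

Here ddd 2014 is kept because its second row fails the filter, while ddd 2015 is dropped because its only row fails it. If duplicates should count regardless of the filter, the condition inside the subquery would be removed.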
