I want to count the distinct
day_number
over the past 30 days. However, the distinct function can’t be used with over
If I delete distinct
, it will give me the total count of the day_number
, but day_number
can have lots of duplicate. So that’s why I want to add distinct
.
select tr.*, count( distinct day_number) OVER (PARTITION BY ACCOUNT ORDER BY DAY_number range 29 PRECEDING) as result from table tr;
Can anyone please show me how to count the distinct numbers in the over(partition by..)
statement? Thanks in advance.
Advertisement
Answer
You can do this by first creating a column that only lists each id once, and then doing a range-count on that column, e.g.:
WITH sd AS (SELECT 1 ID, 10 val FROM dual UNION ALL SELECT 1 ID, 20 val FROM dual UNION ALL SELECT 2 ID, 30 val FROM dual UNION ALL SELECT 2 ID, 40 val FROM dual UNION ALL SELECT 4 ID, 50 val FROM dual UNION ALL SELECT 4 ID, 60 val FROM dual UNION ALL SELECT 6 ID, 70 val FROM dual) SELECT ID, val, COUNT(id_distinct) OVER (ORDER BY ID RANGE 3 PRECEDING) cnt_disinct_ids FROM (SELECT ID, val, CASE WHEN row_number() OVER (PARTITION BY ID ORDER BY val) = 1 THEN ID END id_distinct FROM sd); ID VAL CNT_DISINCT_IDS ---------- ---------- --------------- 1 10 1 1 20 1 2 30 2 2 40 2 4 50 3 4 60 3 6 70 2
ETA: proof that the above technique works for your data:
WITH your_table AS (SELECT 'ABCDE' account_sk, 23 day_sk FROM dual UNION ALL SELECT 'ABCDE' account_sk, 23 day_sk FROM dual UNION ALL SELECT 'ABCDE' account_sk, 24 day_sk FROM dual UNION ALL SELECT 'ABCDE' account_sk, 25 day_sk FROM dual UNION ALL SELECT 'ABCDE' account_sk, 53 day_sk FROM dual UNION ALL SELECT 'ABCDE' account_sk, 53 day_sk FROM dual UNION ALL SELECT 'ABCDE' account_sk, 55 day_sk FROM dual UNION ALL SELECT 'VWXYZ' account_sk, 10 day_sk FROM dual UNION ALL SELECT 'VWXYZ' account_sk, 12 day_sk FROM dual UNION ALL SELECT 'VWXYZ' account_sk, 40 day_sk FROM dual UNION ALL SELECT 'VWXYZ' account_sk, 40 day_sk FROM dual) SELECT account_sk, day_sk, COUNT(day_sk_distinct) OVER (PARTITION BY account_sk ORDER BY day_sk RANGE BETWEEN 29 PRECEDING AND CURRENT ROW) count_distinct_day_sk FROM (SELECT account_sk, day_sk, CASE WHEN row_number() OVER (PARTITION BY account_sk, day_sk ORDER BY day_sk) = 1 THEN day_sk END day_sk_distinct FROM your_table); ACCOUNT_SK DAY_SK COUNT_DISTINCT_DAY_SK ---------- ---------- --------------------- ABCDE 23 1 ABCDE 23 1 ABCDE 24 2 ABCDE 25 3 ABCDE 53 3 ABCDE 53 3 ABCDE 55 2 VWXYZ 10 1 VWXYZ 12 2 VWXYZ 40 2 VWXYZ 40 2