I am struggling with a sampling issue using Teradata
Below is the format of the data
ID Group Rank 1 dog 1 1 cat 1 1 lion 1 1 elephant 2 2 dog 1 2 cat 1 2 lion 1 2 elephant 1 3 dog 1 3 cat 2 3 lion 1 3 elephant 1 4 dog 2 4 cat 1 4 lion 1 4 elephant 1 ...
I would ideally like to return a sample number for each entry in Group but with only unique values from ID.
Below is the current query I produced but this returns duplicates for ID
SELECT ID, Group FROM Table WHERE rank = 1 SAMPLE WHEN group = 'dog' then 10 WHEN group = 'cat' then 10 WHEN group = 'elephant' then 5 WHEN group = 'lion' then 5 END
Advertisement
Answer
with cte as ( SELECT ID, Group, random(1,10000) as rnd -- RANDOM can't be directly used in OLAP-functions FROM Table WHERE rank = 1 ) SELECT ID, Group FROM cte QUALIFY ROW_NUMBER() -- get one random row per ID OVER (PARTITION BY ID ORDER BY rnd) = 1 SAMPLE WHEN group = 'dog' then 10 WHEN group = 'cat' then 10 WHEN group = 'elephant' then 5 WHEN group = 'lion' then 5 END