I’m trying to match and align data, or resaid, count occurrences and then list for which values those occurrences occur.
Or, in a question: “How many times does each ID value occur, and for what names?”
For example, with this input
Name ID ------------- jim 123 jim 234 jim 345 john 123 john 345 jane 234 jane 345 jan 45678
I want the output to be:
count ID name name name ------------------------------------ 3 345 jim john jane 2 123 jim john 2 234 jim jane 1 45678 jan
Or similarly, the input could be (noticing that the ID values are not aligned),
jim john jane jan ---------------------------- 123 345 234 45678 234 123 345 345
but that seems to complicate things.
As close as I am to the desired results is in SQL, as
for ID, count(ID) from table group by (ID) order by count desc
which outputs
ID count ------------ 345 3 123 2 234 2 45678 1
I’ll appreciate help.
Advertisement
Answer
You seem to want a pivot. In SQL, you have to specify the number of columns in advance (unless you construct the query as a string).
But the idea is:
select ID, count(*) as cnt, max(case when seqnum = 1 then name end) as name_1, max(case when seqnum = 2 then name end) as name_2, max(case when seqnum = 3 then name end) as name_3 from (select t.*, row_number() over (partition by id order by id) as seqnum -- arbitrary ordering from table t ) t group by ID order by count desc;
If you have an unknown number of columns, you can aggregate the values into an array:
select ID, count(*) as cnt, array_agg(name order by name) as names from table t group by ID order by count desc