I have a table like this:
date number system_id 1 33.1 1 2 24.2 1 3 14.1 1 4 15.5 1 5 1113 1 1 4513 2 2 53.4 2 3 24.8 2 4 13.12 2 5 3333 2
that I transform in this format I need:
[ [date 1, number in date 1 (of system_id 1), number in date 1 (of system_id 2), number in date 1 (of system_id 3), ...], [date 2, number 2 (of system_id 1), number 2 (of system_id 2), number 2 (of system_id 3), ...], [date 3, number 3 (of system_id 1), number 3 (of system_id 2), number 3 (of system_id 3), ...], [date 4, number 4 (of system_id 1), number 4 (of system_id 2), number 4 (of system_id 3), ...], [...] ]
with this query:
SELECT date, STRING_AGG(number::character varying, ',' order by system_id asc) as n FROM MyTable GROUP BY date
So far so good, assuming all systems_ids
have the same number of dates
. However, this might not be the case. Let’s add a row to the db:
date number system_id 6 1234 2
My result now looks like this:
[ [1, 33.1, 4513], [2, 24.2, 53.4], ... [6, 1234] ]
This is really problematic because I rely on the order of the numbers to represent different systems_ids.
["date", "system_id_1", "system_id_2"]
How can I fill a default value when there’s not a value to aggregrate?
I need row 6 in my result to look like this [6, 0, 1234]
or this [6, NaN, 1234]
or whatever is considered a good practice would be fine. Though it is paramount that the order ["date", "system_id_1", "system_id_2"]
is always respected.
Alternatively, it’d be ok to skip the whole row. To do so, I could call
SELECT date, STRING_AGG(number::character varying, ',' order by system_id asc) as n COUNT(date) as dates_count FROM MyTable GROUP BY date
And somehow exclude all rows where dates_count
is not equal to MAX(dates_count)
. But how?
Advertisement
Answer
You can generate the rows using a cross join
. Then bring in the data and aggregate:
select d.date, string_agg( coalesce(t.number::character varying, ''), ',' order by s.system_id asc) as numbers from (select distinct system_id from mytable) s cross join (select distinct date from mytable) d left join mytable t on t.system_id = s.system_id and t.date = s.date group by d.date;
I would also suggest that you use arrays rather than strings for this purpose.
EDIT:
Or, to skip the incomplete rows:
with ds as ( select date, string_agg(number::character varying, ',' order by system_id asc) as n, count(*) as num_system_ids, max(count(*)) over () as max_num_system_ids from MyTable group by date ) select ds.date, ds.n from ds where ds.num_system_ids = ds.max_num_system_ids;