Skip to content
Advertisement

PostgreSQL string_agg, default value when no value found

I have a table like this:

date    number      system_id
1       33.1        1
2       24.2        1
3       14.1        1
4       15.5        1
5       1113        1
1       4513        2
2       53.4        2
3       24.8        2
4       13.12       2
5       3333        2

that I transform in this format I need:

[

[date 1, number in date 1 (of system_id 1), number in date 1 (of system_id 2), number in date 1 (of system_id 3), ...],
[date 2, number 2 (of system_id 1), number 2 (of system_id 2), number 2 (of system_id 3), ...],
[date 3, number 3 (of system_id 1), number 3 (of system_id 2), number 3 (of system_id 3), ...],
[date 4, number 4 (of system_id 1), number 4 (of system_id 2), number 4 (of system_id 3), ...],
[...]

]

with this query:

SELECT date,
STRING_AGG(number::character varying, ',' order by system_id asc) as n
FROM MyTable 
GROUP BY date

So far so good, assuming all systems_ids have the same number of dates. However, this might not be the case. Let’s add a row to the db:

date    number      system_id
6       1234        2

My result now looks like this:

[

[1, 33.1, 4513],
[2, 24.2, 53.4],
...
[6, 1234]

]

This is really problematic because I rely on the order of the numbers to represent different systems_ids.

["date", "system_id_1", "system_id_2"]

How can I fill a default value when there’s not a value to aggregrate?

I need row 6 in my result to look like this [6, 0, 1234] or this [6, NaN, 1234] or whatever is considered a good practice would be fine. Though it is paramount that the order ["date", "system_id_1", "system_id_2"] is always respected.

Alternatively, it’d be ok to skip the whole row. To do so, I could call

SELECT 
  date, 
  STRING_AGG(number::character varying, ',' order by system_id asc) as n
  COUNT(date) as dates_count
FROM
MyTable 
GROUP BY date

And somehow exclude all rows where dates_count is not equal to MAX(dates_count). But how?

Advertisement

Answer

You can generate the rows using a cross join. Then bring in the data and aggregate:

select d.date,
       string_agg( coalesce(t.number::character varying, ''), ',' order by s.system_id asc) as numbers
from (select distinct system_id from mytable) s cross join
     (select distinct date from mytable) d left join
     mytable t
     on t.system_id = s.system_id and t.date = s.date
group by d.date;

I would also suggest that you use arrays rather than strings for this purpose.

EDIT:

Or, to skip the incomplete rows:

with ds as (
      select date,
             string_agg(number::character varying, ',' order by system_id asc) as n,
             count(*) as num_system_ids,
             max(count(*)) over () as max_num_system_ids
      from MyTable 
      group by date
     )
select ds.date, ds.n
from ds
where ds.num_system_ids = ds.max_num_system_ids;
User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement