Skip to content
Advertisement

SQL Group by JSON arrays

So I have a JSON variable (var2), which is an array. I have the following data:

 CREATE table if not exists DATASET (
 var1 varchar(64) not null,
 var2 json NOT NULL
 );

 insert into DATASET (var1,var2) values
 ('John','[{"group": "A", "dollars": 177, "machines":2},{"group": "B", "dollars": 300, 
 "machines":5}]'),
 ('Max','[{"group": "B", "dollars": 122, "machines":3}]'),
 ('William','[{"group": "B", "dollars": 116, "machines":1},{"group": "A", "dollars": 500, 
 "machines":6}]'),
 ('Sara','[{"group": "A", "dollars": 232, "machines":1}]');

I’m trying to write a query that gives back the number of dollars and the machines per group:

Example:

Group   | dollars | machines
group A |  909    |   9
group B |  538    |   9

I’m not used to JSON arrays and it’s kinda tricky for me.

I’ve tried many things and I can’t find much documentation with examples for JSON arrays. For instance, why does this not work?

SELECT var2 ->> 'group', COUNT(var2 ->> 'machines') AS CountM, SUM(var2 ->> 'dollars') AS SumD,
FROM DATASET
GROUP BY var2 ->> 'group'

Alson can someone recommend me a book or reference with stuff like this (with JSON arrays)?

Advertisement

Answer

You need to turn the array elements into rows (essentially normalizing your data model “on-the-fly”), then you can aggregate over them:

select x.element ->> 'group' as "Group",
       sum((x.element ->> 'dollars')::int) as dollars,
       sum((x.element ->> 'machines')::int) as machines
from dataset
  cross join json_array_elements(var2) as x(element)
group by "Group"  
User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement