So I have a JSON variable (var2), which is an array. I have the following data:
CREATE table if not exists DATASET ( var1 varchar(64) not null, var2 json NOT NULL ); insert into DATASET (var1,var2) values ('John','[{"group": "A", "dollars": 177, "machines":2},{"group": "B", "dollars": 300, "machines":5}]'), ('Max','[{"group": "B", "dollars": 122, "machines":3}]'), ('William','[{"group": "B", "dollars": 116, "machines":1},{"group": "A", "dollars": 500, "machines":6}]'), ('Sara','[{"group": "A", "dollars": 232, "machines":1}]');
I’m trying to write a query that gives back the number of dollars and the machines per group:
Example:
Group | dollars | machines group A | 909 | 9 group B | 538 | 9
I’m not used to JSON arrays and it’s kinda tricky for me.
I’ve tried many things and I can’t find much documentation with examples for JSON arrays. For instance, why does this not work?
SELECT var2 ->> 'group', COUNT(var2 ->> 'machines') AS CountM, SUM(var2 ->> 'dollars') AS SumD, FROM DATASET GROUP BY var2 ->> 'group'
Alson can someone recommend me a book or reference with stuff like this (with JSON arrays)?
Advertisement
Answer
You need to turn the array elements into rows (essentially normalizing your data model “on-the-fly”), then you can aggregate over them:
select x.element ->> 'group' as "Group", sum((x.element ->> 'dollars')::int) as dollars, sum((x.element ->> 'machines')::int) as machines from dataset cross join json_array_elements(var2) as x(element) group by "Group"