Not sure why this is happening, but I have this query:
SELECT vote.user_id, ARRAY_AGG(COALESCE(vote.value, 0))
FROM item
LEFT JOIN vote ON item.item_id = vote.item_id
GROUP BY vote.user_id
ORDER BY user_id;
The item table has 100 rows.
I’m looking for the resulting rows to be grouped by user_id and for each array to be 100 items long. That is, if a value is not present in the vote table, a 0 should be substituted in its place.
Unfortunately this is not happening. The arrays are all of different lengths, and only include the existing vote values.
I thought the LEFT JOIN would cause NULLs to be put in place of the missing vote values, and the COALESCE would turn them into zeroes, but clearly this is not happening.
Answer
Your item table has 100 rows. You outer join the votes table. Thus you get all item votes, e.g. 10 votes for item 1, 5 votes for item 2, 0 votes for item 3, … If you used an inner join, you would lose item 3 altogether, because there is no vote for item 3 in the table. Every vote is related to a user. For item 3 there is no vote, so while you do produce a row for item 3 (with a NULL value, which COALESCE turns into 0), that row has no user attached: its user ID is NULL as well. That is all the outer join does.
Then you group by user and collect their votes in arrays. Let’s say user 1 has 40 votes, user 2 has 30 votes, user 3 has no votes, and user 4 has 20 votes. For this you’d get three result rows (one for each user in the data): one for user 1 with an array of 40 votes, one for user 2 with an array of 30 votes, and one for user 4 with an array of 20 votes. Items without any vote additionally end up together in one extra row whose user ID is NULL and whose array holds one 0 per such item.
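To see this on a small scale, here is a sketch that runs the query from the question against made-up data. The rows in the two CTEs are invented for illustration (four items, three votes); only the table and column names come from the question. It should work in PostgreSQL as written.

```sql
-- Hypothetical sample data: 4 items, 3 votes, no vote at all for item 4.
WITH item(item_id) AS (
    VALUES (1), (2), (3), (4)
),
vote(user_id, item_id, value) AS (
    VALUES (1, 1, 5),
           (1, 3, 2),
           (2, 2, 4)
)
-- The query from the question, unchanged apart from the qualified ORDER BY.
SELECT vote.user_id,
       ARRAY_AGG(COALESCE(vote.value, 0)) AS votes
FROM item
LEFT JOIN vote ON item.item_id = vote.item_id
GROUP BY vote.user_id
ORDER BY vote.user_id;
```

With this data, user 1 gets an array of two values, user 2 an array of one value, and the unvoted item 4 shows up as a separate row with a NULL user_id and the array {0}. No array has four elements, because no user is ever paired with an item they did not vote on.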
You want the arrays to magically contain 100 votes each. From this I gather that the votes table has a unique key on user and item and you want to select all users with a vote value for each item – a zero value in case there is no vote entry.
For this to happen you must first create all desired rows: users x items. Then outer join the votes.
SELECT u.user_id, ARRAY_AGG(COALESCE(v.value, 0) ORDER BY i.item_id)
FROM users u
CROSS JOIN items i
LEFT JOIN votes v ON v.user_id = u.user_id AND v.item_id = i.item_id
GROUP BY u.user_id
ORDER BY u.user_id;
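A minimal way to try this out, assuming PostgreSQL: the CTEs below stand in for the real tables, and their contents are invented for illustration (three users, four items, three votes). The users table itself is an assumption of this answer, since the question only mentions item and vote.

```sql
-- Hypothetical sample data standing in for the real tables.
WITH users(user_id) AS (
    VALUES (1), (2), (3)
),
items(item_id) AS (
    VALUES (1), (2), (3), (4)
),
votes(user_id, item_id, value) AS (
    VALUES (1, 1, 5),
           (1, 3, 2),
           (2, 2, 4)
)
SELECT u.user_id,
       -- ORDER BY inside the aggregate keeps array position aligned with item_id.
       ARRAY_AGG(COALESCE(v.value, 0) ORDER BY i.item_id) AS votes
FROM users u
CROSS JOIN items i                      -- every user paired with every item
LEFT JOIN votes v ON v.user_id = u.user_id
                 AND v.item_id = i.item_id
GROUP BY u.user_id
ORDER BY u.user_id;
-- user_id | votes
-- 1       | {5,0,2,0}
-- 2       | {0,4,0,0}
-- 3       | {0,0,0,0}
```

Every array now has one element per item, because the cross join creates the user/item pair before the outer join looks for a matching vote.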
If you want to restrict this to users with at least one vote, then either replace FROM users u with FROM (SELECT DISTINCT user_id FROM votes) u or add HAVING COUNT(v.item_id) > 0.
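Both variants can be sketched against the same kind of made-up data (again, the CTE contents are invented for illustration and PostgreSQL is assumed). In each case user 3, who has no votes, should drop out of the result.

```sql
-- Variant 1: take the user list from the votes themselves.
WITH items(item_id) AS (
    VALUES (1), (2), (3), (4)
),
votes(user_id, item_id, value) AS (
    VALUES (1, 1, 5), (1, 3, 2), (2, 2, 4)
)
SELECT u.user_id,
       ARRAY_AGG(COALESCE(v.value, 0) ORDER BY i.item_id) AS votes
FROM (SELECT DISTINCT user_id FROM votes) u
CROSS JOIN items i
LEFT JOIN votes v ON v.user_id = u.user_id
                 AND v.item_id = i.item_id
GROUP BY u.user_id
ORDER BY u.user_id;

-- Variant 2: keep the users table and filter the groups afterwards.
WITH users(user_id) AS (
    VALUES (1), (2), (3)
),
items(item_id) AS (
    VALUES (1), (2), (3), (4)
),
votes(user_id, item_id, value) AS (
    VALUES (1, 1, 5), (1, 3, 2), (2, 2, 4)
)
SELECT u.user_id,
       ARRAY_AGG(COALESCE(v.value, 0) ORDER BY i.item_id) AS votes
FROM users u
CROSS JOIN items i
LEFT JOIN votes v ON v.user_id = u.user_id
                 AND v.item_id = i.item_id
GROUP BY u.user_id
-- COUNT(v.item_id) ignores the NULLs produced by the outer join,
-- so it counts only real votes.
HAVING COUNT(v.item_id) > 0
ORDER BY u.user_id;
```

The first variant never builds rows for users without votes, while the second builds them and then discards the group; which one performs better will depend on the data and indexes.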