Postgres LEFT JOIN not returning nulls?

Not sure why this is happening, but I have this query:

SELECT vote.user_id, ARRAY_AGG(COALESCE(vote.value, 0)) FROM item
LEFT JOIN vote ON item.item_id = vote.item_id
GROUP BY vote.user_id
ORDER BY user_id;

The item table has 100 rows.

I’m looking for the resulting rows to be grouped by user_id, with each array 100 elements long. That is, if a user has no vote for an item in the vote table, a 0 should be substituted in its place.

Unfortunately this is not happening. The arrays are all of different lengths, and only include the existing vote values.

I thought the LEFT JOIN would put NULLs in place of the missing vote values, and that COALESCE would turn them into zeroes, but clearly this is not happening.

Answer

Your item table has 100 rows, and you outer join the votes table to it. This gives you every vote per item, e.g. 10 rows for item 1, 5 rows for item 2, and a single placeholder row for item 3, which has no votes. With an inner join, item 3 would disappear entirely, because there is no vote for it in the table. Every vote belongs to a user, but the placeholder row for item 3 has no vote behind it, so all of its vote columns are NULL: the value (which COALESCE turns into 0) and, of course, the user ID as well. That is all the outer join does: it pads items that nobody voted on, not the user/item combinations that are missing.

Then you group by user and collect their votes in arrays. Let’s say user 1 has 40 votes, user 2 has 30 votes, user 3 has no votes, and user 4 has 20 votes. For this you’d get one result row for each user that appears in the votes: one for user 1 with an array of 40 votes, one for user 2 with an array of 30 votes, and one for user 4 with an array of 20 votes. User 3 does not appear at all. If some items have no votes, you also get one extra row with a NULL user_id whose array holds a 0 for each of those items.
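To see this on a small scale, here is a sketch with made-up data (three items, three votes, all values hypothetical) that uses the table names from the question. It shows the rows the LEFT JOIN produces before any grouping takes place.

```sql
-- Hypothetical sample data: 3 items; user 1 voted on items 1 and 2,
-- user 2 voted on item 1, and nobody voted on item 3.
WITH item(item_id) AS (
    VALUES (1), (2), (3)
),
vote(user_id, item_id, value) AS (
    VALUES (1, 1, 5),
           (1, 2, 3),
           (2, 1, 4)
)
SELECT item.item_id, vote.user_id, vote.value
FROM item
LEFT JOIN vote ON item.item_id = vote.item_id
ORDER BY item.item_id, vote.user_id;
```

Item 3 comes back exactly once, with NULL in both user_id and value. There is no row for "user 2, item 2" or "user 1, item 3", so grouping by vote.user_id has nothing to fill those array slots with.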

You want the arrays to magically contain 100 votes each. From this I gather that the votes table has a unique key on user and item and you want to select all users with a vote value for each item – a zero value in case there is no vote entry.

For this to happen you must first create all desired rows: every combination of user and item (users × items). Then outer join the votes to those rows.

SELECT u.user_id, ARRAY_AGG(COALESCE(v.value, 0) ORDER BY i.item_id)
FROM users u
CROSS JOIN items i
LEFT JOIN votes v ON v.user_id = u.user_id AND v.item_id = i.item_id
GROUP BY u.user_id
ORDER BY u.user_id;
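As a rough, self-contained check, the same query can be run against inline sample data. The users, items and votes contents below are invented for illustration, and the vote_values alias is my own addition. In a real schema these would be tables, and users is assumed to exist alongside the two tables from the question.

```sql
-- Hypothetical sample data: 3 users, 3 items, 3 votes.
WITH users(user_id) AS (
    VALUES (1), (2), (3)
),
items(item_id) AS (
    VALUES (1), (2), (3)
),
votes(user_id, item_id, value) AS (
    VALUES (1, 1, 5),
           (1, 2, 3),
           (2, 1, 4)
)
SELECT u.user_id,
       -- One array element per item, in item order; 0 where no vote exists.
       ARRAY_AGG(COALESCE(v.value, 0) ORDER BY i.item_id) AS vote_values
FROM users u
CROSS JOIN items i   -- every user paired with every item
LEFT JOIN votes v ON v.user_id = u.user_id AND v.item_id = i.item_id
GROUP BY u.user_id
ORDER BY u.user_id;
```

With this data every user gets an array of three elements: {5,3,0} for user 1, {4,0,0} for user 2 and {0,0,0} for user 3. The ORDER BY inside ARRAY_AGG is what keeps the array positions aligned with the items across users.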

If you want to restrict this to users with at least one vote, then either replace FROM users u with FROM (SELECT DISTINCT user_id FROM votes) u or add HAVING COUNT(v.item_id) > 0.
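Written out in full, the two variants might look like the sketch below. It uses invented inline data, and the vote_values alias is my own addition. User 3 has no votes and should be absent from both results.

```sql
-- Variant A: take the users from the votes themselves.
WITH items(item_id) AS (
    VALUES (1), (2), (3)
),
votes(user_id, item_id, value) AS (
    VALUES (1, 1, 5), (1, 2, 3), (2, 1, 4)
)
SELECT u.user_id,
       ARRAY_AGG(COALESCE(v.value, 0) ORDER BY i.item_id) AS vote_values
FROM (SELECT DISTINCT user_id FROM votes) u
CROSS JOIN items i
LEFT JOIN votes v ON v.user_id = u.user_id AND v.item_id = i.item_id
GROUP BY u.user_id
ORDER BY u.user_id;

-- Variant B: start from all users, then drop those without any vote.
WITH users(user_id) AS (
    VALUES (1), (2), (3)
),
items(item_id) AS (
    VALUES (1), (2), (3)
),
votes(user_id, item_id, value) AS (
    VALUES (1, 1, 5), (1, 2, 3), (2, 1, 4)
)
SELECT u.user_id,
       ARRAY_AGG(COALESCE(v.value, 0) ORDER BY i.item_id) AS vote_values
FROM users u
CROSS JOIN items i
LEFT JOIN votes v ON v.user_id = u.user_id AND v.item_id = i.item_id
GROUP BY u.user_id
-- COUNT ignores NULLs, so it only counts rows where a vote matched.
HAVING COUNT(v.item_id) > 0
ORDER BY u.user_id;
```

Both return only users 1 and 2 for this data, each still with a three-element array.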

User contributions licensed under: CC BY-SA