Skip to content
Advertisement

Count rows after joining three tables in PostgreSQL

Suppose I have three tables in PostgreSQL:

Suppose I am using the using the following query:

I get 50 as count.

Whereas with:

I get only 25 as count.

What is my mistake in the second query? What can I do to get the same count?

My requirement is that there is a single user table, referenced by multiple tables. I want to take the complete list of users and get the count of ids from different tables.

But the table on which I have joined alone returns the proper count but rest of them don’t return the proper count. Can anybody suggest a way to modify my second query to get the proper count?

Advertisement

Answer

To simplify your logic, aggregate first, join later.

Guessing missing details, this query would give you the exact count, how many times each user was referenced in table1 and table2 respectively for all users:

In particular, avoid multiple 1-n relationships multiplying each other when joined together:

To retrieve a single or few users only, LATERAL joins will be faster (Postgres 9.3+):

Explain perceived difference

The particular mismatch you report is due to the specifics of a FULL OUTER JOIN:

First, an inner join is performed. Then, for each row in T1 that does not satisfy the join condition with any row in T2, a joined row is added with null values in columns of T2. Also, for each row of T2 that does not satisfy the join condition with any row in T1, a joined row with null values in the columns of T1 is added.

So you get NULL values appended on the respective other side for missing matches. count() does not count NULL values. So you can get a different result depending on whether you filter on u1.id=100 or u2.id=100.

This is just to explain, you don’t need a FULL JOIN here. Use the presented alternatives instead.

User contributions licensed under: CC BY-SA
5 People found this is helpful
Advertisement