Suppose I have a denormalized table that includes an ID and a value that I need to count. Something like this:
x
Tree_ID | other columns | Count_If_True
------------------------------------------------
1 | other values | True
1 | other values | True
2 | other values | True
2 | other values | True
3 | other values | True
In this case, select Tree_ID, count(Count_If_True) from Table group by Tree_ID
would show:
Tree_ID | count(Count_If_True)
---------------------------------
1 | 2
2 | 2
3 | 1
But If I denormalize my table further with a join from an Apples
table (where every tree has multiple apples), it would look something like this:
Apple_ID | Tree_ID | other columns | Count_If_True
------------------------------------------------
1 | 1 | other values | True
2 | 1 | other values | True
3 | 1 | other values | True
4 | 1 | other values | True
5 | 1 | other values | True
6 | 1 | other values | True
7 | 2 | other values | True
8 | 2 | other values | True
9 | 2 | other values | True
10 | 2 | other values | True
11 | 2 | other values | True
12 | 2 | other values | True
13 | 2 | other values | True
14 | 2 | other values | True
15 | 3 | other values | True
16 | 3 | other values | True
17 | 3 | other values | True
18 | 3 | other values | True
19 | 3 | other values | True
This would inflate our count
to:
Tree_ID | count(Count_If_True)
---------------------------------
1 | 6
2 | 8
3 | 5
Is there a simple way (without a CTE, for example) to write a single query to get back the original count result before Apple_ID
s were introduced?
Advertisement
Answer
You need a distinct row identifier in the first table — perhaps that is among the other columns. It can be one or more columns. Then you can use count(distinct)
:
select tree_id,
count(distinct <unique row column>) filter (where count_if_true)
from t
group by tree_id;