How to calculate a non-inflated count from a denormalized table

Suppose I have a denormalized table that includes an ID and a value that I need to count. Something like this:

 Tree_ID | ...other columns... |  Count_If_True
------------------------------------------------
       1 | ...other values...  |           True
       1 | ...other values...  |           True
       2 | ...other values...  |           True
       2 | ...other values...  |           True
       3 | ...other values...  |           True

In this case, select Tree_ID, count(Count_If_True) from Table group by Tree_ID would show:

 Tree_ID |  count(Count_If_True)
---------------------------------
       1 |                     2
       2 |                     2
       3 |                     1

 

But if I denormalize my table further with a join from an Apples table (where every tree has multiple apples), it would look something like this:

Apple_ID | Tree_ID | ...other columns... |  Count_If_True
------------------------------------------------
       1 |       1 | ...other values...  |           True
       2 |       1 | ...other values...  |           True
       3 |       1 | ...other values...  |           True
       4 |       1 | ...other values...  |           True
       5 |       1 | ...other values...  |           True
       6 |       1 | ...other values...  |           True
       7 |       2 | ...other values...  |           True
       8 |       2 | ...other values...  |           True
       9 |       2 | ...other values...  |           True
      10 |       2 | ...other values...  |           True
      11 |       2 | ...other values...  |           True
      12 |       2 | ...other values...  |           True
      13 |       2 | ...other values...  |           True
      14 |       2 | ...other values...  |           True
      15 |       3 | ...other values...  |           True
      16 |       3 | ...other values...  |           True
      17 |       3 | ...other values...  |           True
      18 |       3 | ...other values...  |           True
      19 |       3 | ...other values...  |           True

 

This would inflate our count to:

 Tree_ID |  count(Count_If_True)
---------------------------------
       1 |                     6
       2 |                     8
       3 |                     5

 

Is there a simple way (without a CTE, for example) to write a single query to get back the original count result before Apple_IDs were introduced?

Answer

You need a column (or set of columns) that uniquely identifies each row of the first table; perhaps it is among the other columns. Then you can use count(distinct), which counts each original row once no matter how many apples it was joined to:

select tree_id,
       count(distinct <unique row column>) filter (where count_if_true)
from t
group by tree_id;
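
To check the idea end to end, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The row_id column, the table name t, and the number of apples per tree are all assumptions made up for the illustration; they are laid out so that the inflated counts match the 6 / 8 / 5 from the question. Because filter on aggregates is not available in every engine or version, the sketch uses the equivalent count(distinct case when ... end) form instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t ("
    " apple_id integer, tree_id integer,"
    " row_id integer,"          # hypothetical unique id of the pre-join row
    " count_if_true integer)"   # SQLite has no boolean type; 1 means true
)

# Pre-join rows as (row_id, tree_id): trees 1 and 2 have two rows, tree 3 has one.
original_rows = [(1, 1), (2, 1), (3, 2), (4, 2), (5, 3)]
# Assumed fan-out: how many joined apple rows each original row produces.
apples_per_row = {1: 3, 2: 4, 3: 5}

apple_id = 0
for row_id, tree_id in original_rows:
    for _ in range(apples_per_row[tree_id]):
        apple_id += 1
        conn.execute(
            "insert into t values (?, ?, ?, 1)", (apple_id, tree_id, row_id)
        )

# Naive count over the joined table: inflated by the fan-out.
inflated = conn.execute(
    "select tree_id, count(count_if_true) from t"
    " group by tree_id order by tree_id"
).fetchall()

# Count each original row at most once, and only when the flag is true.
deduped = conn.execute(
    "select tree_id,"
    "       count(distinct case when count_if_true then row_id end)"
    " from t group by tree_id order by tree_id"
).fetchall()

print(inflated)
print(deduped)
```

The case expression returns NULL for rows where the flag is false, and count(distinct ...) ignores NULLs, so it plays the same role as the filter clause above.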

 
