Skip to content
Advertisement

How to calculate a non-inflated count from a denormalized table

Suppose I have a denormalized table that includes an ID and a value that I need to count. Something like this:

 Tree_ID | ...other columns... |  Count_If_True
------------------------------------------------
       1 | ...other values...  |           True
       1 | ...other values...  |           True
       2 | ...other values...  |           True
       2 | ...other values...  |           True
       3 | ...other values...  |           True

In this case, select Tree_ID, count(Count_If_True) from Table group by Tree_ID would show:

 Tree_ID |  count(Count_If_True)
---------------------------------
       1 |                     2
       2 |                     2
       3 |                     1

But If I denormalize my table further with a join from an Apples table (where every tree has multiple apples), it would look something like this:

Apple_ID | Tree_ID | ...other columns... |  Count_If_True
------------------------------------------------
       1 |       1 | ...other values...  |           True
       2 |       1 | ...other values...  |           True
       3 |       1 | ...other values...  |           True
       4 |       1 | ...other values...  |           True
       5 |       1 | ...other values...  |           True
       6 |       1 | ...other values...  |           True
       7 |       2 | ...other values...  |           True
       8 |       2 | ...other values...  |           True
       9 |       2 | ...other values...  |           True
      10 |       2 | ...other values...  |           True
      11 |       2 | ...other values...  |           True
      12 |       2 | ...other values...  |           True
      13 |       2 | ...other values...  |           True
      14 |       2 | ...other values...  |           True
      15 |       3 | ...other values...  |           True
      16 |       3 | ...other values...  |           True
      17 |       3 | ...other values...  |           True
      18 |       3 | ...other values...  |           True
      19 |       3 | ...other values...  |           True

This would inflate our count to:

 Tree_ID |  count(Count_If_True)
---------------------------------
       1 |                     6
       2 |                     8
       3 |                     5

Is there a simple way (without a CTE, for example) to write a single query to get back the original count result before Apple_IDs were introduced?

Advertisement

Answer

You need a distinct row identifier in the first table — perhaps that is among the other columns. It can be one or more columns. Then you can use count(distinct):

select tree_id,
       count(distinct <unique row column>) filter (where count_if_true)
from t
group by tree_id;
User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement