Skip to content
Advertisement

SQL COUNT items and GROUP over multiple categories with 0 totals for missing entries

I have a table in BigQuery I would like to aggregate over a number of different categories (columns in the table) to produce a summary of the data, which is a simple COUNT and GROUP BY statement. However, for this project I would like to produce a “complete” count of every permutation of the categories (the possible values of the categories are known ahead of time) with 0 totals for combinations without any entries for that permutation.

Here is the simple initial COUNT and GROUP statement:

where attribute_1 has possible values ('A', 'B', 'C') and attribute_2 has possible values ('X', 'Y', 'Z') which produces something like this:

However my goal is to end up with a result like this:

I found similar problems for a single category, so I can solve for that case by holding the other attributes constant with the following statement:

which produces:

Is it possible to solve for the general case I am interested in?

Advertisement

Answer

If I understand correctly, you can generate the rows using a cross join and then match to the existing values:

Here is a db<>fiddle. It uses Postgres, so there are some syntax tweaks for Postgres.

User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement