Skip to content
Advertisement

Average on most recent date for which data is available for multiple columns

I have a table (on BigQuery) that looks like the following:

What I would like to get is the average score for each type but the average should be taken only on the most recent date for which at least one score is available for the type. From the example above, the aim is to obtain the following table in one query (that can contain subqueries):

I need a solution that could be adapted if I want the average score, not for each type, but for each combination of two columns (type/color), still on the most recent date for which at least one score is available for the combination. It should be also possible to adapt it if I have more score columns and for different aggregations (AVG/MAX/MIN…).

N.B.: A first question was asked to handle the same problem with only one score column: Average on the most recent date for which data is available.

Advertisement

Answer

You can follow the same approach, but you need to enumerate each column separately — so filtering doesn’t work. That would be:

Note: This includes the NULL values in the averages. There is no harm in that, because they don’t affect the results. You could also express this as:

User contributions licensed under: CC BY-SA
5 People found this is helpful
Advertisement