Average on most recent date for which data is available for multiple columns

Question

I have a table (on BigQuery) that looks like the following: What I would like to get is the average score for each type, but the average should be taken only over the most recent date for which at least one score is available for that type. From the example above, the aim is to obtain the following table in

Accepted Answer

You can follow the same approach, but you need to enumerate each column separately, so filtering doesn't work: each score column can have a different most recent date. That would be:

```sql
select type,
       avg(case when seqnum_1 = 1 then score1 end) as avg_1,
       avg(case when seqnum_2 = 1 then score2 end) as avg_2
from (select t.*,
             dense_rank() over (partition by type, score1 is null order by date desc) as seqnum_1,
             dense_rank() over (partition by type, score2 is null order by date desc) as seqnum_2
      from t
     ) t
group by type;
```

Note: Rows where the score is NULL also get a sequence number of 1 within their own partition, so this includes NULL values in the averages. There is no harm in that, because avg() ignores NULLs and they don't affect the results.

You could also express this as:

```sql
select type,
       avg(case when seqnum_1 = 1 then score1 end) as avg_1,
       avg(case when seqnum_2 = 1 then score2 end) as avg_2
from (select t.*,
             dense_rank() over (partition by type order by score1 is not null desc, date desc) as seqnum_1,
             dense_rank() over (partition by type order by score2 is not null desc, date desc) as seqnum_2
      from t
     ) t
group by type;
```
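To check the logic without a BigQuery project, here is a minimal sketch that runs the second form of the query on SQLite through Python's sqlite3 module. Several things here are assumptions rather than part of the original answer: SQLite stands in for BigQuery, the sample rows are made up for illustration (the question's table is not reproduced), the helper name latest_averages is hypothetical, and avg_2 is assumed to be meant to average score2. Window functions need SQLite 3.25 or later.

```python
import sqlite3

# Hypothetical sample rows (type, date, score1, score2); not the question's data.
ROWS = [
    ("A", "2021-01-01", 1.0, 10.0),
    ("A", "2021-01-02", 3.0, None),
    ("A", "2021-01-02", 5.0, None),
    ("A", "2021-01-03", None, None),
    ("B", "2021-01-01", 2.0, None),
    ("B", "2021-01-02", None, 20.0),
    ("B", "2021-01-02", None, 40.0),
]

# Second form of the query: non-NULL scores sort first, then newest date first,
# so rank 1 is the most recent date that has a score for that column.
QUERY = """
select type,
       avg(case when seqnum_1 = 1 then score1 end) as avg_1,
       avg(case when seqnum_2 = 1 then score2 end) as avg_2
from (select t.*,
             dense_rank() over (partition by type order by score1 is not null desc, date desc) as seqnum_1,
             dense_rank() over (partition by type order by score2 is not null desc, date desc) as seqnum_2
      from t
     ) t
group by type
order by type
"""


def latest_averages(rows):
    """Load rows into an in-memory SQLite table t and run QUERY against it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (type text, date text, score1 real, score2 real)")
    conn.executemany("insert into t values (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in latest_averages(ROWS):
        print(row)
```

With these rows, type A averages score1 over 2021-01-02 (the latest date with a score1 value) and score2 over 2021-01-01, which shows why each column needs its own sequence number.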

Answer