COUNT with GROUP BY based on most recent rows only

Question

I have a table named user_teams which has the following columns:

- id: primary key
- user_id: FK to users table
- team_id: FK to teams table
- effective_date: Date

I want to have a query that given a set of …

Accepted Answer

If you want to avoid looking at every user in the table, follow these steps:

1. Find all users that play or played in the requested teams.
2. Find those users' latest entries.
3. Determine those entries' teams.
4. Only keep desired teams and count.

The query:

    select team_id, count(*)
    from
    (
      select
        team_id,
        row_number() over (partition by user_id order by effective_date desc) as rn
      from user_teams
      where user_id in
      (
        select user_id
        from user_teams
        where team_id in (1,2,3)
      )
    ) ranked
    where rn = 1 and team_id in (1,2,3)
    group by team_id
    order by team_id;

Indexes:

    create index idx1 on user_teams (team_id, user_id);
    create index idx2 on user_teams (user_id, effective_date, team_id);

Anyway, working this way makes sense when you have, say, 10,000 users with their team history in the table, but a team has just five or ten users. This means working on a small subset of the table data. Once the ratio is less extreme, it may be quicker to simply go through the whole table, i.e. use your own query. This could still benefit from the second index, as it contains all data in the appropriate order (per user -> highest date -> team).
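To see the query behave as described, here is a minimal sketch that runs it against an in-memory SQLite database from Python. The sample rows, the use of SQLite (which needs version 3.25 or later for window functions), and storing effective_date as ISO date strings are all assumptions made for illustration; the table, the two indexes, and the query itself are the ones from the answer.

```python
import sqlite3

# In-memory database; SQLite 3.25+ is assumed for row_number() support.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table user_teams (
  id integer primary key,
  user_id integer not null,
  team_id integer not null,
  effective_date text not null  -- ISO dates sort correctly as text
);
create index idx1 on user_teams (team_id, user_id);
create index idx2 on user_teams (user_id, effective_date, team_id);
""")

# Hypothetical sample history: (user_id, team_id, effective_date).
rows = [
    (10, 1, "2020-01-01"),
    (10, 2, "2021-01-01"),  # user 10 moved from team 1 to team 2
    (11, 1, "2020-06-01"),  # user 11 is still in team 1
    (12, 3, "2019-01-01"),
    (12, 9, "2022-01-01"),  # user 12 left team 3 for team 9
    (13, 2, "2020-03-01"),  # user 13 is in team 2
    (14, 9, "2020-01-01"),  # user 14 was never in a requested team
]
conn.executemany(
    "insert into user_teams (user_id, team_id, effective_date) values (?, ?, ?)",
    rows,
)

query = """
select team_id, count(*)
from
(
  select
    team_id,
    row_number() over (partition by user_id order by effective_date desc) as rn
  from user_teams
  where user_id in
  (
    select user_id
    from user_teams
    where team_id in (1,2,3)
  )
) ranked
where rn = 1 and team_id in (1,2,3)
group by team_id
order by team_id;
"""

result = conn.execute(query).fetchall()
print(result)  # [(1, 1), (2, 2)]
```

Team 3 does not appear in the result because its only former member (user 12) has a more recent entry for team 9, and user 10 is counted for team 2 only, not for team 1. Note that the final team_id filter is what drops users whose latest entry points outside the requested teams.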

Answer