Top-N By Decade for successive decades (in SQL Server)

I’m trying to get a ranked list of Top 5 (i.e. most common) document titles, grouped by decade, for each of the 6 most recent decades.

The document titles are non-unique. There could be dozens or even hundreds of documents with the same title in any given calendar year.

The following query is as far as I’ve been able to go. It gives me the Top 5 titles, but only for the ‘all others’ Decade.

How can I modify the query to get the Top 5 titles for each of the other decades as well?

Answer

If I followed you correctly, you want the top 5 titles per decade. If so:

  • you would need to group by decade rather than by calendar year to get the proper counts; it is easier to compute the decade in a subquery so you don’t have to repeat the case expression

  • the rank should be computed over decade partitions rather than per year

  • you can then use that column to filter in an outer query

Consider:
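A minimal sketch of the three points above. The table name `documents`, the column `documentDate`, and all aliases are hypothetical. Because the question's decade buckets (including the 'all others' bucket) are not shown, this sketch swaps the case expression for integer arithmetic on the year and assumes "6 most recent decades" means the current decade plus the five before it:

```sql
-- Hypothetical schema: documents(documentTitle, documentDate).
select r.decade, r.documentTitle, r.titleCount, r.titleRank
from (
    select
        x.decade,
        x.documentTitle,
        count(*) as titleCount,
        -- rank titles within each decade, most frequent first
        rank() over (partition by x.decade order by count(*) desc) as titleRank
    from (
        select
            d.documentTitle,
            -- integer division truncates, so 1987 becomes 1980
            (year(d.documentDate) / 10) * 10 as decade
        from documents d
        -- assumed cutoff: the current decade and the five before it
        where year(d.documentDate) >= (year(getdate()) / 10) * 10 - 50
    ) x
    group by x.decade, x.documentTitle
) r
where r.titleRank <= 5
order by r.decade desc, r.titleRank;
```

Note that rank() can return more than 5 rows for a decade when titles tie on count; row_number() would cap each decade at exactly 5 rows, at the cost of breaking ties arbitrarily.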

Side notes:

  • don’t use single quotes for identifiers (although SQL Server allows that, single quotes should be reserved for literal strings, as defined in the SQL standard) – better yet, use identifiers that do not require quoting

  • in a multi-table query, always qualify all column names with the table they belong to; I made a few assumptions here

  • unless you have null values in column documentTitle that you want to exclude from the count, you can use count(*) instead of count(documentTitle) – this is more straightforward, and more efficient
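To illustrate the notes above, here is a short sketch against the same hypothetical `documents` table (the table name and the aliases are assumptions, not taken from the question). It shows the quoting options for aliases and the difference between the two count forms:

```sql
-- Hypothetical table, used only to illustrate the side notes.
select
    d.documentTitle        as title,              -- plain identifier, no quoting needed
    count(*)               as [Document Count],   -- square brackets (T-SQL)
    count(d.documentTitle) as "Non-Null Titles"   -- double quotes (standard SQL)
from documents d
group by d.documentTitle;
```

For the group where documentTitle is null, count(*) returns the number of rows in the group, while count(d.documentTitle) returns 0, because count(column) skips nulls. Double-quoted identifiers rely on the QUOTED_IDENTIFIER setting being ON, which is the default for most SQL Server connections.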

User contributions licensed under: CC BY-SA