Skip to content
Advertisement

How can I select the max counted values from that was ‘GROUP BY’ twice?

Sorry for the bad title, I need to improve on how to explain my problem better, obviously. I’m practicing queries on the Adventure Works data in SQL server, and I queried such as:

SELECT a.City, pc.Name, COUNT(pc.Name) AS 'Count'
FROM SalesLT.SalesOrderHeader AS oh
JOIN SalesLT.SalesOrderDetail AS od ON oh.SalesOrderID = oh.SalesOrderID
JOIN SalesLT.Product AS p ON od.ProductID = p.ProductID
JOIN SalesLT.ProductCategory AS pc ON p.ProductCategoryID = pc.ProductCategoryID
JOIN SalesLT.Address AS a ON oh.ShipToAddressID = a.AddressID
GROUP BY a.City, pc.Name
ORDER BY a.City;

Which gives:

City          Name   Count
----         ------  ------
Abingdon    Cleaners    7
Abingdon    Vests      16
Abingdon    Pedals     29
Alhambra    Jersey     44
Alhambra    Vests      16
Auburn  Hydration Packs 7
Auburn  Derailleurs     8

And I’m trying to only get the largest count item, expected output looks like this:

City          Name   Count
----         ------  ------
Abingdon    Pedals     29
Alhambra    Jersey     44
Auburn  Derailleurs     8

The max count for each city. Since I don’t have a sample dataset, so I’m not asking for the exact query, but can you give me some idea of what I should look into? I’ve been trying different ‘group by’s, but I always end up getting this far, but I can never move forward.

Also, the question was “Identify the three most important cities. Show the break down of top-level product category against the city.”, if possible, can you please share how you would approach this problem? I’ve been trying to improve my SQL skills, but I have a hard time writing a complex query. It would be greatly appreciated if you can share a tip to approach complex queries.

Any guidance on how I should approach this problem would be appreciated.

Advertisement

Answer

In this case you want to get the maximum value of count group by city, this can be achieve by using subquery.
here is some pseudo example for you to reference.

with cte as (
 select '1' as id, '1' as val
  union all
 select '2' as id, '1' as val
  union all
 select '2' as id, '2' as val
  union all
 select '3' as id, '1' as val
  union all
 select '3' as id, '1' as val
  union all
 select '3' as id, '1' as val
  union all
 select '3' as id, '2' as val
  union all
 select '3' as id, '2' as val
), 
a as(
select id,val,count(id) as cou 
from cte
group by id,val
)
select * from (
select *,max(cou) over(partition by id) as max_cou from a
) b
where cou = max_cou

the first cte is just pseudo data and second cte a is the part reference OP current query. rest is the solution.

here is the db<>fiddle link.
As I recall sql-server does not accept nested aggregate function, so subquery is a much easier approach.

If anyone have a way cleaner and simpler query I will be happy to see it too 😀

User contributions licensed under: CC BY-SA
2 People found this is helpful
Advertisement