Skip to content
Advertisement

How do I find the maximum depending on multiple subranges within one column

I have two tables: Orders and Customers. Orders lists all bought items (policy_id) and Customers information such as their age. The two tables are linked via the customer_id. Now I want to find the most popular items within 3 age groups (18-30, 31-60, >60). How do I approach this problem?

The following gives me a list of all items and their popularity split into the age groups but I only want the most popular ones per age group.

SELECT o.policy_id, count(o.policy_id) as ntotal, CASE  
    WHEN c.Age BETWEEN 18 and 30 then '18 - 30'
    WHEN c.Age BETWEEN 31 and 60 then '31 - 60'
    else 'above 60' end as RANGErrr    
FROM Orders o INNER JOIN Customer c ON (o.customer_id=c.customer_id)
GROUP BY RANGErrr, o.policy_id

Do I need to add something or am I on the completely wrong track?

Advertisement

Answer

You can use window functions for this. You can do so right along with the aggregation. For the top policy_id per age range:

SELECT co.*
FROM (SELECT co.*,
              ROW_NUMBER() OVER (PARTITION BY RANGErrr ORDER BY ntotal DESC) as seqnum    
      FROM (SELECT o.policy_id, count(*) as ntotal,
                   (CASE WHEN c.Age BETWEEN 18 and 30 then '18 - 30'
                         WHEN c.Age BETWEEN 31 and 60 then '31 - 60'
                         else 'above 60'
                    END) as RANGErrr
            FROM Orders o INNER JOIN
                 Customer c
                 ON o.customer_id = c.customer_id
            GROUP BY RANGErrr, o.policy_id
           ) co
     ) co
WHERE seqnum = 1;

If you want a different number, just change the WHERE clause.

User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement