Skip to content
Advertisement

SQL query to get Most Frequently purchased and Most Recently purchased quantity for a customer-product pair

I have a table Customer_Order_Data with these columns:

  1. customer_id
  2. order_id
  3. product_id
  4. quantity
  5. order_date

For all the dataset available I want to find out the most frequently purchased quantity and most recently purchased quantity for a customer-product pair.

Let’s say I have the following data –

Order_id Customer_id product_id Quantity order_date
00001 ABCD B019 7 2020-01-01
00002 ABCD B019 6 2020-05-23
00003 EFGH B018 8 2020-01-12
00004 ABCD B019 7 2020-02-14
00005 EFGH B018 6 2020-04-18
00006 ABCD B019 7 2020-04-19
00007 EFGH B018 8 2020-03-12

I want to group the rows based on customer_id.

Result expected –

Customer_id product_id Most frequently Purachsed Most Recently Purchased
ABCD B019 7 6
EFGH B018 8 6

I am unable to add dummy columns to vend out this type of view.

PostgreSQL 8.0.26

Advertisement

Answer

You are describing the statistical mode – and a top-1-per-group problem. A few databases have a built-in aggregate function to compute it, but a generic approach is:

select * 
from (
    select customer_id, product_id, quantity, count(*) as cnt,
        rank() over(partition by customer_id, product_id order by count(*) desc) rn1,
        max(case when rn = 1 then quantity end) as most_recently_purchased
    from (
        select t.*,
            row_number() over(partition by customer_id, product_id order by order_date desc) rn
        from mytable t
    ) t
    group by customer_id, product_id, quantity
) t
where rn1 = 1
User contributions licensed under: CC BY-SA
5 People found this is helpful
Advertisement