I want to find the most recent purchase per customer for each date (a customer might make multiple purchases on a given date and I want the most recent for that date). My dataframe looks something like the following:
Customer_id | Date | Payment | Type 3 '2019-01-28 00:45:00' 128.93 Credit 2 '2019-01-26 01:00:00' 54.36 Debit 3 '2019-01-27 12:00:00' 93.99 Debit 3 '2019-01-28 03:15:00' 164.93 Credit 2 '2019-01-26 17:30:00' 56.74 Credit 2 '2019-01-28 19:15:00' 21.85 Credit
The output of the query should give me something like the following (I return one row per customer/date combination and order by customer/date. I don’t care about the Type – whether it is Credit or Debit):
Customer_id | Date | Payment 2 '2019-01-26' 56.74 2 '2019-01-28' 21.85 3 '2019-01-27' 93.99 3 '2019-01-28' 164.93
Here is my code so far:
SELECT R1.Customer_id, date_trunc('day', R1.Date), R1.Payment FROM Records R1 WHERE R1.Date = (SELECT MAX(R2.Date) FROM Records R2 WHERE R2.Customer_id = R1.Customer_id) GROUP BY Customer_id, date_trunc('day', R1.Date);
The example I gave is a simplification of the data I am working with and some of the filtering conditions, but should capture the main problem I am facing. Not really sure if that is on the right track, but it seems unhappy with the ‘Group by’ when I run the query. I am new to sql and would love help!
Advertisement
Answer
This query:
select Customer_id, Date::date, max(Date) maxdate from tablename group by customer_id, Date::date
gets the most recent date for each customer.
Join it to the main table to get the rows that contain the most recent date for each customer:
select t.Customer_id, t.date, tt.Payment from ( select Customer_id, Date::date date, max(Date) maxdate from tablename group by customer_id, Date::date ) t inner join tablename tt on tt.customer_id = t.customer_id and tt.date = t.maxdate