Skip to content
Advertisement

Is it possible to look at two consecutive rows and determine the difference in time between the two using SQL?

I am relatively new to SQL, so please bear with me! I am trying to see how many customers make a purchase after being dormant for two years. Relevant fields include cust_id and purchase_date (there can be several observations for the same cust_id but with different dates). I am using Redshift for my SQL scripts.

I realize I cannot put the same thing in for the DATEDIFF parameters (it just doesn’t make any sense), but I am unsure what else to do.

SELECT  *
FROM tickets t
     LEFT JOIN d_customer c
     ON c.cust_id = t.cust_id
WHERE     DATEDIFF(year, t.purchase_date, t.purchase_date) between 0 and 2
ORDER BY t.cust_id, t.purchase_date
;

Advertisement

Answer

I think you want lag(). To get the relevant tickets:

SELECT t.*
FROM (SELECT t.*,
             LAG(purchase_date) OVER (PARTITION BY cust_id ORDER BY purchase_date) as prev_pd
      FROM tickets t
     ) t
WHERE prev_pd < purchase_date - interval '2 year';

If you want the number of customers, use count(distinct):

SELECT COUNT(DISTINCT cust_id)
FROM (SELECT t.*,
             LAG(purchase_date) OVER (PARTITION BY cust_id ORDER BY purchase_date) as prev_pd
      FROM tickets t
     ) t
WHERE prev_pd < purchase_date - interval '2 year';

Note that these do not use DATEDIFF(). This counts the number of boundaries between two date values. So, 2018-12-31 and 2019-01-01 have a difference of 1 year.

User contributions licensed under: CC BY-SA
1 People found this is helpful
Advertisement