I would like to calculate the time delay between several customer purchases. However, each purchase is saved in an individual row. The data set looks similar to the following:
customer | order_id | purchase_date | product | sequencen| ... | customer1 | 1247857 | 2020-01-30 | ProdA, ProdB | 1 | ... | customer2 | 4454874 | 2020-02-07 | ProdA | 1 | ... | customer3 | 3424556 | 2020-02-28 | ProdA | 1 | ... | customer4 | 5678889 | 2020-03-14 | ProdB | 1 | ... | customer3 | 5853778 | 2020-03-22 | ProdA, ProdB | 2 | ... | customer4 | 7578345 | 2020-03-30 | ProdA, ProdB | 2 | ... | customer2 | 4892978 | 2020-05-10 | ProdA | 2 | ... | customer5 | 4834789 | 2020-07-05 | ProdA, ProdB | 1 | ... | customer5 | 9846726 | 2020-07-27 | ProdB | 2 | ... | customer1 | 1774783 | 2020-12-12 | ProdB | 2 | ... |
Per customer, I would like to end up with a table that calculates the time-difference (in days) between a certain purchase and the purchase that came before. Basically, I would like to know the time delay (latency) between a customers first and second purchase, second and third purchase, and so on. The result should look like the following:
customer | order_id | purchase_date | product | sequencen| ... | purchase_latency customer1 | 1247857 | 2020-01-30 | ProdA, ProdB | 1 | ... | customer1 | 1774783 | 2020-12-12 | ProdB | 2 | ... | 317 customer2 | 4454874 | 2020-02-07 | ProdA | 1 | ... | customer2 | 4892978 | 2020-05-10 | ProdA | 2 | ... | 93 customer3 | 3424556 | 2020-02-28 | ProdA | 1 | ... | customer3 | 5853778 | 2020-03-22 | ProdA, ProdB | 2 | ... | 23 customer4 | 5678889 | 2020-03-14 | ProdB | 1 | ... | customer4 | 7578345 | 2020-03-30 | ProdA, ProdB | 2 | ... | 16 customer5 | 4834789 | 2020-07-05 | ProdA, ProdB | 1 | ... | customer5 | 9846726 | 2020-07-27 | ProdB | 2 | ... | 22
I am struggling to add the purchase_latency calculation to my current query, as the calculation would require me to do a calculation over rows. Any ideas how I could add this to my current query?:
SELECT order_id max(customer) as customer, max(purchase_date) as purchase_date, STRING_AGG(product, ",") as product, ..., FROM SELECT( od.order_number as order_id, od.customer_email as customer, od.order_date as purchase_date dd.sku as product, ROW_NUMBER() OVER (PARTITION BY od.customer_email ORDER BY od.order_date ) as sequencen FROM orders_data od JOIN detail_data dd ON od.order_number = dd.order_number where od.price> 0 AND od.sku in ("ProdA","ProdB")) GROUP BY order_id
Advertisement
Answer
Did you try row navigation functions like LAG?
WITH finishers AS (SELECT 'Sophia Liu' as name, TIMESTAMP '2016-10-18 2:51:45' as finish_time, 'F30-34' as division UNION ALL SELECT 'Lisa Stelzner', TIMESTAMP '2016-10-18 2:54:11', 'F35-39' UNION ALL SELECT 'Nikki Leith', TIMESTAMP '2016-10-18 2:59:01', 'F30-34' UNION ALL SELECT 'Lauren Matthews', TIMESTAMP '2016-10-18 3:01:17', 'F35-39' UNION ALL SELECT 'Desiree Berry', TIMESTAMP '2016-10-18 3:05:42', 'F35-39' UNION ALL SELECT 'Suzy Slane', TIMESTAMP '2016-10-18 3:06:24', 'F35-39' UNION ALL SELECT 'Jen Edwards', TIMESTAMP '2016-10-18 3:06:36', 'F30-34' UNION ALL SELECT 'Meghan Lederer', TIMESTAMP '2016-10-18 3:07:41', 'F30-34' UNION ALL SELECT 'Carly Forte', TIMESTAMP '2016-10-18 3:08:58', 'F25-29' UNION ALL SELECT 'Lauren Reasoner', TIMESTAMP '2016-10-18 3:10:14', 'F30-34') SELECT name, finish_time, division, LAG(name) OVER (PARTITION BY division ORDER BY finish_time ASC) AS preceding_runner FROM finishers; +-----------------+-------------+----------+------------------+ | name | finish_time | division | preceding_runner | +-----------------+-------------+----------+------------------+ | Carly Forte | 03:08:58 | F25-29 | NULL | | Sophia Liu | 02:51:45 | F30-34 | NULL | | Nikki Leith | 02:59:01 | F30-34 | Sophia Liu | | Jen Edwards | 03:06:36 | F30-34 | Nikki Leith | | Meghan Lederer | 03:07:41 | F30-34 | Jen Edwards | | Lauren Reasoner | 03:10:14 | F30-34 | Meghan Lederer | | Lisa Stelzner | 02:54:11 | F35-39 | NULL | | Lauren Matthews | 03:01:17 | F35-39 | Lisa Stelzner | | Desiree Berry | 03:05:42 | F35-39 | Lauren Matthews | | Suzy Slane | 03:06:24 | F35-39 | Desiree Berry | +-----------------+-------------+----------+------------------+