How to write a single query to run some retrospective aggregation when time window is different for every row?

Question

I am writing some SQL queries to create a dataset for customer churn predictions based on historical service data. Some of the services date back years ago. Small percentage of them churned at some time in the past while others ended up getting renewed. Some of the attributes are based on aggregation of the s…

Accepted Answer

Here’s a rough example you might be able to follow. I use a WITH clause to provide the set of parameters to apply, but you could also store this in a base table to use in the JOIN.If you have an algorithmic way to generate the sets of parameters, that could be used in a derived table or CTE term to provide as many sets as you wish. We could even use recursive behavior, if your database supports it.A list of date ranges and list of accounts could easily be used to generate sets of parameters.The data:SELECT * FROM ServicesAll;+------------+------------+------------------+-----------------+-------------+-------+| account_no | service_no | InstallationDate | BillingStopDate | ExtractDate | Churn |+------------+------------+------------------+-----------------+-------------+-------+| A1 | S1 | 2018-01-01 | 2019-03-01 | 2019-02-01 | 1 || A2 | S2 | 2016-05-01 | 2020-04-01 | 2020-03-01 | 0 || A1 | S3 | 2018-07-01 | 2019-07-01 | 2019-06-01 | 0 || A1 | S4 | 2016-11-01 | 2021-03-01 | 2021-02-01 | 0 || A1 | S5 | 2018-01-01 | 2019-01-01 | 2018-12-01 | 1 || A2 | S6 | 2019-01-01 | 2021-08-01 | 2021-07-01 | 1 |+------------+------------+------------------+-----------------+-------------+-------+Here’s an approach which generates lists of accounts and date ranges to derive the sets of parameters to apply, and then applies them all in the same query:WITH RECURSIVE accounts (account_no) AS ( SELECT DISTINCT account_no FROM ServicesAll) , startDate (start_date) AS ( SELECT current_date - INTERVAL '5' YEAR ) , ranges (start_date, end_date, n) AS ( SELECT start_date, start_date + INTERVAL '6' MONTH, 1 FROM startDate UNION ALL SELECT start_date + INTERVAL '6' MONTH , end_date + INTERVAL '6' MONTH, n+1 FROM ranges WHERE n < 8 ) , args (p0, p1, p2) AS ( SELECT account_no, start_date, end_date FROM accounts, ranges )SELECT p0, p1, p2, COUNT(DISTINCT service_no) FROM ServicesAll, args WHERE account_no = args.p0 AND InstallationDate < args.p1 AND (BillingStopDate IS NULL OR BillingStopDate >= args.p2) GROUP BY p0, p1, p2;The result:+------+------------+------------+----------------------------+| p0 | p1 | p2 | COUNT(DISTINCT service_no) |+------+------------+------------+----------------------------+| A1 | 2017-02-17 | 2017-08-17 | 1 || A1 | 2017-08-17 | 2018-02-17 | 1 || A1 | 2018-02-17 | 2018-08-17 | 3 || A1 | 2018-08-17 | 2019-02-17 | 3 || A1 | 2019-02-17 | 2019-08-17 | 1 || A1 | 2019-08-17 | 2020-02-17 | 1 || A1 | 2020-02-17 | 2020-08-17 | 1 || A2 | 2016-08-17 | 2017-02-17 | 1 || A2 | 2017-02-17 | 2017-08-17 | 1 || A2 | 2017-08-17 | 2018-02-17 | 1 || A2 | 2018-02-17 | 2018-08-17 | 1 || A2 | 2018-08-17 | 2019-02-17 | 1 || A2 | 2019-02-17 | 2019-08-17 | 2 || A2 | 2019-08-17 | 2020-02-17 | 2 || A2 | 2020-02-17 | 2020-08-17 | 1 |+------+------------+------------+----------------------------+The following SQL provide sets of parameters in the args CTE term, and then JOINs with that CTE term in the final query expression. The key is the GROUP BY clause which produces a result row for each set of parameters.WITH args (p0, p1, p2) AS ( SELECT 'A1', '2018-07-02', '2018-07-02' UNION SELECT 'A1', '2016-11-02', '2016-11-02' UNION SELECT 'A2', '2016-11-02', '2016-11-02' )SELECT p0, p1, p2, COUNT(DISTINCT service_no) FROM ServicesAll, args WHERE account_no = args.p0 AND InstallationDate < args.p1 AND (BillingStopDate IS NULL OR BillingStopDate >= args.p2) GROUP BY p0, p1, p2;Result:+----+------------+------------+----------------------------+| p0 | p1 | p2 | COUNT(DISTINCT service_no) |+----+------------+------------+----------------------------+| A1 | 2016-11-02 | 2016-11-02 | 1 || A1 | 2018-07-02 | 2018-07-02 | 4 || A2 | 2016-11-02 | 2016-11-02 | 1 |+----+------------+------------+----------------------------+

Advertisement

Answer