Basically, I am trying to write a query against a table such like the following. Here we have Plans of either Free, Plus or Premium (in the example below I only show free and plus plans). Is it possible to return records where an ID that was original a ‘Free’ plan has changed to either a ‘plus’ or ‘premium’? This would be caught via a binary condition check column ‘ID Upgraded’
In the case of the Data table below
ID Name Plan Date 1 Sam Free 1/1/2021 3 Amy Free 1/2/2021 4 Tom Free 1/5/2021 3 Amy Plus 1/8/2021 2 Kim Free 1/9/2021 4 Tom Plus 1/10/2021
Output:
ID Name Plan Date. Upgrade? 1 Sam Free 1/1/2021 0 3 Amy Free 1/2/2021 0 4 Tom Free 1/5/2021 0 3 Amy Plus 1/8/2021 1 2 Kim Free 1/9/2021 0 4 Tom Plus 1/10/2021 1
In my attempt to accomplish this I was only able to get as far as ISOLATING user IDs that are repeats in a separate table with the use of a variety of CTEs. While this can kind of work I believe for analysis the output as shown above would be a much cleaner option.
As am writing this request I imagine we need to use some kind of count feature to count if a value repeats then check to see if plan string value has changed?
My logic is as follows:
With data as ( select id, name, plan, date, count(id) as cnt from TABLE ) select * from data where cnt > 1 -- All repeating ids **and plan changes help needed here**
Edit: The method I tried above resulted in columns not being counted by the count function. needs work
Edit: also tried the following but no avail!
Row_Number() Over (Partition By id Order By count(t.id) Asc) As DupCounter,
Advertisement
Answer
You can use LAG
, which is one of window (a.k.a. analytical) functions.
Initial data:
CREATE TEMP TABLE plans(ID integer, Name text, Plan text, Date date); COPY plans FROM STDIN WITH (FORMAT CSV); 1,Sam,Free,1/1/2021 3,Amy,Free,1/2/2021 4,Tom,Free,1/5/2021 3,Amy,Plus,1/8/2021 2,Kim,Free,1/9/2021 4,Tom,Plus,1/10/2021 .
Query:
SELECT ID,Name,Plan,Date, LAG(Plan) OVER(PARTITION BY ID ORDER BY Date) AS PrevPlan FROM plans;
Result:
id | name | plan | date | prevplan ----+------+------+------------+---------- 1 | Sam | Free | 2021-01-01 | NULL 2 | Kim | Free | 2021-09-01 | NULL 3 | Amy | Free | 2021-02-01 | NULL 3 | Amy | Plus | 2021-08-01 | Free 4 | Tom | Free | 2021-05-01 | NULL 4 | Tom | Plus | 2021-10-01 | Free (6 rows)
Query:
SELECT ID,Name,Plan,Date,PrevPlan, CASE WHEN Plan>PrevPlan then 'Upgrade' WHEN Plan<Prevplan then 'Downgrade' END AS flag FROM (SELECT *, LAG(Plan) OVER(PARTITION BY ID ORDER BY Date) AS PrevPlan FROM plans) AS p;
Result:
id | name | plan | date | prevplan | flag ----+------+------+------------+----------+--------- 1 | Sam | Free | 2021-01-01 | NULL | NULL 2 | Kim | Free | 2021-09-01 | NULL | NULL 3 | Amy | Free | 2021-02-01 | NULL | NULL 3 | Amy | Plus | 2021-08-01 | Free | Upgrade 4 | Tom | Free | 2021-05-01 | NULL | NULL 4 | Tom | Plus | 2021-10-01 | Free | Upgrade (6 rows)
To satisfy your exact requirement and flag only when a ‘Free’ changes to either a ‘plus’ or ‘premium’, use this:
SELECT ID,Name,Plan,Date,PrevPlan, CASE WHEN PrevPlan='Free' THEN CASE WHEN Plan IN ('Plus', 'Premium') THEN 'Upgrade' END WHEN PrevPlan IN('Plus', 'Premium') THEN CASE WHEN Plan='Free' THEN 'Downgrade' END END as flag FROM ( SELECT *, LAG(Plan) OVER(PARTITION BY ID ORDER BY Date) AS PrevPlan FROM plans ) AS p;