Skip to content
Advertisement

Flagging amount thresholds reached per user

Story: Two conditions:

  1. I need to calculate the amount of days it takes a user to spend over 20, 50 and 100$ based on his register_date
  2. If someone’s first purchase is > 100$ then the amount of days will be the same for the 3 thresholds.

I was able to get the logic for flagging the thresholds but I’m stuck for users’ first purchase that crosses one or two thresholds.

Objective: I need to calculate the amount of days it takes them to reach over 20, 50 and 100$.

Current query: I might have to change the whole logic to satisfy the second condition but yea, i’m stuck. The following code flags correctly the thresholds being reached.

select
    user_id, register_date
    ,total
    ,order_date
    ,cumulative_sum
    ,threshold
    ,case 
        when threshold+LAG(threshold,1,0) over (partition by user_id order by order_date)=1 then 1
        when threshold+LAG(threshold,1,0) over (partition by user_id order by order_date)=3 then 2
        when threshold+LAG(threshold,1,0) over (partition by user_id order by order_date)=5 then 3
        else 0 end as flag
from (
    select
        user_id, register_date
        ,total
        ,order_date
        ,cumulative_sum
        ,case 
            When cumulative_sum>=100 then 3 
            When cumulative_sum>=50 then 2 
            When cumulative_sum>=20 then 1 else 0 end as threshold
    from (
        select 
            user_id, register_date
            ,(price*quantity) as total
            ,order_date 
            ,SUM(price*quantity) over (partition by user_id order by order_date asc) as cumulative_sum
        from #t1
    ) as base1
) as base2

Data:

CREATE TABLE #t1 (user_id int, price int, quantity int, order_date datetime,register_date datetime)
insert into #t1 values
(1,10,1,'2019-01-01 00:00:00.000','2019-01-01 00:00:00.000'),
(1,15,1,'2019-01-02 00:00:00.000','2019-01-01 00:00:00.000'),
(1,30,1,'2019-01-03 00:00:00.000','2019-01-01 00:00:00.000'),
(1,100,1,'2019-01-04 00:00:00.000','2019-01-01 00:00:00.000'),
(2,60,1,'2019-01-02 00:00:00.000','2019-01-01 00:00:00.000'),
(3,150,1,'2019-01-03 00:00:00.000','2019-01-01 00:00:00.000'),
(4,10,1,'2019-01-04 00:00:00.000','2019-01-01 00:00:00.000'),
(4,20,1,'2019-01-05 00:00:00.000','2019-01-01 00:00:00.000'),
(4,50,2,'2019-01-06 00:00:00.000','2019-01-01 00:00:00.000')

Advertisement

Answer

Just use conditional aggregation and datediff():

select user_id,
       datediff(day, min(register_date),
                min(case when cumulative_sum >= 20 then order_date end)
               ) as days_to_20,
       datediff(day, min(register_date),
                min(case when cumulative_sum >= 50 then order_date end)
               ) as days_to_50,
       datediff(day, min(register_date),
                min(case when cumulative_sum >= 100 then order_date end)
               ) as days_to_100
from (select t.*,
             sum(price * quantity) over (partition by user_id order by order_date asc) as cumulative_sum
      from #t1 t
     ) t
group by user_id;
User contributions licensed under: CC BY-SA
9 People found this is helpful
Advertisement