We ran a promotion where users can receive their first subscription order free. Price = $0.00 when a user uses the promo. I am interested in the data from Example A.
Example A – User 50 started with the promo and continued for two months
order_id  user_id    price    created_at
    1        50       0.00    2018-01-15
    5        50      20.00    2018-02-15
    9        50      20.00    2018-03-15
Example B – User 100 was already an active subscriber who cancelled his account and reactivated with the promo, I do not wish to count him
order_id  user_id    price    created_at
    2        100      20.00    2018-01-16
    3        100       0.00    2018-01-17
    7        100      20.00    2018-02-17
–Here is my query–
This returns all users who have multiple orders
WHERE at least one of their orders has a price = 0.00
-This dataset returns example A and example B
–My question–
Most of this data is correct (Example A) but a handful of them I want to omit because they are skewing my data (Example B). I want to remove Example B users.
I want to remove people who’s first order was not the promo.
How can I request that their FIRST order had a price = 0.00? I was thinking something with min(created_at)?
Advertisement
Answer
You can get the time of the first order using:
select user_id, min(created_at) as min_ca from t group by user_id;
Next, you can get the price of the first order using:
select oi.*
from order_items oi join
     (select user_id, min(created_at) as min_ca
      from order_items oi
      group by user_id
     ) ooi
     on oi.user_id = ooi.user_id and oi.created_at = ooi.min_ca
where oi.price = 0.00;
Then you can get all records, using join, in, or exists;
select oi.*
from order_items oi join
     order_items oi1
     on oi.user_id = oi1.user_id join
     (select user_id, min(created_at) as min_ca
      from order_items oi
      group by user_id
     ) u1
     on oi1.user_id = u1.user_id and oi1.created_at = u1.min_ca
where oi1.price = 0.00;
