Skip to content
Advertisement

How to pull a list of all visitor_ids that generated more than $500 combined in their first two sessions in the month of January 2020?

Tables:

  1. Sessions

    • session_ts
    • visitor_id
    • vertical
    • session_id
  2. Transactions

    • session_ts
    • session_id
    • rev_bucket
    • revenue

Currently have the following query (using SQLite):

SELECT
  s.visitor_id,
  sub.session_id,
  month,
  year,
  total_rev,
  CASE 
    WHEN (row_num IN (1,2) >= total_rev >= 500) THEN 'Yes'
    ELSE 'No' END AS High_Value_Transactions,
  sub.row_num
FROM
  sessions s
JOIN
  (
    SELECT
    s.visitor_id,
    t.session_id,
    strftime('%m',t.session_ts) as month,
    strftime('%Y',t.session_ts) as year,
    SUM(t.revenue) as total_rev,
    row_number() OVER(PARTITION BY s.visitor_id ORDER BY s.session_ts) as row_num
  FROM
    Transactions t
  JOIN
    sessions s
  ON
    s.session_id = t.session_id
  WHERE strftime('%m',t.session_ts) = '01'
  AND strftime('%Y',t.session_ts) = '2020'
  GROUP BY 1,2
) sub
ON
  s.session_id = sub.session_id
WHERE sub.row_num IN (1,2)
ORDER BY 1

I’m having trouble identifying the first two sessions that combine for $500. Open to any feedback and simplifying of query. Thanks!

Advertisement

Answer

You can use window functions and aggregation:

select visitor_id, sum(t.revenue) total_revenue
from (
    select 
        s.visitor_id, 
        t.revenue, 
        row_number() over(partition by s.visitor_id order by t.session_ts) rn
    from transactions t
    inner join sessions s on s.session_id = t.session_id
    where t.session_ts >= '2020-01-01' and t.session_ts < '2020-02-01'
) t
where rn <= 2
group by visitor_id
having sum(t.revenue) >= 500

The subquery joins the two tables, filters on the target month (note that using half-open interval predicates is more efficient than applying date functions on the date column), and ranks each row within groups of visits of the same customer.

Then, the outer query filters on the first two visits per visitor, aggregates by visitor, computes the corresponding revenue, and filters it with a having clause.

User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement