Skip to content
Advertisement

Query to find active days per year to find revenue per user per year

I have 2 dimension tables and 1 fact table as follows:

user_dim

user_id user_name user_joining_date
1 Steve 2013-01-04
2 Adam 2012-11-01
3 John 2013-05-05
4 Tony 2012-01-01
5 Dan 2010-01-01
6 Alex 2019-01-01
7 Kim 2019-01-01

bundle_dim

bundle_id bundle_name bundle_type bundle_cost_per_day
101 movies and TV prime 5.5
102 TV and sports prime 6.5
103 Cooking prime 7
104 Sports and news prime 5
105 kids movie extra 2
106 kids educative extra 3.5
107 spanish news extra 2.5
108 Spanish TV and sports extra 3.5
109 Travel extra 2

plans_fact

user_id bundle_id bundle_start_date bundle_end_date
1 101 2019-10-10 2020-10-10
2 107 2020-01-15 (null)
2 106 2020-01-15 2020-12-31
2 101 2020-01-15 (null)
2 103 2020-01-15 2020-02-15
1 101 2020-10-11 (null)
1 107 2019-10-10 2020-10-10
1 105 2019-10-10 2020-10-10
4 101 2021-01-01 2021-02-01
3 104 2020-02-17 2020-03-17
2 108 2020-01-15 (null)
4 102 2021-01-01 (null)
4 103 2021-01-01 (null)
4 108 2021-01-01 (null)
5 103 2020-01-15 (null)
5 101 2020-01-15 2020-02-15
6 101 2021-01-01 2021-01-17
6 101 2021-01-20 (null)
6 108 2021-01-01 (null)
7 104 2020-02-17 (null)
7 103 2020-01-17 2020-01-18
1 102 2020-12-11 (null)
2 106 2021-01-01 (null)
7 107 2020-01-15 (null)

note: NULL bundle_end_date refers to active subscription.

user active days can be calculated as: bundle_end_date - bundle_start_date (for the given bundle) total revenue per user could be calculated as : total no. of active days * bundle rate per day

I am looking to write a query to find revenue generated per user per year.

Here is what I have for the overall revenue per user:

select pf.user_id
    , sum(datediff(day, pf.bundle_start_date, coalesce(pf.bundle_end_date, getdate())) * bd.price_per_day) total_cost_per_bundle
from plans_fact pf
inner join bundle_dim bd on bd.bundle_id = pf.bundle_id
group by pf.user_id
order by pf.user_id;

Advertisement

Answer

You need a ‘year’ table to help parse out each multi-year spanning row into it’s seperate years. For each year, you need to also recalculate the start and end dates. That’s what I do in the yearParsed cte in the code below. I hard code the years into the join statement that creates y. You probably will do it different but however you get those values will work.

After that, pretty much sum as you did before, just adding the year column to your grouping.

Aside from that, all I did was move the null coalesce logic to the cte to make the overall logic simpler.

with yearParsed as (

    select      pf.*,
                y.year,
                startDt = iif(pf.bundle_start_date > y.startDt, pf.bundle_start_date, y.startDt),
                endDt = iif(ap.bundle_end_date < y.endDt, ap.bundle_end_date, y.endDt)
    from        plans_fact pf
    cross apply (select bundle_end_date = isnull(pf.bundle_end_date, getdate())) ap
    join        (values 
                    (2019, '2019-01-01', '2019-12-31'), 
                    (2020, '2020-01-01', '2020-12-31'), 
                    (2021, '2021-01-01', '2021-12-31')
                ) y (year, startDt, endDt)
                    on pf.bundle_start_date <= y.endDt
                    and ap.bundle_end_date >= y.startDt

) 

select      yp.user_id, 
            yp.year,
            total_cost_per_bundle = sum(datediff(day, yp.startDt, yp.endDt) * bd.bundle_cost_per_day) 
from        yearParsed yp
join        bundle_dim bd on bd.bundle_id = yp.bundle_id
group by    yp.user_id,
            yp.year
order by    yp.user_id,
            yp.year;

Now, if this is common, you should probably create a base-table for your ‘year’ table. But if it’s not common, but for this report you don’t want to have to keep coming back to hard-code the year information into the y table, you can do this:

declare @yearTable table (
    year int,
    startDt char(10),
    endDt char(10)
);

with y as (

    select      year = year(min(pf.bundle_start_date))
    from        @plans_fact pf

    union all
    select      year + 1
    from        y 
    where       year < year(getdate())

)

insert      @yearTable
select      year, 
            startDt = convert(char(4),year) + '-01-01',
            endDt = convert(char(4),year) + '-12-31'
from        y;

and it will create the appropriate years for you. But you can see why creating a base table may be preferred if you have this or a similar need often.

User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement