Skip to content
Advertisement

Apply SUM( where date between date1 and date2)

My table is currently looking like this:

I’m trying to sum values of ‘ON_Prepaid’ over the course of 7 days, let’s say from ‘2020-01-01’ to ‘2020-01-07’. Here is what I’ve tried

Things should be fine if the dates are continuous; however, there are some missing days in ‘Pre_Date’ (you know banks don’t work on Sundays, etc.).

So I’m trying to work on something like

something like that. So if per se there’s no record on 2020-01-05, the result should only sum the dates on the 1,2,3,4,6,7 of Jan 2020, instead of 1,2,3,4,6,7,8 (8 because of “rows 7 following”). Or for example I have missing records over the span of 30 days or something, then all those 30 should be summed as 0s. So 45 days should return only the value of 15 days. I’ve tried looking up all over the forum and the answers did not suffice. Can you guys please help me out? Or link me to a thread which the problem had already been solved.

Thank you so much.

Advertisement

Answer

Things should be fine if the dates are continuous

Then make them continuous. Left join your real data (grouped up so it is one row per day) onto your calendar table (make one, or use a recursive cte to generate you a list of 360 dates from X hence) and your query will work out

CAL is just a table with a single column of dates, one per day, no time, extending for n thousand days into the past/future

Wish to note that months have variable number of days so 6M is a bit of a misnomer.. might be better to call the month ones 180D, 90D etc

Also want to point out that your query performs a per row division of your data into into groups. If you want to perform sums up to 180 days after the date of the row you need to pull a year’s worth of data so that on row 180(June) you have the December data available to sum (dec being 6 months from June)

If you then want to restrict your query to only showing up to June (but including data summed from 6 months after June) you need to wrap it all again in a sub query. You cannot “where between jan and jun” in the query that does the sum over because where clauses are done before window clauses (doing so will remove the dec data before it is summed)

Some other databases make this easier, Oracle and Postgres spring to mind; they can perform sum in a range where the other rows values are within some distance of the current row’s values. SQL server only usefully supports distancing based on a row’s index rather than its values (the distancing-based-on-values support is limited to “rows that have the same value”, rather than “rows that have values n higher or lower than the current row”). I suppose the requirement could be met with a cross apply, or a coordinated sub in the select, though I’d be careful to check the performance..

User contributions licensed under: CC BY-SA
10 People found this is helpful
Advertisement