I have a table similar to below:
x
+-------------------------+
¦ ID ¦ Date ¦ Balance ¦
¦----+----------+---------¦
¦ A ¦ 20200620 ¦ 150 ¦
¦ A ¦ 20200621 ¦ -130 ¦
¦ A ¦ 20200621 ¦ -140 ¦
¦ A ¦ 20200621 ¦ -200 ¦
¦ A ¦ 20200622 ¦ 200 ¦
¦ A ¦ 20200622 ¦ 300 ¦
¦ B ¦ 20200621 ¦ 350 ¦
¦ B ¦ 20200621 ¦ 400 ¦
¦ B ¦ 20200621 ¦ -150 ¦
¦ B ¦ 20200622 ¦ -200 ¦
¦ B ¦ 20200622 ¦ -300 ¦
¦ B ¦ 20200623 ¦ -400 ¦
¦ B ¦ 20200623 ¦ -500 ¦
+-------------------------+
I need to calculate the number of consecutive days where “Balance <0” for reach ID and for each distinct Date (including the date itself in the calculation). Each Id might have several balances in a given date either positive amount or negative. Even if one balance amount in a given day is negative, the query should take that day into account. The output result should be similar to the table below:
+--------------------------------------------+
¦ ID ¦ Date ¦ Number_of_Consecutive_Days ¦
¦----+----------+----------------------------¦
¦ A ¦ 20200620 ¦ Null ¦
¦----+----------+----------------------------¦
¦ A ¦ 20200621 ¦ 1 ¦
¦----+----------+----------------------------¦
¦ A ¦ 20200622 ¦ 1 ¦
¦----+----------+----------------------------¦
¦ B ¦ 20200621 ¦ Null ¦
¦----+----------+----------------------------¦
¦ B ¦ 20200622 ¦ 2 ¦
¦----+----------+----------------------------¦
¦ B ¦ 20200623 ¦ 3 ¦
+--------------------------------------------+
Could you please suggest me a way to calculate that? It is highly appreciated.
Advertisement
Answer
Note that any gaps between dates will be treated as consecutive days.
with data as (
select id, date,
case when min(balance) >= 0 then 0 else 1 end as tally,
sum(case when min(balance) >= 0 then 1 else 0 end)
over (partition by id order by date) as grp
from t
group by id, date
)
select id, date,
sum(tally) over (partition by id, grp, tally order by date) as running_days
from data
order by id, date;
To treat missing dates as nonconsecutive try:
sum(case when min(balance) >= 0 then 1 else 0 end)
over (partition by id order by date) +
datediff(day, min(date) over (partition by id), date) -
row_number() over (partition by id order by date) + 1 as grp