I have 2 tables as below:
Table 1:
x
UserID DateEntered Status
1 29/03/2021 a
1 30/03/2021 b
1 31/03/2021 a
2 01/04/2021 a
3 02/04/2021 c
2 03/04/2021 c
3 04/04/2021 a
4 05/04/2021 b
4 06/04/2021 a
Table 2 : A simple calendar table
Required output: For every date I want to return the latest status on or before it for every userID.
UserID DateEntered Status Calendar_date
1 29/03/2021 a 25/03/2021
1 29/03/2021 a 26/03/2021
1 29/03/2021 a 27/03/2021
1 29/03/2021 a 28/03/2021
1 29/03/2021 a 29/03/2021
1 30/03/2021 b 30/03/2021
1 31/03/2021 a 31/03/2021
1 31/03/2021 a 01/04/2021
1 31/03/2021 a 02/04/2021
1 31/03/2021 a 03/04/2021
Similarly for other calendar dates and other userIds.
Anyone has a clue how to do it? Thanks in advance!
Advertisement
Answer
For every date I want to return the latest status on or before it for every userID.
In standard SQL, you can generate the rows using cross join
and then use a join. The key idea is to use lead()
for the join
:
select u.userId, c.calendar_date, t1.status, t1.dateEntered
from (select distinct userId from table1) u cross join
table2 c left join
(select t1.*,
lead(DateEntered) over (partition by userId order by DateEntered) as next_DateEntered
from table1 t1
) t1
on u.userId = t1.userId and
c.calendar_date >= t1.DateEntered and
(c.calendar_date < next_DateEntered or next_DateEntered is null);
In SQL Server, you can also express this using outer apply
:
select u.userId, c.calendar_date, t1.status, t1.dateEntered
from (select distinct userId from table1) u cross join
table2 c outer apply
(select top (1) t1.*
from table1 t1
where t1.userId = u.userId and
t1.DateEntered <= c.calender_Date
order by t1.DateEntered desc
) t1;
Here is a db<>fiddle.