I have 2 tables as below:
Table 1:
UserID DateEntered Status 1 29/03/2021 a 1 30/03/2021 b 1 31/03/2021 a 2 01/04/2021 a 3 02/04/2021 c 2 03/04/2021 c 3 04/04/2021 a 4 05/04/2021 b 4 06/04/2021 a
Table 2 : A simple calendar table
Required output: For every date I want to return the latest status on or before it for every userID.
UserID DateEntered Status Calendar_date 1 29/03/2021 a 25/03/2021 1 29/03/2021 a 26/03/2021 1 29/03/2021 a 27/03/2021 1 29/03/2021 a 28/03/2021 1 29/03/2021 a 29/03/2021 1 30/03/2021 b 30/03/2021 1 31/03/2021 a 31/03/2021 1 31/03/2021 a 01/04/2021 1 31/03/2021 a 02/04/2021 1 31/03/2021 a 03/04/2021
Similarly for other calendar dates and other userIds.
Anyone has a clue how to do it? Thanks in advance!
Advertisement
Answer
For every date I want to return the latest status on or before it for every userID.
In standard SQL, you can generate the rows using cross join and then use a join. The key idea is to use lead() for the join:
select u.userId, c.calendar_date, t1.status, t1.dateEntered
from (select distinct userId from table1) u cross join
table2 c left join
(select t1.*,
lead(DateEntered) over (partition by userId order by DateEntered) as next_DateEntered
from table1 t1
) t1
on u.userId = t1.userId and
c.calendar_date >= t1.DateEntered and
(c.calendar_date < next_DateEntered or next_DateEntered is null);
In SQL Server, you can also express this using outer apply:
select u.userId, c.calendar_date, t1.status, t1.dateEntered
from (select distinct userId from table1) u cross join
table2 c outer apply
(select top (1) t1.*
from table1 t1
where t1.userId = u.userId and
t1.DateEntered <= c.calender_Date
order by t1.DateEntered desc
) t1;
Here is a db<>fiddle.