I have 2 tables as below:
Table 1:
UserID DateEntered Status 1 29/03/2021 a 1 30/03/2021 b 1 31/03/2021 a 2 01/04/2021 a 3 02/04/2021 c 2 03/04/2021 c 3 04/04/2021 a 4 05/04/2021 b 4 06/04/2021 a
Table 2 : A simple calendar table
Required output: For every date I want to return the latest status on or before it for every userID.
UserID DateEntered Status Calendar_date 1 29/03/2021 a 25/03/2021 1 29/03/2021 a 26/03/2021 1 29/03/2021 a 27/03/2021 1 29/03/2021 a 28/03/2021 1 29/03/2021 a 29/03/2021 1 30/03/2021 b 30/03/2021 1 31/03/2021 a 31/03/2021 1 31/03/2021 a 01/04/2021 1 31/03/2021 a 02/04/2021 1 31/03/2021 a 03/04/2021
Similarly for other calendar dates and other userIds.
Anyone has a clue how to do it? Thanks in advance!
Advertisement
Answer
For every date I want to return the latest status on or before it for every userID.
In standard SQL, you can generate the rows using cross join
and then use a join. The key idea is to use lead()
for the join
:
select u.userId, c.calendar_date, t1.status, t1.dateEntered from (select distinct userId from table1) u cross join table2 c left join (select t1.*, lead(DateEntered) over (partition by userId order by DateEntered) as next_DateEntered from table1 t1 ) t1 on u.userId = t1.userId and c.calendar_date >= t1.DateEntered and (c.calendar_date < next_DateEntered or next_DateEntered is null);
In SQL Server, you can also express this using outer apply
:
select u.userId, c.calendar_date, t1.status, t1.dateEntered from (select distinct userId from table1) u cross join table2 c outer apply (select top (1) t1.* from table1 t1 where t1.userId = u.userId and t1.DateEntered <= c.calender_Date order by t1.DateEntered desc ) t1;
Here is a db<>fiddle.