First time post, but I’ve stumbled across Stackoverflow for years.
A patient might get a particular lab test many many times in a span of a few months. However, to reduce duplication, I only want to capture these results once every 45 days. So if a patient had a lab test today, I can ignore the next 45 days’ worth of lab tests. But if a patient gets a repeat lab test, say, on day 51, I want to reset the “clock” and ignore the next 45 days’ worth of lab tests.
Here’s generally what I’m trying to do:
PatientID Date DayNum KeepThis? DaysSinceLastKeptTest
1 3/2/2020 0 Yes 0
1 3/5/2020 3 3
1 4/6/2020 35 35
1 4/10/2020 39 39
1 5/15/2020 74 Yes 0 (more than 45 days since 3/2 test)
1 5/25/2020 84 10
1 5/30/2020 89 15
1 6/6/2020 96 22
1 7/1/2020 121 Yes 0 (more than 45 days since 5/15 test)
1 7/4/2020 124 3
1 7/15/2020 135 14
(in my final product, I don’t necessarily need the DayNum, KeepThis, or DaysSinceLastKeptTest columns. I just need the 3 “Yes” rows.)
I’m trying to do this in SQL, but can’t figure it out. I’ve thought about temporary tables or subqueries, and I’m just learning about OVER/PARTITION BY as a possible way to tackle this.
I appreciate any help or ideas you may have. Thanks!
Advertisement
Answer
With a recursive CTE you can do it:
with cte_rn
as
(
select *,ROW_NUMBER() over (partition by patientId order by testdate ) as seq
from patients
),
cte_recursive
as
(
select *,cte_rn.testdate as usedTestDate,1 as toUse
from cte_rn
where seq=1
union all
select rn.patientid,rn.testdate,rn.seq,case when datediff(day ,r.usedTestDate,rn.testdate)>45 then rn.testdate else r.usedTestDate end as usedTestDate,case when datediff(day ,r.usedTestDate,rn.testdate)>45 then 1 else 0 end as toUse
from cte_recursive r
join cte_rn rn
on r.seq = rn.seq-1
)
select *
from cte_recursive