I am trying to find all records that exist within a date range prior to an event occurring. In my table below, I want to pull all records that are 3 days or less from when the switch field changes from 0 to 1, ordered by date, partitioned by product. My solution does not work, it includes the first record when it should skip as it’s outside the 3 day window. I am scanning a table with millions of records, is there a way to reduce the complexity/cost while maintaining my desired results?
http://sqlfiddle.com/#!18/eebe7
CREATE TABLE productlist ([product] varchar(13), [switch] int, [switchday] date) ; INSERT INTO productlist ([product], [switch], [switchday]) VALUES ('a', 0, '2019-12-28'), ('a', 0, '2020-01-02'), ('a', 1, '2020-01-03'), ('a', 0, '2020-01-06'), ('a', 0, '2020-01-07'), ('a', 1, '2020-01-09'), ('a', 1, '2020-01-10'), ('a', 1, '2020-01-11'), ('b', 1, '2020-01-01'), ('b', 0, '2020-01-02'), ('b', 0, '2020-01-03'), ('b', 1, '2020-01-04') ;
my solution:
with switches as ( SELECT *, case when lead(switch) over (partition by product order by switchday)=1 and switch=0 then 'first day switch' else null end as leadswitch from productlist ), switchdays as ( select * from switches where leadswitch='first day switch' ) select pl.* ,'lead' from productlist pl left join switchdays ss on pl.product=ss.product and pl.switchday = ss.switchday and datediff(day, pl.switchday, ss.switchday)<=3 where pl.switch=0
desired output, capturing records that occur within 3 days of a switch going from 0 to 1, for each product, ordered by date:
product switch switchday a 0 2020-01-02 lead a 0 2020-01-06 lead a 0 2020-01-07 lead b 0 2020-01-02 lead b 0 2020-01-03 lead
Advertisement
Answer
If I understand correctly, you can just use lead()
twice:
select pl.* from (select pl.*, lead(switch) over (partition by product order by switchday) as next_switch_1, lead(switch, 2) over (partition by product order by switchday) as next_switch_2 from productlist pl ) pl where switch = 0 and 1 in (next_switch_1, next_switch_2);
Here is a db<>fiddle.
EDIT (based on comment):
select pl.* from (select pl.*, min(case when switch = 1 then switchdate end) over (partition by product order by switchdate desc) as next_switch_1_day from productlist pl ) pl where switch = 0 and next_switch_one_day <= dateadd(day, 2, switchdate);