Lets say I have these date values in my table:
#1 2019-01-01 #2 2019-02-01 #3 2019-03-01 #4 2019-05-01 #5 2019-06-01 #6 2019-06-15 #7 2019-07-01
I need to keep only the dates that are 2 months appart from the previous “good” one.
So, I start with:
#1 is the first one, I keep it as a good one.
#2 is only one month appart, so not good.
#3 is two months appart from #1 (I ignore #2 because it was not good).
#4 is two months appart from #3, so I keep it
#5 is not good because it is only one month from #4
#6 is not good because it is only one and a half month from #4 (#5 is ignored because it was not good).
#7 is good because it is two months appart from #4 which was the last good one.
Is there an easy, clean way to do that?
I started with a dense_rank() over and compare them with the previous rank but I fail to figure out how to ignore the bad dates.
Advertisement
Answer
This is an iterative process. You can solve it using a recursive CTE. Given that you are dealing with dates and months, your data is not too large, so this might be a reasonable solution.
Date arithmetic varies significantly among databases. The following is the idea:
with recursive t as ( select t.*, row_number() over (order by datecol) as seqnum from mytable t ) t, cte as ( select datecol, seqnum, datecol as refdate from t where seqnum = 1 union all select t.datecol, t.segnum, (case when t.datecol >= add_months(cte.refdate, 2) then t.datecol else cte.refdate end) from cte join t on t.seqnum = cte.seqnum + 1 ) select distinct refdate from cte;