Skip to content
Advertisement

date window function

Dont know how to solve the problem.May be you can show right direction or give a link. I have a table:

    id  Date
    23  01.01.2020
    23  03.01.2020
    23  04.01.2020
    56  07.01.2020
    56  08.01.2020
    87  11.01.2020
    23  12.01.2020
    23  18.01.2020

I want to aggregate data (id, Date_min) and add new column like this one:

id  Date_min        Date_new
23  01.01.2020  07.01.2020
56  07.01.2020  11.01.2020
87  11.01.2020  12.01.2020
23  12.01.2020  18.01.2020

In column Data_new I want to see next user’s first date. If there is no next user, add user`s max date

Advertisement

Answer

LEAD will give you the next date, but we also have the slight sticking problem that your ID repeats, so we need something to make the second 23 distinct from the first. For that I guess we can establish a counter that ticks up every time the ID changes:

with a as(        
    select '23' as id, '01.01.2020' as "date" union all
    select '23' as id, '03.01.2020' as "date" union all
    select '23' as id, '04.01.2020' as "date" union all
    select '56' as id, '07.01.2020' as "date" union all
    select '56' as id, '08.01.2020' as "date" union all
    select '87' as id, '11.01.2020' as "date" union all
    select '23' as id, '12.01.2020' as "date" union all
    select '23' as id, '18.01.2020' as "date"
), b as (
  SELECT *, LAG(id) OVER(ORDER BY "date") as last_id FROM a
), c AS(
  SELECT *, 
  LEAD("date") OVER(ORDER BY "date") as next_date, 
  SUM(CASE WHEN last_id <> id THEN 1 ELSE 0 END) OVER(ORDER BY "date" ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) id_ctr
  FROM b
)

SELECT id, MIN("date"), MAX(next_date)
FROM c
GROUP BY id, id_ctr

I haven’t got a PG instance to test this on, but it works in SQLS and I’m pretty sure that PG supports everything that SQLS does – there isn’t any SQLS specific stuff here

  • a takes the place of your table – you can drop it from your query and just straight d a with b as (select... from yourtablenamehere)
  • b calculates the previous ID; we’ll use this to detect if the id has changed between current row and prev row. If it changes we’ll put a 1 otherwise a 0. When these are summed as a running total it effectively means the counter ticks up every time the ID changes, so we can group by this counter as well as the ID to split our two 23s apart. We need to do this separately because window functions can’t be nested
  • c takes the last_id and does the running total. It also does the next_date with a simple window function that pulls the date from the following row (rows ordered by date). the ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW is techincally unnecessary as it’s the default action for a SUM OVER ORDERBY, but I find being explicit helps document/change if needed
  • then all that is required is to select the id, min date and max next_date, but throw the counter in there too to split the 23s up – you’re allowed to group by more columns than you select but not the other way round
User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement