Skip to content
Advertisement

Merging Respective Start and End Dates, Setting Flag Depending on Start/End Date – SQL

Problem: I have a table of transactions (see below) with either (open/start) or (close/end) transaction in the format of date. The task is to merge these transactions with their corresponding dates, however, there can be also cases when the transaction is opened/started, but not closed/ended, in which case only Start Date must be shown, and the Flag ‘Y’ assigned. The rest of cases where there is an Enddate, the flag will be ‘N’. I have been able to set a flag depending on the values with first_value(Flag IGNORE NULLS) over(partition by Customer_ID order by EndDate desc), but Im having a bit harder time with matching the start with end dates.

Logic:

  • There can be multiple start & end dates per one day with only time differences.
  • There can be just one start & end date transaction per customer per day.
  • There can be just start date (open) transaction per customer per day.
  • Flag ‘N’ only for closed transactions, ‘Y’ for open i.e. no end date.

Current set example looks like this: Note: for this particular customer we have equal amount of transactions (inserts & deletes so I know the flag will be ‘N’, when we map start and enddates, but the solution also has to work when there are more start dates than enddates for other customers, where the final (newest) start date row will have a flag of ‘Y’.

Customer_Id Start Date End Date Flag
111 02/07/2020 20:58:32.000000 N
111 02/07/2020 19:18:04.000000 N
111 01/06/2020 09:38:49.000000 N
111 01/06/2020 09:36:34.000000 N
111 29/05/2020 16:58:07.000000 N
111 02/07/2020 20:57:52.000000 N
111 02/07/2020 19:17:22.000000 N
111 29/05/2020 16:58:06.000000 N
111 01/06/2020 09:38:40.000000 N
111 01/06/2020 09:36:34.000000 N

Expected result:

Customer_Id Start Date End Date Flag
111 02/07/2020 20:57:52.000000 02/07/2020 20:58:32.000000 N
111 02/07/2020 19:17:22.000000 02/07/2020 19:18:04.000000 N
111 01/06/2020 09:38:40.000000 01/06/2020 09:38:49.000000 N
111 01/06/2020 09:36:34.00000 01/06/2020 09:36:34.000000 N
111 29/05/2020 16:58:06.000000 29/05/2020 16:58:07.000000 N

Partial solution that works for the above example:

select customer_id
  ,start_date
  ,lead(end_date) -- find the next row's end date
   over (partition by customer_id
         order by coalesce(start_date, end_date)) as new_end
  ,case when new_end is null then 'Y' else 'N' end as flag
from tab
qualify start_date is not null -- only return starting rows
order by 1,2;

However, in case there are more start dates than end dates (e.g. several closed transactions, one open (can be not only the latest/newest one, but also somewhere in the middle..)) it will not work. E.g. in case a transaction is open: start date 12/01/2019 09:36:34.00000 but no end date then the lead() skips over this null enddate value and assigns the next available end date, leaving the solution looking like this:

Customer_Id Start Date End Date Flag
111 02/07/2020 20:57:52.000000 02/07/2020 20:58:32.000000 N
111 02/07/2020 19:17:22.000000 02/07/2020 19:18:04.000000 N
111 01/06/2020 09:38:40.000000 01/06/2020 09:38:49.000000 N
111 12/01/2019 09:36:34.00000 18/11/2019 16:58:07.000000 N
111 18/11/2019 16:58:06.000000 NULL NULL

When it should look like this:

Customer_Id Start Date End Date Flag
111 02/07/2020 20:57:52.000000 02/07/2020 20:58:32.000000 N
111 02/07/2020 19:17:22.000000 02/07/2020 19:18:04.000000 N
111 01/06/2020 09:38:40.000000 01/06/2020 09:38:49.000000 N
111 12/01/2019 09:36:34.00000 NULL Y
111 18/11/2019 16:58:06.000000 18/11/2019 16:58:07.000000 N

How to find a solution that fits both cases?

Advertisement

Answer

This will return the matching end date for each start date.

select customer_id
  ,start_date
  ,lead(end_date) -- find the next row's end date
   over (partition by customer_id
         order by coalesce(start_date, end_date)) as new_end
  ,case when new_end is null then 'Y' else 'N' end as flag
from tab
qualify start_date is not null -- only return starting rows
order by 1,2;
User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement