Skip to content
Advertisement

SQL how to get latest date and update every time it occurs (not using MAX)

I’m trying to get the latest date of starting event before it occurs again.

Data I have:

+-----+------------------+--------+--------+--+
| ID  |       DATE       | REGION | STATUS |  |
+-----+------------------+--------+--------+--+
| 200 | 2020-04-10 6:00  | USA    | START  |  |
| 200 | 2020-04-10 7:00  | USA    | ACTIVE |  |
| 200 | 2020-04-10 13:00 | USA    | STOP   |  |
| 200 | 2020-04-10 15:00 | USA    | START  |  |
| 200 | 2020-04-10 16:00 | USA    | ACTIVE |  |
| 200 | 2020-04-10 19:00 | USA    | ACTIVE |  |
| 200 | 2020-04-10 20:00 | USA    | STOP   |  |
+-----+------------------+--------+--------+--+

What I want to make:

+-----+------------------+--------+--------+------------------+
| ID  |       DATE       | REGION | STATUS | LAST_START_DATE  |
+-----+------------------+--------+--------+------------------+
| 200 | 2020-04-10 6:00  | USA    | START  | 2020-04-10 6:00  |
| 200 | 2020-04-10 7:00  | USA    | ACTIVE | 2020-04-10 6:00  |
| 200 | 2020-04-10 13:00 | USA    | STOP   | 2020-04-10 6:00  |
| 200 | 2020-04-10 15:00 | USA    | START  | 2020-04-10 15:00 |
| 200 | 2020-04-10 16:00 | USA    | ACTIVE | 2020-04-10 15:00 |
| 200 | 2020-04-10 20:00 | USA    | STOP   | 2020-04-10 15:00 |
| 200 | 2020-04-10 20:00 | USA    | STOP   | 2020-04-10 15:00 |
+-----+------------------+--------+--------+------------------+

I want to create another column with the latest start time and record it, then update it whenever it happens, and keep that value from there until start appears again.

What I tried:

SELECT id, date, region, status, last_start_date
 FROM (SELECT t1.*,
             MAX(CASE WHEN status = 'START' THEN date END) OVER (PARTITION BY id,region) as last_start_date
             FROM table1 t1
     ) t1
    GROUP BY id,region,status

This didn’t work since MAX only returns the latest one showing up, which isn’t what I was trying to do:

+-----+------------------+--------+--------+------------------+
| ID  |       DATE       | REGION | STATUS | LAST_START_DATE  |
+-----+------------------+--------+--------+------------------+
| 200 | 2020-04-10 6:00  | USA    | START  | 2020-04-10 15:00 |
| 200 | 2020-04-10 7:00  | USA    | ACTIVE | 2020-04-10 15:00 |
| 200 | 2020-04-10 13:00 | USA    | STOP   | 2020-04-10 15:00 |
| 200 | 2020-04-10 15:00 | USA    | START  | 2020-04-10 15:00 |
| 200 | 2020-04-10 16:00 | USA    | ACTIVE | 2020-04-10 15:00 |
| 200 | 2020-04-10 20:00 | USA    | STOP   | 2020-04-10 15:00 |
+-----+------------------+--------+--------+------------------+

What would be the right approach here to achieve my goal?

Advertisement

Answer

You almost did it – you just need to add order by clause:

DECLARE @DataSource TABLE
(
    [ID] INT
   ,[DATE] DATETIME2(0)
   ,[REGION] CHAR(3)
   ,[STATUS] VARCHAR(12)
);

INSERT INTO @DataSource ([ID], [DATE], [REGION], [STATUS])
VALUES ('200', '2020-04-10 6:00', 'USA', 'START')
      ,('200', '2020-04-10 7:00', 'USA', 'ACTIVE')
      ,('200', '2020-04-10 13:00', 'USA', 'STOP')
      ,('200', '2020-04-10 15:00', 'USA', 'START')
      ,('200', '2020-04-10 16:00', 'USA', 'ACTIVE')
      ,('200', '2020-04-10 19:00', 'USA', 'ACTIVE')
      ,('200', '2020-04-10 20:00', 'USA', 'STOP');

SELECT *
      ,MAX(CASE WHEN status = 'START' THEN date END) OVER (PARTITION BY id,region ORDER BY [DATE])
FROM @DataSource;

enter image description here

User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement