Skip to content
Advertisement

How to retain the first record while using Window function LAG()?

I have a data which looks like –

Actual Table –

VIN Mode Status Start End
ABC123456789 Mode 1 Waiting for Auth 01/01/2010 00:00:00 05/05/2014 14:54:54
ABC123456789 Mode 1 Waiting for URL 05/05/2014 14:54:54 05/13/2014 19:09:51
ABC123456789 Mode 1 Waiting for User 05/13/2014 19:09:51 11/13/2014 22:26:32
ABC123456789 Mode 1 Authorized 11/13/2014 22:26:32 11/13/2014 22:31:00
ABC123456789 Mode 1 Authorized 11/13/2014 22:31:00 11/14/2014 01:23:56
ABC123456789 Mode 2 Waiting for User 11/14/2014 01:23:56 11/18/2014 19:38:51
ABC123456789 Mode 2 Waiting for User 11/18/2014 19:38:51 11/18/2014 19:38:54
ABC123456789 Mode 2 Waiting for User 11/18/2014 19:38:54 11/18/2014 20:07:52
ABC123456789 Mode 2 Authorized All 11/18/2014 20:07:52 12/17/2014 19:22:50
ABC123456789 Mode 2 Authorized All 12/17/2014 19:22:50 02/25/2015 20:03:44
ABC123456789 Mode 2 Authorized All 02/25/2015 20:03:44 02/25/2015 20:03:48
ABC123456789 Mode 3 Authorized All 02/25/2015 20:03:48 02/25/2015 20:14:05
ABC123456789 Mode 3 Revoke Auth 02/25/2015 20:14:05 02/25/2015 20:14:29
ABC123456789 Mode 3 Waiting for Auth 02/25/2015 20:14:29 02/25/2015 20:40:21

I am using the below window function query to get the expected result as shown in the result output table. But I am not able to retain the first row. How to achieve it?

Hive Query –

WITH mma AS
(
select VIN, 
       Mode, 
       Status,
       case when lower(Status) like '%authorized%' then 'Authorized' else 'Deauthorized' end Event,
       Start, 
       End 
  from ModemAuth 
 where VIN = 'ABC123456789'
 order by Start
 )

select mma2.*
from (select mma.*,
             lag(event) over (partition by VIN order by Start) as Prev_Event
        from mma
     ) mma2
where Prev_Event <> Event

Expected Result

VIN Mode Status Event Start End
ABC123456789 Mode 1 Waiting for Auth Deauthorized 01/01/2010 00:00:00 05/05/2014 14:54:54
ABC123456789 Mode 1 Authorized Authorized 11/13/2014 22:26:32 11/13/2014 22:31:00
ABC123456789 Mode 2 Waiting for User Deauthorized 11/14/2014 01:23:56 11/18/2014 19:38:51
ABC123456789 Mode 2 Authorized All Authorized 11/18/2014 20:07:52 12/17/2014 19:22:50
ABC123456789 Mode 3 Revoke Auth Deauthorized 02/25/2015 20:14:05 02/25/2015 20:14:29

Advertisement

Answer

Your where condition is not correct. It should be:

where Prev_Event is null or prev_event <> Event

This is the logic I put on your previous, similar question. You can accept that answer.

User contributions licensed under: CC BY-SA
8 People found this is helpful
Advertisement