Below is a sample dataset showing TV sessions of each TV set of each household. Household “111” switch on their TV “1” at 500 and switch it off at 570. However, this has been captured in the data as 2 separate rows. You will have to write a query to convert this into a single row. Similar modification needs to be made to all other subsequent occurrences. Please note that a single valid TV session can be split into more than 2 rows as well (As shown by rows 5-8).
Input : Table [session]
Household_ID TV_Set_ID Start_time End_time 111 1 500 550 111 1 550 570 111 1 590 620 111 1 650 670 111 2 660 680 111 2 680 700 111 2 700 750 111 2 750 770 112 2 1050 1060 113 1 1060 1080 113 1 1080 1100 113 1 1100 1120 113 1 1500 1520
Expected Output :-
Household_ID TV_Set_ID Start_time End_time 111 1 500 570 111 1 590 620 111 1 650 670 111 2 660 770 112 2 1050 1060 113 1 1060 1120 113 1 1500 1520
I tried to find the lead time and find the difference and calculate the difference between that and the End time and thought I could group by but then that logic wont work since we dont just want the start and end time but even the gaps in the sessions. I’m stuck with the logic. Could someone tell how to proceed further ?
with result as ( select Household_ID, TV_Set_ID, Start_time, End_time, lead(Start_time) over (partition by Household_ID, TV_Set_ID order by Household_ID, TV_Set_ID) as lead_start from session ) select *,lead_start - End_time as diff from result ;
Advertisement
Answer
Here is a way to get this done
In the data block i create groups which is defined as any record whose previous end_time doenst match with my start_time and assign a group_number to it if its different, else i keep it same.
After that in the main block i group by this group_number, along with the household_id,tv_set_id to get the results.
with data as ( select * ,case when lag(end_time) over(partition by household_id,tv_set_id order by end_time) <> start_time then sum(1) over(partition by household_id,tv_set_id order by end_time) else sum(0) over(partition by household_id,tv_set_id order by end_time) end as group_number from t ) select household_id ,tv_set_id ,min(start_time) as start_time ,max(end_time) as end_time from data group by household_id,tv_set_id,group_number +--------------+-----------+------------+----------+ | household_id | tv_set_id | start_time | end_time | +--------------+-----------+------------+----------+ | 111 | 1 | 500 | 570 | | 111 | 1 | 590 | 620 | | 111 | 1 | 650 | 670 | | 111 | 2 | 660 | 770 | | 112 | 2 | 1050 | 1060 | | 113 | 1 | 1060 | 1120 | | 113 | 1 | 1500 | 1520 | +--------------+-----------+------------+----------+
db fiddle link https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=ba5ade186ebc3cf693c505d863691670