Skip to content
Advertisement

Impala: Split single row into multiple rows based on Date and time

I want to split a single row into multiple rows based on time.

Expected output is below:

Day start from 00:00 AM to next day 00:00 AM. When EndDate time is greater than 00:00 AM (midnight) then split this date in two rows. First row end date is 30/03/2020 11:59:00 and next row start 31/03/2020 00:00:00.

Please help me to get is solved.

Advertisement

Answer

This would be a good spot for a recursive CTE, but unfortunatly Hive does not support those. Here is another aproach, that uses a derived table of numbers to split the periods:

You can expand the subquery to handle more possible periods per row.

Also note that this generates half-open intervals, where the end of the preceding interval is equal to the start of the next one (while in your resultset there is a one minute lag). The logic is that the interval is inclusive on its smaller bound and exclusive on the the outer bound (that way, you make sure to not leave any gap).

User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement