
Repeat previous values by date in Hive

I have 2 tables, Dates and Data, with data as follows:

Table: Dates

Table: Data

Expected results:

I have tried using a window function with OVER (PARTITION BY ...), but I get duplicate values and not all dates from the Dates table, so I am not getting the desired results. I would really appreciate help with the code in Hive SQL.


Answer

Use a cross join to generate one row for every combination of date and id. Then use a left join to bring in the rows that actually exist in the Data table. Finally, use last_value() to fill in the missing values:
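A minimal sketch of this approach in Hive SQL follows. The table layouts are not shown here, so the column names are assumptions for illustration only: `dates` is taken to have a single date column `dt`, and `data` is taken to have columns `id`, `dt` and `val`. Adjust them to match the real schema.

```sql
-- Column names (dt, id, val) are hypothetical; the real schema may differ.
SELECT d.dt,
       i.id,
       -- Passing true as the second argument makes last_value() skip NULLs,
       -- so each row takes the most recent non-NULL val up to its own date.
       last_value(t.val, true) OVER (
           PARTITION BY i.id
           ORDER BY d.dt
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS val
FROM dates d
CROSS JOIN (SELECT DISTINCT id FROM data) i   -- every date paired with every id
LEFT JOIN data t                              -- NULL val where no row exists
       ON t.id = i.id
      AND t.dt = d.dt
ORDER BY i.id, d.dt;
```

Dates that fall before the first recorded value for an id still come back as NULL, because there is no earlier value to carry forward. Duplicates in the output would most likely mean that `data` holds more than one row per id and date, which this sketch assumes is not the case.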

Passing true as the second argument makes last_value() ignore NULL values, so it will “go back” to the most recent non-NULL value.

User contributions licensed under: CC BY-SA