
How do I find the first value in each 3-month window, counting back from the last date, in Hive?

I have a table like the one below.

I need to get the first Refresh_value (based on Refresh_date) in each 3-month window, counting back from the last date. The output should have 2 additional columns, Group and Refresh_Value_Min: Refresh_Value_Min holds the first value in each 3-month window, and Group says which window each date falls into.

Expected output

I tried the code below, which gives the value from the third-last month on the current row, but I need output like the one above.

Can someone help with this?

Please let me know if there are any questions.


Answer

Let me explain the approach (tiny details might differ):

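  1. Get the last date (the overall maximum Refresh_date) on each row
  2. Get the difference in months between the last date and the row's Refresh_date and divide it by 3 (integer division); this gives you the group number
  3. Find the first Refresh_Value within each group, ordered by Refresh_date.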
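
The three steps above could be sketched in Hive roughly as follows. This is only an illustration under assumptions: `my_table` is a hypothetical table name, Refresh_date is assumed to be a DATE (or a 'yyyy-MM-dd' string), and groups are numbered from 0 for the most recent window. `MONTHS_BETWEEN` returns a fractional number of months, so if the windows should follow calendar months instead, the difference could be computed from `YEAR()` and `MONTH()`.

```sql
-- Sketch only: my_table is a hypothetical name; adjust column types as needed.
WITH with_last AS (
    -- Step 1: attach the overall latest date to every row.
    SELECT Refresh_date,
           Refresh_value,
           MAX(Refresh_date) OVER () AS last_date
    FROM my_table
),
with_group AS (
    -- Step 2: months back from the latest date, integer-divided by 3.
    -- Group 0 is the most recent 3-month window, group 1 the one before it, and so on.
    SELECT Refresh_date,
           Refresh_value,
           CAST(FLOOR(MONTHS_BETWEEN(last_date, Refresh_date) / 3) AS INT) AS grp
    FROM with_last
)
-- Step 3: earliest Refresh_value (by Refresh_date) within each group.
SELECT Refresh_date,
       Refresh_value,
       grp AS `Group`,
       FIRST_VALUE(Refresh_value)
           OVER (PARTITION BY grp ORDER BY Refresh_date) AS Refresh_Value_Min
FROM with_group
ORDER BY Refresh_date;
```

`Group` is quoted with backticks because it is a reserved word in Hive. If the expected output numbers the groups from 1, add 1 to `grp`.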
User contributions licensed under: CC BY-SA