I have a table like below.
+-------+-------+--------------+---------------+ | Col_1 | Col_2 | Refresh_Date | Refresh_Value | +-------+-------+--------------+---------------+ | AE | A1 | 2019-12-01 | 1 | | AE | A1 | 2020-01-01 | 3 | | AE | A1 | 2020-02-01 | 5 | | AE | A1 | 2020-03-01 | 7 | | AE | A1 | 2020-04-01 | 12 | | AE | A1 | 2020-05-01 | 14 | | AE | A1 | 2020-06-01 | 11 | | AE | A1 | 2020-07-01 | 15 | +-------+-------+--------------+---------------+
I need to get the first Refresh_value
(based on Refresh_date
) from last 3 months starting from the last date and there should be 2 additional columns (Group
and Refresh_Value_Min
) where 1st column will have the first value from every last 3 months and another column will have values which says in which group these dates fall into.
Expected output
+-------+-------+--------------+---------------+-------+-------------------+ | Col_1 | Col_2 | Refresh_Date | Refresh_Value | Group | Refresh_Value_Min | +-------+-------+--------------+---------------+-------+-------------------+ | AE | A1 | 2019-12-01 | 1 | Grp3 | 1 | | AE | A1 | 2020-01-01 | 3 | Grp3 | 1 | | AE | A1 | 2020-02-01 | 5 | Grp2 | 5 | | AE | A1 | 2020-03-01 | 7 | Grp2 | 5 | | AE | A1 | 2020-04-01 | 12 | Grp2 | 5 | | AE | A1 | 2020-05-01 | 14 | Grp1 | 14 | | AE | A1 | 2020-06-01 | 11 | Grp1 | 14 | | AE | A1 | 2020-07-01 | 15 | Grp1 | 14 | +-------+-------+--------------+---------------+-------+-------------------+
I tried the below code that will give the value of the 3rd last month in the current row, but I need the output as like above.
first_value(Refresh_Value) over (partition by col_1,col_2 order by Refresh_Date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
Can someone help in this.
Please let me know if there are any questions.
Advertisement
Answer
Let me explain the approach (tiny details might differ):
- Get last date in each row
with data_with_last_dt as ( select Col_1, Col_2, Refresh_Date, Refresh_Value, max(Refresh_Date) over (partition by Col_1, Col_2) as Last_Date from target_table ),
- Get the difference in months and divide it by 3 (integer division) — you’ll get the group number
data_with_group as ( select Col_1, Col_2, Refresh_Date, Refresh_Value, cast(months_between(Last_Date, Refresh_Date) as int) / 3 as Group_Id from data_with_last_dt )
- Find the first
Refresh_Value
within each group:
select Col_1, Col_2, Refresh_Date, Refresh_Value, Group_Id, min(Refresh_Value) over(partition by Col_1, Col_2, Group_Id order by Refresh_Date) as Refresh_Value_Min from data_with_group