How to update table with summarized view from a config table?

Question

I have a table A with data associated for an account for each month and there could be multiple entries in each month with following data as shown in snippet. Expected transformed table I want to generate table something like below as shown in below snippet. Idea is that for each distinct account_id in table …

Accepted Answer

Given that nobody has answered this yet, I have given it a shot. On the other hand, I haven’t used Snowflake nor can I find a place I can try my code with it (similar to db<>fiddle sites) so you may need to modify this to fit conventions for Snowflake.I have, however, tried to make this as simple and generic as possible. If there are any minor issues please convert it to what you need.First things first – I created the base data set, as well as a numbers table (called Num) which has the values 0 to 11. You can use any other way to generate the numbers; I just used this as a fairly generic approach to create the list of months.CREATE TABLE TableA ("Time" datetime, Account_Id varchar(3), Active varchar(1))INSERT INTO TableA ("Time", Account_Id, Active) VALUES('2020-02-25 00:00:30', '111', 'T'),('2020-02-15 00:10:30', '112', 'T'),('2020-03-17 10:00:30', '111', 'F'),('2020-04-09 01:00:24', '111', 'T'),('2020-05-18 01:00:22', '112', 'F'),('2020-01-01 11:11:11', '111', 'T'),('2020-02-28 10:00:00', '111', 'F'),('2020-05-28 01:00:22', '112', 'F');CREATE TABLE Nums (n int);INSERT INTO Nums (n) VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11);The next step is involved three steps: 2 CTEs, and the SELECT statement that draws those together.The first CTE All_Account_Months creates all the rows you need – one row per Account_Id per month.The second CTE uses a window function to work out the last Active value for a given Account_ID and month.The SELECT statement then does a LEFT JOIN between these – starting with the first CTE, then adding the extra column Active to it.For this, I wrote it in SQL Server (t-sql) – I have a db<>fiddle here with the run. However, below, I have tried to convert it to Snowflake based on the documentation (e.g., changing column identifiers to double quotes, changing the FORMAT to get the month name to TO_VARCHAR). I’m sorry though – I don’t have a place to test it.DECLARE @StartDate datetime = '2020-01-01 00:00:00';WITH All_Account_Months AS (SELECT A.Account_ID, TO_VARCHAR(DATEADD(month, N.n, @StartDate), 'MMMM yyyy') AS "Month" FROM Nums N CROSS JOIN (SELECT DISTINCT Account_ID FROM TableA ) A ),Latest_Active_Values AS (SELECT * FROM (SELECT *, TO_VARCHAR(Asub."Time", 'MMMM yyyy') AS "Month", ROW_NUMBER() OVER (PARTITION BY Asub.Account_ID, YEAR(Asub."Time"), MONTH(Asub."Time") ORDER BY Asub."Time" DESC) AS rn FROM TableA Asub ) AS A WHERE A.rn = 1 )SELECT AAM.Account_ID, AAM.Month, LAV.ActiveFROM All_Account_Months AAM LEFT OUTER JOIN Latest_Active_values LAV ON AAM.Account_Id = LAV.Account_Id AND AAM."Month" = LAV."Month";Results are as follows – the same as yours except I think you typo’ed March for 111 – it should be F not T?Account_ID Month Active111 January 2020 T111 February 2020 F111 March 2020 F111 April 2020 T111 May 2020 NULL111 June 2020 NULL111 July 2020 NULL111 August 2020 NULL111 September 2020 NULL111 October 2020 NULL111 November 2020 NULL111 December 2020 NULL112 January 2020 NULL112 February 2020 T112 March 2020 NULL112 April 2020 NULL112 May 2020 F112 June 2020 NULL112 July 2020 NULL112 August 2020 NULL112 September 2020 NULL112 October 2020 NULL112 November 2020 NULL112 December 2020 NULL

Advertisement

Answer