I am using REDSHIFT for the below question, Here is problem and I am looking for solution.
I have 2 tables, one table contains the column combinations on which the second table should group the results by.
Table 1 containing column combinations (This is output from a stored procedure):
COMBINATIONS fruit_combinations banana,'ALL' banana, orange
Table 2 containing fruit_baskets:
FRUIT_BASKET BANANA ORANGE USER_COUNT b1 o1 5 b1 o2 10
Result set:
FRUIT_BASKET_AGG BANANA ORANGE USER_COUNT b1 'ALL' 15 b1 o1 5 b1 o2 10
Tried solution in Redshift:
PREPARE executesql(VARCHAR(10),VARCHAR(10)) AS select $1,$2, sum(fb.user_Count) as user_count from dv_product.fruit_basket fb group by 1,2; EXECUTE executesql("BANANA","ORANGE") ; DEALLOCATE executesql; Error: [Amazon](500310) Invalid operation: column "banana" does not exist;
Questions:
1 – Is there a way I can pass column names as parameter to a sql statement and execute the sql?
2 – What is other alternative to achieve the result set mentioned above in Redshift?
Advertisement
Answer
This can now be done using Stored Procedures without the need for PREPARE
. “Overview of Stored Procedures in Amazon Redshift”
It seems like you are trying to emulate GROUPING SETS
or ROLLUP
functionality. I have added a UNION ALL
to the dynamic SQL to provide this type of output.
For this example stored procedure, both column names are provided as input and a REFCURSOR
is declared as output.
CREATE PROCEDURE get_fruit_sum(IN column_1 VARCHAR, IN column_2 VARCHAR, result_set INOUT REFCURSOR) AS $$ BEGIN OPEN result_set FOR EXECUTE 'SELECT '|| quote_ident(column_1) ||' , '|| quote_ident(column_2) || ' , SUM(fb.user_Count) as user_count ' || 'FROM dv_product.fruit_basket fb GROUP BY 1,2' || 'UNION ALL ' || 'SELECT '|| quote_ident(column_1) ||' , ''ALL''' || ' , SUM(fb.user_Count) as user_count ' || 'FROM dv_product.fruit_basket fb GROUP BY 1;' RETURN; END; $$ LANGUAGE plpgsql;
You specify the columns and the output REFCURSOR when calling the procedure. The column names could be retrieved from a table by another stored procedure if needed. Then fetch the output from the REFCURSOR.
BEGIN; CALL get_fruit_sum ( 'Banana','Orange','result_set' ); FETCH ALL FROM result_set; END;