I am trying to create a recursive query that relies on a LEFT JOIN condition, but I am not sure if it is possible, especially in Snowflake.
I have three tables: ITEM, ITEMHIERARCHY, and ITEMVALUE
CREATE TABLE ITEM
(
NAME STRING
);
INSERT INTO ITEM(NAME)
VALUES
('Item1'),('Item2'),('Item3'),('Item4'),('Item5'),('Item6');
CREATE TABLE ITEMHIERARCHY
(
ITEM STRING,
SUBITEM STRING
);
INSERT INTO ITEMHIERARCHY(ITEM,SUBITEM)
VALUES
('Item2','Item3'),('Item2','Item4'),('Item4','Item5'),('Item6','Item4');
CREATE TABLE ITEMVALUE
(
ITEM STRING,
VALUE NUMERIC(25,10)
);
INSERT INTO ITEMVALUE(ITEM,VALUE)
VALUES
('Item1',34.2),('Item3',40.5),('Item5',20.3),('Item6',77.7);
My goal is to return a list of all ITEMs with values and sub-item values rolled-up:
Item1, 34.2 Item2, 60.8 //roll-up of Item3 + Item4 Item3, 40.5 Item4, 20.3 //roll-up of Item5 Item5, 20.3 Item6, 77.7 //since Item6 value is given, dont roll-up from Item4
Note that even though Item6 is a roll-up from Item4 because there is already a given value of 77.7 on the ITEMVALUE table, the roll-up is ignored.
Here is my attempt at a failing recursive query due to the LEFT JOIN in the UNION ALL clause:
WITH RECURSIVE ITEMHIERARCHYFULL
-- Column names for the "view"/CTE
(ITEM,SUBITEM,VALUE)
AS
-- Common Table Expression
(
-- Anchor Clause
SELECT it.NAME ITEM, ih.SUBITEM, iv.VALUE
FROM ITEM it
--These left-joins work
LEFT JOIN ITEMVALUE iv ON iv.ITEM = it.NAME
LEFT JOIN ITEMHIERARCHY ih ON ih.ITEM = it.ITEM
AND iv.VALUE IS NULL
UNION ALL
-- Recursive Clause
SELECT ihf.ITEM, ih.SUBITEM,
IFF(ihf.VALUE IS NOT NULL,ihf.VALUE,iv.VALUE)
FROM ITEMHIERARCHYFULL ihf
LEFT JOIN ITEMVALUE iv ON iv.ITEM = ihf.SUBITEM
LEFT JOIN ITEMHIERARCHY ih ON ih.ITEM = ihf.SUBITEM
AND iv.VALUE IS NULL
)
-- This is the "main select".
SELECT ITEM, SUM(VALUE) AS VALUE
FROM ITEMHIERARCHYFULL
GROUP BY ITEM
ORDER BY ITEM
;
The goal of the query is to first get all top level ITEMs from the ITEM table, search for a corresponding value on the ITEMVALUE table, and, if none is found, join to the ITEMHIERARCHY Table to retrieve all SUBITEMs that compose the top level ITEMs. I would then like to recursively search on the ITEMVALUE table for a SUBITEM-VALUE match, or, if none is found, retrieve the SUBITEMs from the ITEMHIERARCHY table.
The first set of LEFT-JOINs work, but not the ones under the UNION ALL giving me the error:
SQL compilation error: OUTER JOINs with a self reference are not allowed in a recursive CTE.
Is there a better way to do what I am trying to do in Snowflake or am I not thinking about this correctly?
Currently I manually wrote out the recursive layers to 5 levels meaning I have to add a level if the ITEMHIERARCHY table becomes more complex.
Advertisement
Answer
Here’s a working example that gives you the results you expected. You can also view it on SQLFiddle.
WITH CTE AS
(
SELECT
i.NAME
, IH.SUBITEM AS descendant
, CASE WHEN IV.VALUE IS NULL THEN 1 ELSE 0 END AS LEVEL
FROM ITEM AS i
LEFT JOIN ITEMHIERARCHY AS IH
ON i.NAME = IH.ITEM
LEFT JOIN ITEMVALUE AS IV
ON I.NAME = IV.ITEM
UNION ALL
SELECT
CTE.NAME
, sIH.SUBITEM
, 1 AS LEVEL
FROM CTE
INNER JOIN ITEM AS si
ON CTE.descendant = si.NAME
INNER JOIN ITEMHIERARCHY AS sIH
ON si.NAME = sIH.ITEM
), CTE2 AS
(
SELECT
CTE.NAME
, LEVEL
, SUM(IV.VALUE) AS VALUE
, ROW_NUMBER()OVER(PARTITION BY CTE.NAME ORDER BY CTE.LEVEL ASC) AS RNK
FROM CTE
LEFT JOIN ITEMVALUE AS IV
ON (CTE.LEVEL=0 AND CTE.NAME = IV.ITEM)
OR (CTE.LEVEL <> 0 AND CTE.descendant = IV.ITEM)
GROUP BY CTE.NAME, CTE.LEVEL
)
SELECT
NAME
, VALUE
FROM CTE2
WHERE RNK = 1
ORDER BY
NAME
;
RESULTS:
NAME VALUE Item1 34.2000000000 Item2 60.8000000000 Item3 40.5000000000 Item4 20.3000000000 Item5 20.3000000000 Item6 77.7000000000