Skip to content
Advertisement

How can I write a SQL query to calculate the quantity of components sold with their parent assemblies? (Postgres 11/recursive CTE?)

My goal

To calculate the sum of components sold as part of their parent assemblies.

I’m sure this must be a common use case, but I haven’t yet found documentation that leads to the result I’m looking for.

Background

I’m running Postgres 11 on CentOS 7. I have some tables like as follows:

CREATE TABLE the_schema.names_categories (
    id INTEGER NOT NULL PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
    created_at TIMESTAMPTZ DEFAULT now(),
    thing_name TEXT NOT NULL, 
    thing_category TEXT NOT NULL
);

CREATE TABLE the_schema.relator (
    id INTEGER NOT NULL PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
    created_at TIMESTAMPTZ DEFAULT now(),
    parent_name TEXT NOT NULL, 
    child_name TEXT NOT NULL,
    child_quantity INTEGER NOT NULL 
);

CREATE TABLE the_schema.sales (
    id INTEGER NOT NULL PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
    created_at TIMESTAMPTZ DEFAULT now(),    
    sold_name TEXT NOT NULL,
    sold_quantity INTEGER NOT NULL
);

And a view like so, which is mainly to associate the category key with relator.child_name for filtering:

CREATE VIEW the_schema.relationships_with_child_catetgory AS (
    SELECT 
    r.parent_name,
    r.child_name, 
    r.child_quantity,
    n.thing_category AS child_category
    FROM 
    the_schema.relator r 
    INNER JOIN 
    the_schema.names_categories n 
    ON r.child_name = n.thing_name 
);

And these tables contain some data like this:

INSERT INTO the_schema.names_categories (thing_name, thing_category)
VALUES ('parent1', 'bundle'), ('child1', 'assembly'), ('subChild1', 'component'), ('subChild2', 'component');

INSERT INTO the_schema.relator (parent_name, child_name, child_quantity)
VALUES ('parent1', 'child1', 1),('child1', 'subChild1', 10), ('child1', 'subChild2', 2);

INSERT INTO the_schema.sales (sold_name, sold_quantity)
VALUES ('parent1', 1), ('parent1', 2);

I need to construct a query that, given these data, will return something like the following:

 child_name | sum_sold 
------------+----------
 subChild1  |       30
 subChild2  |        6
(2 rows)

The problem is that I haven’t the first idea how to go about this and in fact it’s getting scarier as I type. I’m having a really hard time visualizing the connections that need to be made, so it’s difficult to get started in a logical way. Usually, Molinaro’s SQL Cookbook has something to get started on, and it does have a section on hierarchical queries, but near as I can tell, none of them serve this particular purpose.

Based on my research on this site, it seems like I probably need to use a recursive CTE /Common Table Expression, as demonstrated in this question/answer, but I’m having considerable difficulty understanding this method and how to use this it for my case.

Aping the example from E. Brandstetter’s answer linked above, I arrive at:

WITH RECURSIVE cte AS (
    SELECT 
    s.sold_name,
    r.child_name, 
    s.sold_quantity AS total 
    FROM 
    the_schema.sales s
    INNER JOIN 
    the_schema.relationships_with_child_catetgory r 
    ON s.sold_name = r.parent_name

    UNION  ALL

    SELECT 
    c.sold_name, 
    r.child_name, 
    (c.total  *  r.child_quantity)
    FROM 
    cte c 
    INNER JOIN 
    the_schema.relationships_with_child_catetgory r
    ON r.parent_name = c.child_name
) SELECT * FROM cte 

which gets part of the way there:

 sold_name | child_name | total 
-----------+------------+-------
 parent1   | child1     |     1
 parent1   | child1     |     2
 parent1   | subChild1  |    10
 parent1   | subChild1  |    20
 parent1   | subChild2  |     2
 parent1   | subChild2  |     4
(6 rows)

However, these results include undesired rows (the first two), and when I try to filter the CTE by adding where r.child_category = 'component' to both parts, the query returns no rows:

 sold_name | child_name | total 
-----------+------------+-------
(0 rows)

and when I try to group/aggregate, it gives the following error: ERROR: aggregate functions are not allowed in a recursive query's recursive term

I’m stuck on how to get the undesired rows filtered out and the aggregation happening; clearly I’m failing to comprehend how this recursive CTE works. All guidance is appreciated!

Advertisement

Answer

Basically you have the solution. If you stored the quantities and categories in your CTE as well, you can simply add a WHERE filter and a SUM aggregation afterwards:

SELECT
    child_name,
    SUM(sold_quantity * child_quantity)
FROM cte
WHERE category = 'component'
GROUP BY child_name

My entire query looks like this (which only differs in the details I mentioned above from yours):

demo:db<>fiddle

WITH RECURSIVE cte AS (
    SELECT 
        s.sold_name,
        s.sold_quantity,
        r.child_name,
        r.child_quantity,
        nc.thing_category as category
    FROM 
        sales s
    JOIN relator r
    ON s.sold_name = r.parent_name
    JOIN names_categories nc
    ON r.child_name = nc.thing_name
    
    UNION ALL
    
    SELECT
        cte.sold_name,
        cte.sold_quantity,
        r.child_name,
        r.child_quantity,
        nc.thing_category
    FROM cte
    JOIN relator r ON cte.child_name = r.parent_name
    JOIN names_categories nc
    ON r.child_name = nc.thing_name
)
SELECT
    child_name,
    SUM(sold_quantity * child_quantity)
FROM cte
WHERE category = 'component'
GROUP BY child_name

Note: I didn’t use your view, because I found it more handy to fetch the data from directly from the tables instead of joining data I already have. But that’s just the way I personally like it 🙂

User contributions licensed under: CC BY-SA
10 People found this is helpful
Advertisement