Consider the following ERD for Order, Item and Article entities:
I’d like to delete Orders satisfying a complex condition, and then delete Items associated to those Orders, and finally delete Articles associated with those Items.
Cascade deleting from Orders to Items is possible as Order is a parent of Item. But cascade deleting from Item to Article isn’t possible as Item is not a parent of Article. Setting up a trigger is also not an option, since in general I don’t want to delete the Article whenever an Item is removed – it should only happen in the context of this query.
I’m working with PostgreSQL, which supports DELETE ... RETURNING statements. Unfortunatly, the result cannot be nested like this:
DELETE FROM articles WHERE id IN
(DELETE FROM items WHERE order_id IN
(DELETE FROM orders WHERE complex_condition RETURNING id)
RETURNING article_id)
What is the most efficient way to execute all three delete statements? Each table contains tens of millions of records, and the complex_condition is the most time-consuming part, so I prefer not to execute it more than once.
One idea I have is to create a temporary table like this:
CREATE TEMP TABLE id_of_order_to_be_deleted AS WITH cte1 AS (SELECT id FROM orders WHERE complex_condition) SELECT * FROM cte1;
And then use it to delete Orders and Items. This way, complex_condition is only evaluated once. But I think this is an overkill, and there should be a simpler solution.
Advertisement
Answer
The result can be nested using a data modifying CTE:
with deleted_orders as ( DELETE FROM orders WHERE complex_condition RETURNING id ), deleted_items as ( DELETE FROM items WHERE order_id IN (select id from deleted_orders) returning article_id ) DELETE FROM articles WHERE id IN (select article_id from deleted_items);
You might want to try where exists condition as an alternative – sometimes those are faster than IN conditions.
