I’m trying to write a query for Postgres to insert a row into one table which as a side-effect may insert a row into a second table (due to the first table having a foreign key constraint on the second).
I suppose I could do this as two separate queries, possibly inside a transaction, but now that I’ve started down the path of a single query, I’d like to know how it might be done, if it’s possible at all.
I’ve been trying to use the general structure laid out in How to use RETURNING with ON CONFLICT in PostgreSQL? and Insert if names do not exist and return ids if exist, which seemed like a reasonable starting point, but I’m getting the error WITH clause containing a data-modifying statement must be at the top level
, and I’m not sure how to follow that advice.
Table a
is something like this:
CREATE TABLE a ( a_uuid uuid PRIMARY KEY, b_uuid uuid REFERENCES b(b_uuid) )
and table b
is:
CREATE TABLE b ( b_uuid uuid PRIMARY KEY, b_name text UNIQUE )
I want to insert a row into a
where b_uuid
is computed based on input that’ll match b_name
. If b
already has the row, then use the corresponding b_uuid
; otherwise, create a new row in b
with that text and a newly-generated UUID, returning the latter to use as part of the insert into a
. This was what I had written before I realized I was in over my head:
WITH new_b AS ( WITH input (b_uuid, b_name) AS ( VALUES ( gen_random_uuid(), $1 ) ), ins AS ( INSERT INTO b ( b_uuid, b_name ) TABLE input ON CONFLICT DO NOTHING RETURNING b_uuid ) TABLE ins UNION ALL SELECT b_uuid FROM input ) INSERT INTO a (a_uuid, b_uuid) VALUES ( gen_random_uuid(), (SELECT b_uuid FROM new_b) )
Am I anywhere close? What would be the best way of doing this?
Advertisement
Answer
demo:db<>fiddle (because of random things, you may reload several times if the random uuid equals 1; instead of type uuid
I used int
because the fiddle engine currently does not support the pgcrypto
extension. I simulated the function with an own one.)
WITH input (b_uuid, b_name) AS ( VALUES ( gen_random_uuid(), $1 ) ), ins_b AS ( INSERT INTO b ( b_uuid, b_name ) TABLE input ON CONFLICT DO NOTHING RETURNING b_uuid ), new_b AS ( TABLE ins_b UNION ALL SELECT b.b_uuid FROM input JOIN b USING (b_uuid) ) INSERT INTO a (a_uuid, b_uuid) VALUES ( gen_random_uuid(), (SELECT b_uuid FROM new_b) );
Your solution is not far away:
- Manipulating statements (like
INSERT
) cannot be inside nestedWITH
clauses (At this point: Thank you, I didn’t even know about this nested CTE feature :D) - The main point of @ErwinBrandstetter’s incredible fantastic solution (How to use RETURNING with ON CONFLICT in PostgreSQL?) is the
JOIN
that your solution is missing: Thenew_b
part works as follows: If there is a conflict,ins_b
returns nothing. So,TABLE ins_b
is empty. In that case the already existingb_uuid
needs to be called directly fromTABLE b
. Taking the generated UUID, joining it againstb
gives out the existingb_uuid
(and if you wish, every other column of this record). But if there is no conflict – theb_uuid
does not exist yet -, thenins_b
returns the new data set,TABLE ins_b
is not empty, but the join on the original table fails because there is still no record persisted which can be used to join. This, of course, works for more than one record to be inserted.