Skip to content
Advertisement

PostgreSQL add new not null column and fill with ids from insert statement

I´ve got 2 tables.

CREATE TABLE content (
  id bigserial NOT NULL,
  name text
);


CREATE TABLE data (
    id bigserial NOT NULL,
    ...
);

The tables are already filled with a lot of data. Now I want to add a new column content_id (NOT NULL) to the data table. It should be a foreign key to the content table.

Is it possible to automatically create an entry in the content table to set a content_id in the data table.

For example

**content**
| id | name |
|  1 | abc  | 
|  2 | cde  |

data

| id |... |
| 1  |... |
| 2  |... |
| 3  |... |

Now I need an update statement that creates 3 (in this example) content entries and add the ids to the data table to get this result:

content

| id | name |
|  1 | abc  |
|  2 | cde  |
|  3 | ...  |
|  4 | ...  |
|  5 | ...  |

data

| id |... | content_id |
| 1  |... | 3          |
| 2  |... | 4          |
| 3  |... | 5          |

Advertisement

Answer

demo:db<>fiddle

According to the answers presented here: How can I add a column that doesn’t allow nulls in a Postgresql database?, there are several ways of adding a new NOT NULL column and fill this directly.

Basicly there are 3 steps. Choose the best fitting (with or without transaction, setting a default value first and remove after, leave the NOT NULL contraint first and add afterwards, …)


Step 1: Adding new column (without NOT NULL constraint, because the values of the new column values are not available at this point)

ALTER TABLE data ADD COLUMN content_id integer;

Step 2: Inserting the data into both tables in a row:

WITH inserted AS (                -- 1
    INSERT INTO content
    SELECT
        generate_series(
            (SELECT MAX(id) + 1 FROM content),
            (SELECT MAX(id) FROM content) + (SELECT COUNT(*) FROM data)
        ),
        'dummy text'
    RETURNING id
), matched AS (                   -- 2
   SELECT
       d.id AS data_id,
       i.id AS content_id
   FROM (
       SELECT
           id,
           row_number() OVER ()
       FROM data
   ) d 
   JOIN (
      SELECT
          id,
          row_number() OVER ()
      FROM inserted
   ) i ON i.row_number = d.row_number
)                                 -- 3
UPDATE data d
SET content_id = s.content_id
FROM (
    SELECT * FROM matched
) s
WHERE d.id = s.data_id;

Executing several statements one after another by using the results of the previous one can be achieved using WITH clauses (CTEs):

  1. Insert data into content table: This generates an integer series starting at the MAX() + 1 value of the current content‘s id values and has as many records as the data table. Afterwards the new ids are returned
  2. Now we need to match the current records of the data table with the new ids. So for both sides, we use row_number() window function to generate a consecutive row count for each records. Because both, the insert result and the actual data table have the same number of records, this can be used as join criterion. So we can match the id column of the data table with the new content‘s id values
  3. This matched data can used in the final update of the new content_id column

Step 3: Add the NOT NULL constraint

ALTER TABLE data ALTER COLUMN content_id SET NOT NULL;
User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement