
SQL (Bigquery) – Populate multiple columns in existing rows

EDIT: As @Prateek suggested, I changed my workflow. The first query now serves as a “structure query” that I run whenever I want to clear and recreate the table with the desired column structure. The second query became my “population query”, which fills the empty structured table. In practice, I now have a set of sub-tables that the population query joins and inserts in one shot.
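A minimal sketch of that revised population query, assuming hypothetical sub-tables some_table_1 and some_table_2 keyed on event_id (the real column list and join keys depend on the actual sub-tables):

```sql
-- Population query sketch: fill the empty structured table in one shot.
-- some_table_1 / some_table_2 and their columns are illustrative names.
INSERT INTO primary_table (EVENT_ID, TREATMENT_TIME, HEIGHT, WEIGHT, SEX)
SELECT
    t1.event_id,
    CAST(t1.treatment_time AS timestamp),
    t2.height,
    t2.weight,
    t2.sex
FROM some_table_1 AS t1
JOIN some_table_2 AS t2
    ON t1.event_id = t2.event_id;
```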

Given this structured table with no rows:

CREATE OR REPLACE TABLE primary_table (
        EVENT_ID int64,
        TREATMENT_TIME timestamp,
        AGE int64,
        HEIGHT int64,
        WEIGHT int64,
        SEX string,
        ICU_ADMIT timestamp,
        ICU_DISCH timestamp,
        ICUSTAYS_COUNT int64
);

And after the initial population with the total wanted rows at the end of the day:

INSERT INTO primary_table (
        EVENT_ID,
        TREATMENT_TIME
    )
    SELECT
        event_id,
        CAST(treatment_time AS timestamp)
    FROM some_table_1;

I have this half filled primary_table:

EVENT_ID | TREATMENT_TIME          | AGE | HEIGHT | WEIGHT | SEX | ICU_ADMIT | ICU_DISCH | ICUSTAYS_COUNT
32342    | 2156-03-30 06:41:00 UTC |     |        |        |     |           |           |
32342    | 2156-03-30 06:41:00 UTC |     |        |        |     |           |           |
45084    | 2165-03-29 10:23:00 UTC |     |        |        |     |           |           |
45084    | 2165-03-29 10:23:00 UTC |     |        |        |     |           |           |

I now want to populate additional columns in the existing rows. The rows should be updated by matching on primary_table.EVENT_ID = some_table_2.EVENT_ID.

For example, given this table as some_table_2:

EVENT_ID | HEIGHT | WEIGHT | SEX
32342    | 45     | 66.7   | F
32342    | 70     | 102    | M
45084    | 40     | 80.3   | M
45084    | 89     | 70.2   | M

Should be added to the primary_table like this:

EVENT_ID | TREATMENT_TIME          | AGE | HEIGHT | WEIGHT | SEX | ICU_ADMIT | ICU_DISCH | ICUSTAYS_COUNT
32342    | 2156-03-30 06:41:00 UTC |     | 45     | 66.7   | F   |           |           |
32342    | 2156-03-30 06:41:00 UTC |     | 70     | 102    | M   |           |           |
45084    | 2165-03-29 10:23:00 UTC |     | 40     | 80.3   | M   |           |           |
45084    | 2165-03-29 10:23:00 UTC |     | 89     | 70.2   | M   |           |           |
  • I want to keep the structure of the primary table as I created it, and I don’t want to add new rows.
  • I need to update multiple columns each time.
  • I can’t update all the columns in one query (BigQuery quota issues).
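For reference, a multi-column update joined on EVENT_ID would normally be a single UPDATE … FROM statement. This sketch assumes EVENT_ID is unique in some_table_2, which is not the case in the sample data above (the duplicate EVENT_IDs are exactly what makes the problem hard):

```sql
-- Sketch only: update several columns at once by joining on EVENT_ID.
-- BigQuery raises an error if a primary_table row matches more than
-- one some_table_2 row, so this assumes EVENT_ID is unique there.
UPDATE primary_table AS p
SET p.HEIGHT = s.HEIGHT,
    p.WEIGHT = s.WEIGHT,
    p.SEX    = s.SEX
FROM some_table_2 AS s
WHERE p.EVENT_ID = s.EVENT_ID;
```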


Answer

Multiple rows can be inserted in one statement, but updating existing rows is harder to do in a single shot: to set different values on each row, the UPDATE needs a WHERE clause that uniquely identifies that row.

A more practical solution is to write a small piece of code that reads the data from the source (file, network, etc.) and performs a bulk update on primary_table.

Alternatively, if you are copying from another table, write a stored procedure that reads the source table and, based on EVENT_ID, constructs the queries that update primary_table.
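Using BigQuery scripting, that per-EVENT_ID approach might be sketched roughly as follows; the loop body and the assumption that each EVENT_ID targets a single row are illustrative, not a tested implementation:

```sql
-- Illustrative sketch using BigQuery scripting: loop over the source
-- rows and issue one UPDATE per row, keyed on EVENT_ID.
-- Assumes EVENT_ID identifies a single row in primary_table.
FOR rec IN (SELECT EVENT_ID, HEIGHT, WEIGHT, SEX FROM some_table_2)
DO
  UPDATE primary_table
  SET HEIGHT = rec.HEIGHT,
      WEIGHT = rec.WEIGHT,
      SEX    = rec.SEX
  WHERE EVENT_ID = rec.EVENT_ID;
END FOR;
```

Note that issuing one DML statement per row can itself run into BigQuery DML quotas, which is why the answer leans toward a bulk update driven from client-side code.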

User contributions licensed under: CC BY-SA