Skip to content
Advertisement

how can i loop insert query?

I have a problem like this: I need to optimize the application, with db (postgreSQL), the table looks like this:

CREATE TABLE voter_count(
                    id SERIAL, 
                    name VARCHAR NOT NULL, 
                    birthDate DATE NOT NULL, 
                    count INT NOT NULL, 
                    PRIMARY KEY(id), 
                    UNIQUE (name, birthDate))

I have more than a thousand such voters, and I need to put all of them in the database, but among them there are several duplicates that could vote several times (from 2 to infinity), and I need to, when meeting such a duplicate, increase the count field for an existing one (for a voter with the same name and birthdate). Previously, I just checked whether there is such a voter in the table or not, and if there is, then find it and increase the count.

But the program worked for too long, and I tried to do it through MULTI INSERT and use ON CONFLICT DO UPDATE to increase count, but I get an error, then I asked a question on stackoverflow, and I was offered to do a lot of INSERTs, through a loop, but in PostgreSQL.

INSERT INTO voter_count(name, birthdate, count) 
VALUES 
('Ivan', '1998-08-05', 1), 
('Sergey', '1998-08-29', 1), 
('Ivan', '1998-08-05', 1) 
ON CONFLICT (name, birthdate) DO UPDATE SET count = (voter_count.count + 1)

Question: how to do INSERT in a loop through PostgreSQL.

Advertisement

Answer

Probably the best option is to insert before all the data in a table without primary key, for instance:

CREATE TABLE voter_count_with_duplicates(
                name VARCHAR NOT NULL, 
                birthDate DATE NOT NULL)

and then insert the data with a single statement:

INSERT INTO voter_count (name, birthDate, count)
SELECT name, birthDate, COUNT(*)
FROM voter_count_with_duplicates
GROUP BY name, birthDate

Note that if you have the data in a structured text file (for instance a CSV file), you can insert all the data into voter_count_with_duplicates with a single COPY statement.

If you have to insert (a lot of) new data with the table already populated, there are several possibilities. One is to use the solution in the comment. Another one is to perform an an update and an insert:

 WITH present_tuples AS 
  (SELECT name, birthDate, COUNT(*) AS num_of_new_votes
   FROM voter_count_with_duplicates d JOIN voter_count c ON
         v.name = d.name and v.birthDate = d.birthDate
   GROUP BY name, birthDate)
 UPDATE voter_count SET count = count + num_of_new_votes
           FROM present_tuples
           WHERE present_tuples.name = voter_count.name
             AND present_tuples.birthDate = voter_count.birthDate;

WITH new_tuples AS 
  (SELECT name, birthDate, COUNT(*) AS votes
   FROM voter_count_with_duplicates d 
   WHERE NOT EXISTS SELECT * 
                    FROM voter_count c
                    WHERE v.name = d.name and v.birthDate = d.birthDate
   GROUP BY name, birthDate)
INSERT INTO voter_count (name, birthDate, count)
SELECT name, birthDate, votes
FROM new_tuples;
User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement