
How to ignore duplicates without unique constraint in Postgres 9.4?

I am currently facing an issue in our old database (Postgres 9.4): a table contains some duplicate rows. I want to ensure that no more duplicate rows are generated.

But I also want to keep the duplicate rows that have already been generated, which is why I cannot apply a unique constraint on those columns (multiple columns).

I have created a trigger that checks whether the row already exists and raises an exception accordingly. But it fails when concurrent transactions are being processed.

Example:

TAB1

col1   |  col2  |  col3  |
------------------------------------
1      |  A     |  B     |   -- 
2      |  A     |  B     |   -- already present duplicates for column col2 and col3(allowed)
3      |  C     |  D     |

INSERT INTO TAB1 VALUES(4 , 'A' , 'B') ; -- This insert statement will not be allowed.

Note: I cannot use ON CONFLICT due to the older version of the database.
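For context, the failing trigger approach might look like the following sketch (function and trigger names are assumptions, not from the question). The race is that two concurrent transactions can both pass the EXISTS check before either commits, so both inserts succeed and a duplicate slips through:

```sql
-- Hypothetical sketch of the trigger-based check (names assumed).
create or replace function check_dup() returns trigger as $$
begin
    -- Neither transaction sees the other's uncommitted row here,
    -- so concurrent inserts of the same (col2, col3) both pass.
    if exists (select 1 from tab1
               where col2 = new.col2 and col3 = new.col3) then
        raise exception 'duplicate row for (%, %)', new.col2, new.col3;
    end if;
    return new;
end;
$$ language plpgsql;

create trigger trg_check_dup
    before insert on tab1
    for each row execute procedure check_dup();
```

Only a unique index or constraint is enforced atomically by the database across concurrent transactions, which is what the answer below relies on.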


Answer

Presumably, you don’t want new rows to duplicate historical rows. If so, you can do this, but it requires modifying the table and adding a new column.

alter table t add duplicate_seq int default 1;

Then update this column to identify existing duplicates:

update t
    set duplicate_seq = tt.seqnum
    from (select t.*, row_number() over (partition by col order by <primary key>) as seqnum
          from t
         ) tt
    where t.<primary key> = tt.<primary key>;

Now, create a unique index or constraint:

alter table t add constraint unq_t_col_seq unique (col, duplicate_seq);

When you insert rows, do not provide a value for duplicate_seq; it defaults to 1, which conflicts with the existing row that has duplicate_seq = 1 for the same key values. New duplicates are therefore rejected, while historical duplicates remain untouched.
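Applied to the TAB1 example from the question (assuming col1 is the primary key and the duplicate key is (col2, col3)), the steps look like this:

```sql
-- Add the sequence column; existing and new rows default to 1.
alter table tab1 add duplicate_seq int default 1;

-- Number existing duplicates within each (col2, col3) group.
update tab1
    set duplicate_seq = tt.seqnum
    from (select col1,
                 row_number() over (partition by col2, col3 order by col1) as seqnum
          from tab1) tt
    where tab1.col1 = tt.col1;
-- Rows are now (1, A, B, 1), (2, A, B, 2), (3, C, D, 1).

-- Enforce uniqueness across the key plus the sequence number.
alter table tab1 add constraint unq_tab1_col_seq unique (col2, col3, duplicate_seq);

-- A new duplicate takes the default duplicate_seq = 1 and collides
-- with row (1, A, B, 1), so this insert fails:
insert into tab1 (col1, col2, col3) values (4, 'A', 'B');
```

The constraint check is atomic in the database, so unlike the trigger approach it also holds up under concurrent transactions.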

User contributions licensed under: CC BY-SA