I am trying to write a table to a PostgreSQL database and then declare a primary key. The data is huge (billions of rows, approximately 150 GB in total), and when I try to create the primary key after writing the table, it takes forever. Here is how I define the primary key:
ALTER TABLE my_huge_table ADD CONSTRAINT huge_pk PRIMARY KEY (column_x, column_y);
I am 101% sure that these columns are unique and not null. The wait seems unnecessary, so I am looking for a way to avoid wasting this time. Surely others have faced a situation like this. Thanks in advance for your help.
Answer
You are not waiting for nothing. You are probably waiting for an index to be built. That index will be needed to enforce the primary key in the future. Even if the system trusted your declaration that the data is already unique, that wouldn’t gain you much: it would still need to build the index. If you have faith that the primary key is not violated now, and also have faith that no one will try to violate it in the future, then don’t add the primary key. Just add a comment saying you know this is a primary key but, for performance reasons, won’t formally declare it.
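To make both points concrete, here is a rough sketch in PostgreSQL SQL that reuses the table, column, and constraint names from the question. The index name huge_pk_idx and the comment wording are made up for illustration. Option A splits the primary key into its two parts, so you can see that nearly all of the time goes into building the index. Option B is the comment-only route, which documents the key without enforcing anything.

```sql
-- Option A: build the unique index yourself, then attach it as the primary key.
-- The CREATE UNIQUE INDEX step is the slow part; it is the same work that
-- ADD CONSTRAINT ... PRIMARY KEY does internally.
-- (huge_pk_idx is a hypothetical name chosen for this example.)
CREATE UNIQUE INDEX huge_pk_idx ON my_huge_table (column_x, column_y);

-- Promote the existing index to a primary key constraint.
-- If the columns are not already declared NOT NULL, PostgreSQL still has to
-- scan the table here to verify that; the index itself is not rebuilt.
ALTER TABLE my_huge_table
    ADD CONSTRAINT huge_pk PRIMARY KEY USING INDEX huge_pk_idx;

-- Option B: skip the constraint entirely and only document the intent.
-- Nothing is enforced; this is purely a note for whoever reads the schema.
COMMENT ON TABLE my_huge_table IS
    'Logical primary key: (column_x, column_y). Not declared as a constraint for performance reasons.';
```

Option A does not make the build itself any faster; it only makes visible where the time goes. With Option B, keep in mind that nothing will stop duplicate rows from being inserted later, and there will be no index on (column_x, column_y) for lookups.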