
PostgreSQL: VACUUM FULL duration estimation

I inherited a PostgreSQL database in production with one table that is around 250 GB in size. It only has around ten thousand live rows, which I estimate to total no more than 20 MB.
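For reference, the size and the live-row figure can be checked with something like the following (the counters in pg_stat_user_tables are only estimates and may be stale, since autovacuum was off):

    -- On-disk size of the table (including indexes and TOAST) plus
    -- the planner/statistics view of live and dead tuples.
    SELECT pg_size_pretty(pg_total_relation_size('foo')) AS total_size,
           n_live_tup,
           n_dead_tup
    FROM   pg_stat_user_tables
    WHERE  relname = 'foo';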

The table grew to this size because autovacuum was turned off at some point. (I know why that was done; it will be reactivated, and the original issue has been fixed, so this is not part of the question.)

Our problem is that many queries take a pretty long time. For example, a SELECT count(*) FROM foo; takes around 15 minutes.

Now, after considering other options, I'd like to run a VACUUM FULL on the table. I am trying to estimate how long this would take to complete so I can plan a maintenance window.

In my understanding, VACUUM FULL creates a new table, copies all live tuples to it and replaces the original table with this copy.
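For what it's worth, the statement I would run during the window is just the following (VERBOSE is optional and only makes the command report what it did):

    VACUUM (FULL, VERBOSE) foo;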

My estimate is that this process shouldn't take much longer than a simple query like the one above, since the live data is quite small in both size and row count.

Would you agree that my expectation of the VACUUM FULL run time is realistic? If not, why not?

Are there best practices for estimating VACUUM FULL durations?


Answer

The only dependable estimate can be had by restoring a file system backup on a similar machine and testing there. That's what I would recommend.

The duration will depend not only on the size, but also on the amount of bloat: the less live data there is, the faster it will be.
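If the pgstattuple extension is available, you can measure the bloat directly instead of guessing. A sketch (note that this scans the whole 250 GB table, so it will itself take a while):

    CREATE EXTENSION IF NOT EXISTS pgstattuple;

    -- Exact live vs. dead tuple counts and free space for the table.
    SELECT tuple_count, tuple_len, dead_tuple_count, free_space
    FROM   pgstattuple('foo');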

That said, I’d ask for a maintenance window of 2 hours, which should be ample on anything but very questionable hardware.
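During the window you can also watch how far the rewrite has progressed. On PostgreSQL 12 or later, VACUUM FULL reports its progress in pg_stat_progress_cluster, for example:

    -- Run from a second session while VACUUM FULL is running.
    SELECT phase,
           heap_blks_scanned,
           heap_blks_total,
           heap_tuples_written
    FROM   pg_stat_progress_cluster;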
