Amazon Redshift databases require periodic maintenance known as vacuuming. Amazon Redshift is based on PostgreSQL, but unlike PostgreSQL, Redshift doesn’t offer autovacuum. So when a row is deleted from a table in Amazon Redshift, the disk space used by that row is not immediately recovered. A special command is necessary to recover the disk space for use by other tables.
In Amazon Redshift, the “VACUUM FULL” operation will accomplish two things:
- Sort tables (for tables that have a SORTKEY)
- Reclaim space from rows that were flagged for deletion (as from a DELETE or UPDATE operation)
In most cases, it’s not desirable to do both things at the same time. The requirements for sorting a table are very different from reclaiming space. Sorting may use a lot of resources and time.
We recommend separating the VACUUM DELETE ONLY operation from the SORT operation. The recommendation is to run VACUUM DELETE ONLY
- after every ETL operation which UPDATEs or DELETEs from a table
- nightly on all tables with “stats off” greater than 10%