/nightvac

Preemptively vacuum your PostgreSQL database during off-hours

Primary LanguagePythonMIT LicenseMIT

nightvac

Vacuuming is an important part of maintaining your PostgreSQL database. Generally Postgres' autovacuum feature takes care of this for you. However, autovacuum is triggered after a table has a certain number of rows updated, making it more likely to run during your busiest times, when the frequency of writes is highest. Autovacuum increases CPU and disk usage—During your busiest times, when resource usage is already at peak. This can last for many minutes or even hours, depending on the size of the table being vacuumed.

To be clear, Postgres can and should be configured to limit the impact of autovacuum. It comes preconfigured with sane defaults that should work for most databases. But why not vacuum when usage is lowest and impact will be the least? Enter nightvac.

Nightvac will find the tables in the database that are nearing an autovacuum and preemptively run a manual vacuum. It can be run as a cron job during off-hours and has a configurable timeout so it won't run long.

Getting Started

Nightvac is written in Python. The recommended installation method is using pipx.

pipx install https://github.com/luhn/nightvac/archive/refs/tags/v0.2.0.tar.gz

Alternatively, you can install pip. It is recommended to install into a virtualenv.

python -m venv .nightvac
.nightvac/bin/pip install https://github.com/luhn/nightvac/archive/refs/tags/v0.2.0.tar.gz

You can also download nightvac.py and run that directly—The only dependency is psycopg.

To run nightvac, you need only to pass in a connection string.

nightvac postgresql://user:pass@host/mydb

What Gets Vacuumed

By default, Postgres' autovacuum will trigger on any table that has either:

  • Max freeze age (age(pg_class.relfrozenxid)) greater than 200 million.
  • Dead tuples (pg_stat_all_tables.n_dead_tups) greater than 20% of total tuples (pgclass.reltuples) plus 50.
  • Inserted tuples (pg_stat_all_tables.n_ins_since_vacuum) greater than 20% of total tuples (pgclass.reltuples) plus 1000. (Postgres 13+ only)

By default, nightvac will preemptively vacuum any table that has either:

  • Max freeze age (age(pg_class.relfrozenxid)) greater than 150 million.
  • Dead tuples (pg_stat_all_tables.n_dead_tups) greater than 5% of total tuples (pgclass.reltuples) plus 50.
  • Inserted tuples (pg_stat_all_tables.n_ins_since_vacuum) greater than 10% of total tuples (pgclass.reltuples) plus 1000. (Postgres 13+ only)

Tables exceeding max freeze age threshold will be vacuumed first, in descending order of max freeze age. Next, tables exceeding dead tuple threshold, in descending order of dead tuple/total tuple ratio.

nightvac will exit once all qualifying tables have been vacuum or the 20 minute timeout is hit. The timeout will not interrupt a running vacuum; nightvac will finish the vacuum before exiting.

Configuration

nightvac can be configured with the following CLI arguments:

Prior Art

nightvac is heavily inspired by flexible-freeze. flexible-freeze differs from nightvac in a few ways:

  • flexible-freeze always runs VACUUM FREEZE, which is unnecessarily aggressive.
  • flexible-freeze vacuums based on max freeze age by default, or dead tuples. nightvac vacuums based on both.