Install postgres extensions locally and in prod
rebeccacremona opened this issue · 5 comments
I recently introduced a new index to the LinkUser
table, one that relies on two postgres extensions, btree_gin
and pg_trgm
.
Django includes helpers for installing the extensions during a migration, but in our version of postgres, the extensions need to be installed by a superuser. That works fine in development and CI, but not in production: the production database user is not, and should not be, a superuser.
More recent versions of postgres have the concept of "trusted extensions", which can be installed by any users with CREATE
privileges... but upgrading to a newer postgres isn't on the table right now... especially given how unimportant this new index is.
We considered a number of workarounds. (See https://hlslil.slack.com/archives/C03CYM6TF3J/p1680640861403829 for group discussion, which was picked up later by @rebeccacremona and @bensteinberg in DMs and live conversation.) Among them:
-
In keeping with the pythonic value that it's easier to ask for forgiveness than permission, wrap the installation with try/except such that failures aren't fatal.... and instead, fall through to the fatal error that would be thrown when the index creation is attempted, if the extensions aren't installed. See #3303, #3304, and #3305. That approach proved surprisingly finicky: migrations are intricate, and to make this work as desired, we'd have to further study how they use transactions.
-
Instead of doing that, we thought of changing the migration to be conditional based on a tier-level setting, so that it didn't attempt the installation at all in staging and production. See https://github.com/harvard-lil/perma/pull/3307/files.
-
We considered install the extension via docker config by customizing the initdb step, as in https://stackoverflow.com/questions/55741735/docker-compose-and-postgres-extensions. But, we didn't like that this would require us to throw away our existing volumes in dev. (Which we totally can do... it's just annoying, and means we lose a certain amount of useful-to-us captures.)
-
Or, we could make our own Dockerfile for postgres, and install it there.
-
We might also be able to get something working by using docker entrypoint script.
-
We considered adding logic to the command that devs run during local installation already, and simply explaining to devs with existing installations what do to.
-
We discussed tweaking the search path of the production database user and
GRANT
ing it access to the schema/db/table that it needs to see that the extension is already installed in prod.... so
that Django's check to see if it needs to install the extension works as intended. -
We read about what others in this or a similar situation are doing, for instance, https://www.drupal.org/project/farm/issues/3270558... and learned that there isn't a lot of consensus about what approach is best.
As of right now, we are hopeful that the settings-based conditional migration works, and is... fine enough to let us move on to something else.
Thanks for writing this up.
Okay, the conditional installation strategy did not work. Index creation fails with psycopg2.errors.UndefinedObject: operator class "gin_trgm_ops" does not exist for access method "gin"
.
Evidently, the schema situation that is preventing Django's conditional installation from working correctly, even though the extension is installed... also prevents us from creating the index.
See PostgresApp/PostgresApp#335 (comment).
Next up: fun with schemas.
I believe the conditional installation strategy did work, after I found the correct invocation for creating the extensions, performed in the Perma database by a user with the right privileges:
perma=> create extension btree_gin with schema perma;
CREATE EXTENSION
perma=> create extension pg_trgm with schema perma;
CREATE EXTENSION
After checking stage today, I propose running this on prod and then deploying.
(I think ... with schema pg_catalog
would also work, as I believe pg_catalog
is implicitly in the search path.)
We checked: with the extensions installed via the latest invocations, Django's built-in check will work fine: it will not try (and fail) to install the extensions itself in prod.
perma=> SELECT 1 FROM pg_extension WHERE extname = 'pg_trgm';
?column?
----------
1
(1 row)
perma=> SELECT 1 FROM pg_extension WHERE extname = 'btree_gin';
?column?
----------
1
(1 row)
So, we can remove the custom code :-)