/pg_log_userqueries

Module to log queries done by specific user types

Primary LanguageCISC LicenseISC

pg_log_userqueries is a PostgreSQL module that logs each query executed that
follows a specific pattern (superuser, user name, database name, app name,
inet addr, query, query_id). It records each query in the standard log file.
Default is to log superusers queries executed on all databases. It is also
possible to filter the queries to be logged through a regular expression.

To install pg_log_userqueries, you should untar the pg_log_userqueries tarball
anywhere you want.

You'll then need to compile it with pgxs. So the pg_config tool must be in your
path. Depending on your PostgreSQL installation, you may need to install a dev
package. Once pg_config is in your path, do "make", and then "make install".

Here are the steps on a Rocky Linux 9 minimal install with PostgreSQL 15:

```
# install required OS packages
sudo dnf install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-9-x86_64/pgdg-redhat-repo-latest.noarch.rpm
sudo dnf install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm
sudo dnf install -y gcc git make clang ccache openssl-devel krb5-devel redhat-rpm-config
sudo dnf install -y postgresql15-server postgresql15-contrib postgresql15-devel

# compile and install pg_log_userqueries
git clone https://github.com/gleu/pg_log_userqueries.git
cd pg_log_userqueries/
export PATH=/usr/pgsql-15/bin:$PATH
make
sudo make install
```

Once it's installed, you'll need to configure PostgreSQL to make use of it:

* shared_preload_libraries='pg_log_userqueries'

There are also optional parameters. To configure them, with a PostgreSQL release
older than 9.2, you first need to add the class of pg_log_userqueries in
custom_variable_classes:

* custom_variable_classes = 'pg_log_userqueries'

In more recent release, custom_variable_classes is no longer available but you
can still configure pg_log_userqueries.

You'll be able to set the two following parameters:

* pg_log_userqueries.log_level: to choose the log level of pg_log_userqueries
  (WARNING by default)
* pg_log_userqueries.log_label: to give the prefix of the log line for
  pg_log_userqueries logs (defaults to the name of the extension).
* pg_log_userqueries.log_duration: to log statement duration time in ms.

Once this configuration is done, restart PostgreSQL.

You can specify exactly what you want to log.

If none of the following variables is set, all queries issued by superusers will be logged
in all databases, and only those.

If you set any of log_db, log_db_blacklist, log_user, log_user_blacklist,
log_addr, log_addr_blacklist, log_app, log_app_blacklist, log_query or
log_query_blacklist, superuser queries won't be systematically logged, and
you'll have to set log_superusers to on to reactivate it.

* pg_log_userqueries.log_superusers: to turn on/off logging of all superusers (off by default)
* pg_log_userqueries.log_db: to give a pipe (|) separated list of database to log.
* pg_log_userqueries.log_db_blacklist: to give a pipe (|) separated list of database to avoid logging.
* pg_log_userqueries.log_user: to give a pipe separated list of user to log.
* pg_log_userqueries.log_user_blacklist: to give a pipe separated list of user to avoid logging.
* pg_log_userqueries.log_addr: to give a pipe separated list of IP addresses to log.
* pg_log_userqueries.log_addr_blacklist: to give a pipe separated list of IP addresses to avoid logging.
* pg_log_userqueries.log_app: to give a pipe separated list of application name to log.
* pg_log_userqueries.log_app_blacklist: to give a pipe separated list of application name to avoid logging.
* pg_log_userqueries.log_query_id: to give a pipe separated list of query_id to log.
* pg_log_userqueries.log_query_id_blacklist: to give a pipe separated list of query_id to avoid logging.


For a query to be logged, it needs to :
* match the white list of a filter (eg: log_db) AND
* not match the back list of that filter (eg: log_db_blacklist).

For example :

* pg_log_userqueries.log_db="application_.*"
* pg_log_userqueries.log_db_blacklist="application_test"

This will log queries on all databases starting with "application_" except for
the database "application_test".

You can use advanced regular expression in that list. For example:

* pg_log_userqueries.log_user="postgres|admin_.*|.*_adm"

will match if the exact username is 'postgres', or if it begins with 'admin_' or
ends with '_adm'.

* pg_log_userqueries.log_query_id='1147616880456321454|11780828165390275418'

will match if the exact query_id is 1147616880456321454, or if it is 11780828165390275418.


You can also use pg_log_userqueries to log queries matching a particular regular
expression using the 'pg_log_userqueries.log_query' dedicated configuration directive:

* pg_log_userqueries.log_query: to give a regular expression to log all queries matching this regexp.

For example, to only log calls related to prepared transaction statements, you can set the following:

    pg_log_userqueries.log_query="^(PREPARE TRANSACTION|COMMIT PREPARED|ROLLBACK PREPARED)"

This filter will be applied after all other filters. For example if you have set log_user
but the user don't match nothing will be logged even if the query regexp match.

By default, if log_superusers is set, the filter is checked first and the query
is logged if the user has the superuser attribute.  Then log_user, log_db,
log_app and log_addr are checked in this order and the first matching value
will write to the log if log_query is not set or log_query match. If you want
that all of the filters, when defined, must match to write to log, activate the
match_all directive. In this case, it will log statements only when defined
filters for log_user, log_db, log_addr and log_query all match. These filters
will also apply to superuser queries if the corresponding directive is checked.

* pg_log_userqueries.match_all='on'

By default, pg_log_userqueries will write queries to PostgreSQL log destination.
A superuser can change this behavior with the pg_log_userqueries.log_destination
parameter. By default at stderr, put syslog if you prefer to use a syslog
daemon. You can also configure the syslog facility and the program
identification:

	pg_log_userqueries.syslog_facility='local0'
	pg_log_userqueries.syslog_ident='pg_log_userqueries'

As pg_log_userqueries is a shared_preload_library, you'll have to restart PostgreSQL to completely remove it. To temporarily deactivate it, you have a workaround though:

* pg_log_userqueries.file_switchoff: path to a file that must be root-owned. If this file is present, pg_log_userqueries won't log anymore
* pg_log_userqueries.time_switchoff: frequency at which backends will check file_switchoff's presence

If you need to know the current version number of this library, you can do this:

    strings $(pg_config --libdir)/pg_log_userqueries.so | grep "pg_log_userqueries version"