/pgpsl

Public Suffix List support for PostgreSQL

Primary LanguagePLpgSQLPostgreSQL LicensePostgreSQL

psl 1.0.0

This extension contains a single PostgreSQL function, registered_domain(), that uses the Public Suffix List to return the registered domain within which a hostname exists.

Installation

To build it, do this:

make
make install
make installcheck

If you encounter an error such as:

"Makefile", line 8: Need an operator

You need to use GNU make, which may well be installed on your system as gmake:

gmake
gmake install
gmake installcheck

If you encounter an error such as:

make: pg_config: Command not found

Be sure that you have pg_config installed and in your path. If you used a package management system such as RPM to install PostgreSQL, be sure that the -devel package is also installed. If necessary tell the build process where to find it:

env PG_CONFIG=/path/to/pg_config make && make installcheck && make install

If you encounter an error such as:

ERROR:  must be owner of database regression

You need to run the test suite using a super user, such as the default "postgres" super user:

make installcheck PGUSER=postgres

Once psl is installed you can add it to a database by running, as a superuser:

CREATE EXTENSION psl;

Psl uses a compiled-in copy of the public suffix list, with no way to dynamically update it after it has been built. A snapshot is included in the distributed soure, but you can update that to the latest version by running make fetch.

Usage

registered_domain() will return the enclosing domain for any hostname, folded to lower case.

For a registered domain it will return the domain itself. For a top level domain or a hostname without periods it will return null.

As a special case, if passed an apparently correct hostname with a top level domain it doesn't recognize it will return the final two components of the hostname.

steve=# select registered_domain('foo.bar.blighty.com');
 registered_domain
-------------------
 blighty.com
(1 row)

steve=# select registered_domain('blighty.co.uk');
 registered_domain
-------------------
 blighty.co.uk
(1 row)

steve=# select registered_domain('www.blighty.co.uk');
 registered_domain
-------------------
 blighty.co.uk
(1 row)

steve=# select registered_domain('co.uk');
 registered_domain
-------------------

(1 row)

steve=# select registered_domain('co.uk.ie');
 registered_domain
-------------------
 uk.ie
(1 row)

Bugs

The upstream code from regdom-libs is broken in that the PHP code used to preprocess the PSL for the C code to read errors out. Attempting to fix that causes the embedded PSL to be corrupt in a way that causes things to SEGV. Either it needs to be fixed or replaced.

Copyright and License

Copyright 2018 Steve Atkins

This module is free software; you can redistribute it and/or modify it under the PostgreSQL License.

The core functionality is from regdom-libs, code released under the Apache license.

Test vectors were taken from libpsl, under MIT license.