tizoc/ppx_pgsql

ppx_pgsql's nullability heuristic has failed for outer joins

NightBlues opened this issue · 6 comments

"ppx_pgsql's nullability heuristic has failed" for outer joins
For example if we have 2 tables:

CREATE TABLE authors (id serial PRIMARY KEY, name varchar(255) NOT NULL);
INSERT INTO authors (id, name) VALUES (1, 'John Doe');
CREATE TABLE books (id serial PRIMARY KEY, title varchar(255) NOT NULL, author int NOT NULL REFERENCES authors(id) ON DELETE CASCADE);

Following query will have type string * string instead of string * (string option):

[%sqlf {|
SELECT
 authors.name,
 books.title
FROM authors
LEFT OUTER JOIN books ON books.author = authors.id
|}]

Because books.title is NOT NULL, but author without any book will cause nullability heuristic has failed
May be there is a way to advice nullability for ppx_pgsql?

tizoc commented

Thank you @NightBlues. This is a bit complicated, and I don't have a good solution right now, will have to check during the weekend to see if I can figure it out.

Meanwhile, something that you can do is to create views for such joins, and manually set the nullable property in postgres for each column (or just use the view as is, by default everything is nullable).

Here is the query that I use to make every column on a view non-nullable:

UPDATE pg_attribute SET attnotnull = 't'
 WHERE attrelid IN (
   SELECT oid FROM pg_class
    WHERE relname = 'name_of_view');

You can find documentation on the pg_attribute table here: https://www.postgresql.org/docs/11/catalog-pg-attribute.html

tizoc commented

To add a bit more of information. The reason modifying the contents of that table works is that it is from where ppx_pgsql gets the column type information. When you create a view, it gets a new ID, and all it's columns get new entries in that table. By modifying those entries you control what ppx_pgsql sees when fetching information for the columns on that view.

Thank you for workaround:)

There is more simple solution:

[%sqlf {|
SELECT
 authors.name,
 coalesce(books.title)
FROM authors
LEFT OUTER JOIN books ON books.author = authors.id
|}]

coalesce returns first non null value or null if all values are null, so we just confuse postgres with this function call:)

tizoc commented

Very interesting find! It didn't occur to me to use coalesce, I would have expected for postgres to infer the type of the call to be the same as the input (so, still nullable).

As I understand - the reason is that it becomes a calculated value instead of field of table, so describe returns None here https://github.com/darioteixeira/pgocaml/blob/master/src/PGOCaml_generic.ml#L1504
Also, I've found that pgocaml now has ppx and it has a flag for disabling heuristic (https://github.com/darioteixeira/pgocaml/blob/master/ppx/ppx_pgsql.ml#L159)
but I can't use it because its seems to be not compatible with dune (or I don't understand how to use it from dune) :)