lelit/pglast

is_not_null always false when visiting ColumnDef

Closed this issue · 4 comments

Hi there,

First, thanks for this awesome project.

I am using visitors pattern to visit constraint node but I noticed is_not_null is always false irrespective of not null defined on the column. Is this the expectation or may it is a bug?

Example

CREATE TABLE measurement (updated_at timestamptz not null);
def visit_ColumnDef(
    self,
    ancestors: ast.Node,
    node: ast.ColumnDef,
) -> None:
    """Visit ColumnDef."""
    print(node)
    print(node.is_not_null)

output

<ColumnDef colname='updated_at' typeName=<TypeName names=(<String sval='timestamptz'>,) setof=False pct_type=False typemod=-1> inhcount=0 is_local=True is_not_null=False is_from_type=False storage='\x00' identity='\x00' generated='\x00' constraints=(<Constraint contype=<ConstrType.CONSTR_NOTNULL: 1> deferrable=False initdeferred=False is_no_inherit=False generated_when='\x00' nulls_not_distinct=False reset_default_tblspc=False fk_matchtype='\x00' fk_upd_action='\x00' fk_del_action='\x00' skip_validation=False initially_valid=False>,)>

False

This is the class representation

class pglast.ast.ColumnDef(colname=None, typeName=None, inhcount=None, is_local=None, is_not_null=None, is_from_type=None, storage=None, raw_default=None, cooked_default=None, identity=None, identitySequence=None, generated=None, collClause=None, constraints=None, fdwoptions=None, location=None)

Thanks.

lelit commented

TL;DR: That's a good question, but I cannot say if it is indeed a bug or not, I will have to forward the question to the underlying libpgquery team.

In the meantime, this is my reasoning: the ColumnDef structure effectively contains a is_not_null flag, but apparently it is never used (neither set nor queried) by the underlying parser.

This is what pglast obtains from the parser for your statement:

$ pgpp -tS "CREATE TABLE measurement (updated_at timestamptz not null);"
[{'@': 'RawStmt',
  'stmt': {'@': 'CreateStmt',
           'if_not_exists': False,
           'oncommit': {'#': 'OnCommitAction',
                        'name': 'ONCOMMIT_NOOP',
                        'value': 0},
           'relation': {'@': 'RangeVar',
                        'inh': True,
                        'location': 13,
                        'relname': 'measurement',
                        'relpersistence': 'p'},
           'tableElts': ({'@': 'ColumnDef',
                          'colname': 'updated_at',
                          'constraints': ({'@': 'Constraint',
                                           'contype': {'#': 'ConstrType',
                                                       'name': 'CONSTR_NOTNULL',
                                                       'value': 1},
                                           'deferrable': False,
                                           'fk_del_action': '\x00',
                                           'fk_matchtype': '\x00',
                                           'fk_upd_action': '\x00',
                                           'generated_when': '\x00',
                                           'initdeferred': False,
                                           'initially_valid': False,
                                           'is_no_inherit': False,
                                           'location': 49,
                                           'nulls_not_distinct': False,
                                           'reset_default_tblspc': False,
                                           'skip_validation': False},),
                          'generated': '\x00',
                          'identity': '\x00',
                          'inhcount': 0,
                          'is_from_type': False,
                          'is_local': True,
                          'is_not_null': False,
                          'location': 26,
                          'storage': '\x00',
                          'typeName': {'@': 'TypeName',
                                       'location': 37,
                                       'names': ({'@': 'String',
                                                  'sval': 'timestamptz'},),
                                       'pct_type': False,
                                       'setof': False,
                                       'typemod': -1}},)},
  'stmt_len': 58,
  'stmt_location': 0}]

and this is what it gets removing that not null constraint:

$ pgpp -tS "CREATE TABLE measurement (updated_at timestamptz);"
[{'@': 'RawStmt',
  'stmt': {'@': 'CreateStmt',
           'if_not_exists': False,
           'oncommit': {'#': 'OnCommitAction',
                        'name': 'ONCOMMIT_NOOP',
                        'value': 0},
           'relation': {'@': 'RangeVar',
                        'inh': True,
                        'location': 13,
                        'relname': 'measurement',
                        'relpersistence': 'p'},
           'tableElts': ({'@': 'ColumnDef',
                          'colname': 'updated_at',
                          'generated': '\x00',
                          'identity': '\x00',
                          'inhcount': 0,
                          'is_from_type': False,
                          'is_local': True,
                          'is_not_null': False,
                          'location': 26,
                          'storage': '\x00',
                          'typeName': {'@': 'TypeName',
                                       'location': 37,
                                       'names': ({'@': 'String',
                                                  'sval': 'timestamptz'},),
                                       'pct_type': False,
                                       'setof': False,
                                       'typemod': -1}},)},
  'stmt_len': 49,
  'stmt_location': 0}]

as you can see, the main difference is the presence/absence of the CONSTR_NOTNULL constraint, but the is_not_null flag is always left to false.

Looking down in the pristine PostgreSQL sources, I see that that transformCreateStmt() function mentioned in the ColumnDef's docstring is indeed one of the few points that sets that flag: here it calls transformColumnDefinition() that in turn updates that flag analyzing all the column's constraints.

For some reason, libpgquery variant of the PG parser does not call that transformation: from the point of view of pglast this does not make a big difference, because what it needs to reprint the statement is all there, so those flags are probably "just" an optimization needed by some other PG layer.

As said, I will ask an opinion to the libpgquery team.

lelit commented

As explained by Lukas, that flag is indeed set by a later analysis step done by PG after that parse of the statement.
For the sake of consistency pglast could mimic those further steps, but I sincerely doubt it's worth the effort. WDYT?

Hi @lelit, thanks for the feedback and the explanation. Currently, one could check if there exists not null constraint in the column definition and use that.

lelit commented

Right.