lelit/pglast

Double Quotes NOT preserved after parsing

Closed this issue · 2 comments

I'm using pglast do generate dummy data for postgres dbs - pgdummy
Some tables have column names with double quote. Once the AST is created , the quote is not preserved.
Eg.

create table testquote (
    ...:     "K_1" integer,
    ...:     k_2 integer
    ...: );

Parsed AST has just 'colname': 'K_1', .. Is there anyway I can identify that the column needs to be quoted?

{'@': 'CreateStmt',
 'if_not_exists': False,
 'oncommit': {'#': 'OnCommitAction', 'name': 'ONCOMMIT_NOOP', 'value': 0},
 'relation': {'@': 'RangeVar',
              'inh': True,
              'location': 13,
              'relname': 'testquote',
              'relpersistence': 'p'},
 'tableElts': ({'@': 'ColumnDef',
                'colname': 'K_1',
                'generated': '\x00',
                'identity': '\x00',
                'inhcount': 0,
                'is_from_type': False,
                'is_local': True,
                'is_not_null': False,
                'location': 29,
                'storage': '\x00',
                'typeName': {'@': 'TypeName',
                             'location': 35,
                             'names': ({'@': 'String', 'val': 'pg_catalog'},
                                       {'@': 'String', 'val': 'int4'}),
                             'pct_type': False,
                             'setof': False,
                             'typemod': -1}},)}
lelit commented

The double quotes around an identifier are not part of the name, so it should not surprise they don't appear in the AST.
The logic used by pglast to determine weather a given name should be quoted or not is relatively simple. Basically, a simple name that does not need to be double quoted is one that

  1. starts with a lowercase letter
  2. possibly followed by lowercase letters, digits or underscores
  3. it's not a RESERVED_KEYWORDS
  4. it's not a TYPE_FUNC_NAME_KEYWORDS

This is somewhat stricter than the exact rules.

For example:

$ pgpp -S 'create table testquote ("K_1" integer, k_2 integer, "array" integer, abort integer)'
CREATE TABLE testquote (
    "K_1" integer
  , k_2 integer
  , "array" integer
  , abort integer
)

and the related serialized tree:

$ pgpp -tS 'create table testquote ("K_1" integer, k_2 integer, "array" integer, abort integer)'
[{'@': 'RawStmt',
  'stmt': {'@': 'CreateStmt',
           'if_not_exists': False,
           'oncommit': {'#': 'OnCommitAction',
                        'name': 'ONCOMMIT_NOOP',
                        'value': 0},
           'relation': {'@': 'RangeVar',
                        'inh': True,
                        'location': 13,
                        'relname': 'testquote',
                        'relpersistence': 'p'},
           'tableElts': ({'@': 'ColumnDef',
                          'colname': 'K_1',
                          'generated': '\x00',
                          'identity': '\x00',
                          'inhcount': 0,
                          'is_from_type': False,
                          'is_local': True,
                          'is_not_null': False,
                          'location': 24,
                          'storage': '\x00',
                          'typeName': {'@': 'TypeName',
                                       'location': 30,
                                       'names': ({'@': 'String',
                                                  'val': 'pg_catalog'},
                                                 {'@': 'String',
                                                  'val': 'int4'}),
                                       'pct_type': False,
                                       'setof': False,
                                       'typemod': -1}},
                         {'@': 'ColumnDef',
                          'colname': 'k_2',
                          'generated': '\x00',
                          'identity': '\x00',
                          'inhcount': 0,
                          'is_from_type': False,
                          'is_local': True,
                          'is_not_null': False,
                          'location': 39,
                          'storage': '\x00',
                          'typeName': {'@': 'TypeName',
                                       'location': 43,
                                       'names': ({'@': 'String',
                                                  'val': 'pg_catalog'},
                                                 {'@': 'String',
                                                  'val': 'int4'}),
                                       'pct_type': False,
                                       'setof': False,
                                       'typemod': -1}},
                         {'@': 'ColumnDef',
                          'colname': 'array',
                          'generated': '\x00',
                          'identity': '\x00',
                          'inhcount': 0,
                          'is_from_type': False,
                          'is_local': True,
                          'is_not_null': False,
                          'location': 52,
                          'storage': '\x00',
                          'typeName': {'@': 'TypeName',
                                       'location': 60,
                                       'names': ({'@': 'String',
                                                  'val': 'pg_catalog'},
                                                 {'@': 'String',
                                                  'val': 'int4'}),
                                       'pct_type': False,
                                       'setof': False,
                                       'typemod': -1}},
                         {'@': 'ColumnDef',
                          'colname': 'abort',
                          'generated': '\x00',
                          'identity': '\x00',
                          'inhcount': 0,
                          'is_from_type': False,
                          'is_local': True,
                          'is_not_null': False,
                          'location': 69,
                          'storage': '\x00',
                          'typeName': {'@': 'TypeName',
                                       'location': 75,
                                       'names': ({'@': 'String',
                                                  'val': 'pg_catalog'},
                                                 {'@': 'String',
                                                  'val': 'int4'}),
                                       'pct_type': False,
                                       'setof': False,
                                       'typemod': -1}})},
  'stmt_len': 0,
  'stmt_location': 0}]

Thanks for the detailed explanation @lelit .. I'll just use the function maybe_double_quote_name .. Works for me !!