kensho-technologies/graphql-compiler

MSSQL maximum identifier length exceeded

Opened this issue · 2 comments

The maximum identifier (column, alias, or cte name) length in mssql is 128 characters. We currently exceed that limit on queries that are not too crazy in various ways:

  1. Columns exposed in a cte are labelled with "_".join(vertex_path) + "__" + used_column_name. This would exceed the character limit in a query with a deep traversal before a @recurse is used.
  2. In fold scopes we prepend folded_subquery_ to existing mssql identifiers. This can only be a problem if the existing column name is very long. Looks unlikely and low-priority to me.
  3. For table aliases we append _1, _2, etc to an existing table name. This is also unlikely to cause problems, so it's an extremely low-priority issue.

To address the cte column labeling problem, we can simply choose arbitrary or numbered labels. It will work as long as they are unique.

@chewselene please correct me if I'm missing something.

The identifier limit in OracleDB is 30 characters.

Do we have type information about the scope identified by such an identifier? E.g. would it be easy to name locations of type Foo with names like Foo_1, Foo_2, Foo_3? While I like the vertex_path naming approach since it makes locations structurally obvious, it's going to be near-impossible to keep that below 30 chars.

We may still need to add some logic to keep identifiers distinct and below 30 chars, but I suspect this is perhaps our best long-term approach.