General Flattening - Field name recommendations
Review Feedback
MT001: Flattening of nested structures
applies to General Flattening of nested structures
and GeoJSON Encoding Rule for INSPIRE Addresses
Field (property) name recommendation
To promote the greatest level of interoperability in mainstream web and desktop GIS clients, the following items for field names are submitted for consideration:
- Model transformation rule:
- Reserved characters, brackets, and symbols should not be incorporated in field names [1]
- Avoid spaces and other characters not commonly supported in field names, e.g. hyphens, parentheses, brackets, and symbols such as $, ., %, and #
- Separator
- Recommend using an underscore “_” as separator (essentially, eliminate anything that is not alphanumeric or an underscore)
- Reserved keywords should be avoided [1][2]
- Underscore at beginning of field names should be avoided [1]
- Known usability issue: Character length of resulting property names
- Keep field names within the maximum lengths imposed by different databases (e.g., the 30-character limit imposed by Oracle) [3]
- In compound field names, consider abbreviating the leading property names using common abbreviation rules or a well-documented code list, while keeping the terminal part of the property path unchanged
[1] ArcGIS Field Naming Guidelines https://desktop.arcgis.com/en/arcmap/latest/manage-data/tables/fundamentals-of-adding-and-deleting-fields.htm#GUID-8E190093-8F8F-4132-AF4F-B0C9220F76B3
[2] List of reserved words in Access 2002 and in later versions of Access https://support.microsoft.com/en-us/help/286335/list-of-reserved-words-in-access-2002-and-in-later-versions-of-access
[3] ArcGIS Name Length Guide - Field Column Name https://pro.arcgis.com/en/pro-app/help/data/databases/database-data-and-arcgis.htm
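As a rough illustration of the character, separator, reserved-keyword, and leading-underscore recommendations above, a field name clean-up step could look like the following sketch. The function name `sanitize_field_name`, the `_field` suffix, and the short `RESERVED_WORDS` set are assumptions made for this example only; a real list of reserved words would have to come from the documentation of the target systems [1][2].

```python
import re

# Hypothetical, deliberately tiny sample of reserved words (assumption);
# the full lists are in the documentation of the target systems.
RESERVED_WORDS = {"select", "from", "where", "order", "group", "table"}


def sanitize_field_name(name: str) -> str:
    """Return a field name containing only ASCII letters, digits and underscores."""
    # Replace every run of characters that is not alphanumeric or an
    # underscore (spaces, hyphens, brackets, $, ., %, # ...) with one underscore.
    cleaned = re.sub(r"[^0-9A-Za-z_]+", "_", name)
    # Collapse repeated underscores, then drop leading and trailing ones.
    cleaned = re.sub(r"_+", "_", cleaned).strip("_")
    if not cleaned:
        raise ValueError(f"no usable characters in field name: {name!r}")
    # Avoid reserved keywords by appending a suffix (the suffix is an assumption).
    if cleaned.lower() in RESERVED_WORDS:
        cleaned = f"{cleaned}_field"
    return cleaned


print(sanitize_field_name("address.locator-designator (value)"))
print(sanitize_field_name("_id"))
print(sanitize_field_name("select"))
```

The sketch does not handle names that start with a digit or name collisions after clean-up; both would need an explicit rule.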
@jsaligoe we had discussed using a dot vs. an underscore to separate field names, and decided to go with a dot as the default option for this encoding. The reason is that we wanted to stay in line with JSON Pointer syntax. If the dot is a major problem for client-side processing in ArcMap/AGOL/ArcGIS Pro, we should re-evaluate this.
When using an RDBMS such as Oracle, I would not recommend storing fully flattened structures. How to map the conceptual models to effective relational models is in the scope of a different project.
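To make the separator question concrete, here is a minimal sketch of flattening a nested object with a configurable separator. It is not the encoding rule itself: the function name `flatten` and the sample property names are assumptions, and arrays are left untouched for simplicity.

```python
def flatten(obj: dict, separator: str = "_", prefix: str = "") -> dict:
    """Flatten nested dictionaries into one level, joining keys with `separator`."""
    flat = {}
    for key, value in obj.items():
        # Build the compound name from the path walked so far.
        name = f"{prefix}{separator}{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects and merge their flattened fields.
            flat.update(flatten(value, separator, name))
        else:
            # Scalars (and, in this sketch, lists) are copied as-is.
            flat[name] = value
    return flat


# Hypothetical sample properties, for illustration only.
properties = {"id": 1, "address": {"locator": {"designator": "12"}}}

print(flatten(properties, separator="_"))
print(flatten(properties, separator="."))
```

Only the separator argument changes between the two calls, so switching the default is a small change on the encoding side; the cost lies with clients that cannot handle one of the two characters in field names.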
We also found issues in other client tools (e.g. pygeoapi) with the dot as separator and would therefore propose to switch to underscores.
OK, I will switch back to underscores then to optimize data usability.
I recommend creating a follow-up ticket for additional model transformation rules focused on shortening field names.
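As a starting point for such a ticket, the shortening idea from the review feedback (abbreviate the leading parts of a compound name, keep the terminal part) could be sketched as follows. The `ABBREVIATIONS` table, the function name `shorten_field_name`, and the default limit of 30 characters are assumptions for illustration; an agreed code list would replace the table.

```python
# Hypothetical abbreviation table (assumption); a documented code list
# would take its place in a real rule.
ABBREVIATIONS = {"address": "addr", "locator": "loc", "designator": "desig"}


def shorten_field_name(name: str, max_length: int = 30, separator: str = "_") -> str:
    """Abbreviate the leading parts of a compound name; keep the terminal part."""
    if len(name) <= max_length:
        return name  # already short enough, leave untouched
    parts = name.split(separator)
    leading, terminal = parts[:-1], parts[-1]
    # Substitute known abbreviations in the leading parts only.
    shortened = [ABBREVIATIONS.get(part.lower(), part) for part in leading]
    candidate = separator.join(shortened + [terminal])
    if len(candidate) > max_length:
        raise ValueError(f"cannot shorten {name!r} to {max_length} characters")
    return candidate


print(shorten_field_name("address_locator_designator_type"))
```

Splitting on the separator assumes that the original property names do not themselves contain underscores; if they do, the shortening would have to operate on the property path before it is joined.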