FamilySearch/GEDCOM

Shortening tags is not important.

Closed this issue · 2 comments

Might be worthwhile to unlearn the tendency to shrink the size of tags. There is no sacredness to the number four, and indeed, all along we've had three (SEX) and five (underscore plus four).

Now there is at least one more SCHMA. What is the benefit of removing the 'E'? Humans don't often read GEDCOM directly, but some of us do, and too much push to shorten tags is at best an inconvenience to those who aren't native speakers of English.

I'm not suggesting the other extreme of having really long ones, but four should not be the goal.
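
For anyone who wants to check the tag lengths mentioned above, here is a minimal Python sketch that groups the tags in a GEDCOM snippet by length. The sample text, the `_ABCD` extension tag, and the helper names `tag_of` and `tags_by_length` are made up for illustration. The sample is not meant to be a complete or validated GEDCOM file.

```python
from collections import defaultdict

# Illustrative snippet only: tags named in this thread plus a few common ones.
# _ABCD is a hypothetical extension tag (underscore plus four letters).
SAMPLE = """\
0 HEAD
1 SCHMA
0 @I1@ INDI
1 SEX F
1 BIRT
2 DATE 1 JAN 1900
3 PHRASE New Year's Day 1900
2 SDATE 1 JAN 1900
1 _ABCD example
0 TRLR
"""


def tag_of(line):
    """Return the tag of a line shaped like: level [xref] tag [value]."""
    parts = line.split(" ", 3)
    # An optional cross-reference identifier such as @I1@ precedes the tag.
    if len(parts) > 2 and parts[1].startswith("@"):
        return parts[2]
    return parts[1]


def tags_by_length(text):
    """Map each tag length to the sorted list of distinct tags of that length."""
    groups = defaultdict(set)
    for line in text.splitlines():
        if line.strip():
            tag = tag_of(line)
            groups[len(tag)].add(tag)
    return {n: sorted(tags) for n, tags in sorted(groups.items())}


if __name__ == "__main__":
    for length, tags in tags_by_length(SAMPLE).items():
        print(length, " ".join(tags))
```

On this sample the tags fall into groups of three, four, five, and six characters, so four is already only one length among several.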

There comes a point where tag length 'feels' too long, given what is required for expressiveness and for differentiating a tag from other tags. Everyone will have a different view on this. For me, 6 characters does feel longer than necessary. This is reinforced by the fact that I write raw GEDCOM as well as read it. There is also something to be said for having tags that are similar in length, so that the components align across lines and the cognitive burden of reading them is minimised.

The steering committee has generally tried to balance the desire to be brief, the desire to match the style of previous versions, and the desire to be expressive when read by English speakers. For example, when introducing PHRASE we spelled it out in full, but when introducing SDATE we abbreviated "Sort date". However, more important than any of these has been avoiding ambiguity and name conflicts, which brings us to the specific example:

> Now there is at least one more SCHMA. What is the benefit of removing the 'E'?

GEDCOM 5.3 (only; not the versions before or after it) had a SCHEMA tag with different semantics. We picked a different tag to avoid potential name collisions with old files.