FamilySearch/GEDCOM

How to best include/define `TYPE` enumerations for Attributes and Events


Part of creating a better set of Attributes and Events that covers both Genealogy and Family History in a future GEDCOM is doing a better job of bringing together "like" attributes and events under the same category.

This might include, for example, combining multiple states of Marriage (Common Law, Religious, Civil), Burial (Cremation, Inhumation, Crypt, At Sea), or Death (Natural, Murder, Suicide, Combat), or expanding attributes that may be defined as an "unstructured list", such as Physical Description (DSCR).

One issue we all face is that we cannot know in advance every state of a particular event/attribute or, in the case of an unstructured list, every specific item; internationalization adds to the problem.

Currently, every Attribute (except DSCR) is structured as the attribute tag (for example EDUC) followed by text, such as "Masters of Science".

Events, on the other hand, are currently limited to the "exact meaning of the tag". As more Family History tags get added (various military-related events, different types of personal relationships, burial types, death types), using the TYPE tag to extend the meaning is warranted, both to reduce the number of tags and to better categorize events.

A GEDCOM could look like this:

1 MARR
2 TYPE Common Law

The question is: How to define in the document the enumeration and the possibility of having an "other" and phrase for concepts not thought about today?

GEDCOM v7 currently uses enumeration sets for multiple tags, for example:

g7:enumset-MEDI

| Value | Meaning |
|---|---|
| AUDIO | An audio recording |
| BOOK | A bound book |
| CARD | A card or file entry |
| ELECTRONIC | A digital artifact |
| FICHE | Microfiche |
| FILM | Microfilm |
| MAGAZINE | Printed periodical |
| MANUSCRIPT | Written pages |
| MAP | Cartographic map |
| NEWSPAPER | Printed newspaper |
| PHOTO | Photograph |
| TOMBSTONE | Burial marker or related memorial |
| VIDEO | Motion picture recording |
| OTHER | A value not listed here; should have a PHRASE substructure |

With GEDCOM Structure definition:

+1 FILE <Special>                            {1:M}  g7:FILE
    +2 FORM <MediaType>                      {1:1}  g7:FORM
       +3 MEDI <Enum>                        {0:1}  g7:MEDI
           +4 PHRASE <Text>                  {0:1}  g7:PHRASE
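
The rule in the last row of the table (OTHER should be accompanied by a PHRASE) can be sketched in a few lines of Python. This is only an illustration, not part of the specification: `check_enum` is a hypothetical helper, and treating the missing PHRASE as a reportable problem rather than a hard error is my assumption based on the word "should".

```python
# Values copied from the g7:enumset-MEDI table above.
MEDI_VALUES = {
    "AUDIO", "BOOK", "CARD", "ELECTRONIC", "FICHE", "FILM", "MAGAZINE",
    "MANUSCRIPT", "MAP", "NEWSPAPER", "PHOTO", "TOMBSTONE", "VIDEO", "OTHER",
}


def check_enum(value, phrase, allowed):
    """Return a list of problems for one enumerated payload (hypothetical helper)."""
    problems = []
    if value not in allowed:
        # Comparison is case-sensitive here, which is an assumption of this sketch.
        problems.append(f"{value!r} is not in the enumeration set")
    elif value == "OTHER" and not phrase:
        # The table says OTHER "should have" a PHRASE, so report it as a problem.
        problems.append("OTHER should have a PHRASE substructure")
    return problems
```

With this sketch, `check_enum("BOOK", None, MEDI_VALUES)` returns an empty list, while `check_enum("OTHER", None, MEDI_VALUES)` reports the missing PHRASE.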

Proposed Change

Each event/attribute that uses an enumerated `TYPE` would need a similar layout for its tag, for example:

n DSCR <Text>                              {1:1}  g7:DSCR
  +1 TYPE <Enum>                           {0:1}  g7:DSCR.TYPE
       +2 PHRASE <Text>                    {0:1}  g7:PHRASE
  +1 <<INDIVIDUAL_EVENT_DETAIL>>           {0:1}

Producing:

1 DSCR Brown
2 TYPE Hair Color

or

1 DSCR Right Cheek
2 TYPE OTHER
3 PHRASE Scar

Sorry for the long story!

Maybe a better example could be:

A GEDCOM could look like this:

1 MARR
2 TYPE COMMON-LAW

Documentation

g7:enumset-MARR.TYPE

| Value | Meaning |
|---|---|
| COMMON-LAW | Common Law Marriage |
| RELIGIOUS | Religious Ceremony |
| CIVIL | Civil Ceremony |
| PARTNER | Partnership or Companionship |
| GROUP | Multiple Men and Women marry at the same time but as a Monogamous Marriage |
| MIXED | An inter-racial marriage, a term found in older documentation |
| OTHER | A value not listed here; should have a PHRASE substructure |

With GEDCOM Structure definition:

n MARR [Y|<NULL>]                          {1:1}  g7:MARR
  +1 TYPE <Enum>                           {0:1}  g7:MARR-TYPE
       +2 PHRASE <Text>                    {0:1}  g7:PHRASE
  +1 <<FAMILY_EVENT_DETAIL>>               {0:1}

Producing:

1 MARR 
2 TYPE CIVIL

or

1 MARR 
2 TYPE OTHER
3 PHRASE Arranged
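
To make the "one enumeration set per tag" idea concrete, here is a rough Python sketch of how a reader might pick the enumeration set for a `TYPE` from its parent tag. Everything named here (`TYPE_ENUMSETS`, `parse_lines`, `type_problems`) is hypothetical. The parser is deliberately minimal: it assumes lines of the form `level tag payload` and does not handle cross-reference identifiers or CONT lines. Matching is case-sensitive, which is also an assumption, so a lowercase `other` would be reported as not enumerated.

```python
# Hypothetical registry: which enumeration set governs TYPE under each parent tag.
TYPE_ENUMSETS = {
    "MARR": {"COMMON-LAW", "RELIGIOUS", "CIVIL", "PARTNER", "GROUP", "MIXED", "OTHER"},
}


def parse_lines(text):
    """Split GEDCOM lines into (level, tag, payload) tuples; no xrefs or CONT handling."""
    records = []
    for line in text.strip().splitlines():
        parts = line.strip().split(" ", 2)
        level, tag = int(parts[0]), parts[1]
        payload = parts[2] if len(parts) > 2 else ""
        records.append((level, tag, payload))
    return records


def type_problems(text):
    """Report TYPE payloads that are not in the enumeration set of their parent tag."""
    problems = []
    parents = {}  # level -> tag of the most recent structure seen at that level
    records = parse_lines(text)
    for i, (level, tag, payload) in enumerate(records):
        parents[level] = tag
        parent = parents.get(level - 1)
        if tag != "TYPE" or parent not in TYPE_ENUMSETS:
            continue
        if payload not in TYPE_ENUMSETS[parent]:
            problems.append(f"{parent}.TYPE {payload!r} is not enumerated")
        elif payload == "OTHER":
            # PHRASE is the only substructure of TYPE in the layout above,
            # so checking just the next record is enough for this sketch.
            nxt = records[i + 1] if i + 1 < len(records) else None
            if nxt is None or nxt[:2] != (level + 1, "PHRASE"):
                problems.append(f"{parent}.TYPE OTHER should have a PHRASE")
    return problems
```

Keying the registry on the parent tag is what lets `TYPE` mean something different under MARR than under DSCR, at the cost of one table per event or attribute.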

I like this proposal; the current system seems messy. Note that it is backwards-incompatible and, as described, could not be implemented until 8.0. Adding a new enumerated TYPE-like tag (maybe KIND?) could be done in 7.1.

I see this as effectively introducing a two-layer type hierarchy. There's type MARR, with subtypes COMMON-LAW and RELIGIOUS and so on. I proposed a different hierarchy in #290 with a similar intent. Two caveats come to mind with a type hierarchy.

  1. To be more useful than a flat set, applications understanding the supertype but not the subtype must be able to operate on the subtype as if it were the supertype without any error. For example, MARR.DATE must have the same meaning (both denotative and connotative) for a PARTNER as it does for a CIVIL. I think this is true for the examples given here, but would want to review each substructure carefully to make certain. In general, it requires some care in design to ensure it remains true.
  2. Once a hierarchy exists, any finite depth limit will seem arbitrary. For example, three levels would allow RELIGIOUS_RITE → CHR → CHRA. This proposal says two is enough because that's what a TYPE allows; admittedly I don't have a lot of examples beyond those two, but it does bear consideration because more depth could provide more flexibility.
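
Both caveats can be illustrated with a small Python sketch. It is only a thought experiment: the `PARENT` table and `resolve` function are hypothetical, and the links in the table are taken from the three-level example above rather than from any existing GEDCOM definition. The function walks from a subtype up to the nearest type the application understands (caveat 1), and nothing in it depends on how deep the hierarchy is (caveat 2).

```python
# Hypothetical subtype -> supertype links, using the three-level example above.
PARENT = {
    "CHRA": "CHR",
    "CHR": "RELIGIOUS_RITE",
}


def resolve(tag, understood):
    """Walk up the hierarchy until reaching a type the application understands.

    Returns the nearest understood type, or None if no ancestor is understood.
    """
    seen = set()  # guards against accidental cycles in the table
    while tag is not None and tag not in seen:
        if tag in understood:
            return tag
        seen.add(tag)
        tag = PARENT.get(tag)
    return None
```

An application that only knows CHR would get `resolve("CHRA", {"CHR"})` returning `"CHR"` and could then treat the structure as a CHR, which is only safe if every substructure keeps the same meaning at both levels.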