FamilySearch/GEDCOM

EXID.TYPE values for FamilySearch

Since the spec is the FamilySearch GEDCOM 7 specification, I would expect it to specify somewhere what the EXID.TYPE value is for EXIDs that point to FamilySearch. https://gedcom.io/migrate/#afn-rfn-rin does so for AFNs, but Ancestral File is obsolete.

FamilySearch has at least the following types of relevant identifiers:

| ID | Possible URI | Possible GEDCOM 7 use |
| --- | --- | --- |
| Person id (pid) | https://api.familysearch.org/platform/tree/persons | <<INDIVIDUAL_RECORD>>.EXID |
| Place id (pid) | https://api.familysearch.org/platform/places | PLAC.EXID |
| Memory id (mid) | https://api.familysearch.org/platform/memories/memories | <<OBJECT_RECORD>>.EXID |
| Source description id (sdid) | https://api.familysearch.org/platform/sources/descriptions | <<SOURCE_RECORD>>.EXID |
| User id (uid) | https://api.familysearch.org/platform/users/agents | <<SUBMITTER_RECORD>>.EXID |
etc.

Options for FamilySearch URI(s):
A) The API URI, as shown in the table.
B) The web interface URI, e.g. https://www.familysearch.org/tree/person/details/ for Person id.
C) Some FamilySearch URI that does not resolve to an actual page but is still specific to the type of ID.
D) Some generic FamilySearch URL that is the same for all IDs, such as https://www.familysearch.org

The spec says about EXID.TYPE:

The authority issuing the EXID (p.66), represented as a URI. It is recommended that this
be a URL. If the authority maintains stable URLs for each identifier it issues, it is recommended
that the TYPE (p.82) payload be selected such that appending the EXID payload to it
yields that URL. However, this is not required and a different URI for the set of issued
identifiers may be used instead.

So that recommendation would argue for either option A or B. However, FamilySearch has been known to change the URIs when revising the APIs or web interface, so only FamilySearch could say whether they are "stable". I would argue that choosing A or B is desirable, since in the worst case the URIs change and the current URIs then become equivalent to C, as allowed in the last quoted sentence.
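
To make the "append the EXID payload to the TYPE payload" recommendation concrete, here is a minimal Python sketch. The function name `exid_url` is hypothetical, `ABCD-123` is a placeholder identifier, and the trailing slash on the option A URI is an assumption (the table lists that URI without one).

```python
def exid_url(exid_type: str, exid: str) -> str:
    """Build a URL by appending the EXID payload to the TYPE payload."""
    return exid_type + exid


# Option A: API URI from the table; the trailing slash is an assumption.
OPTION_A_PERSON = "https://api.familysearch.org/platform/tree/persons/"
# Option B: web interface URI as quoted in the options list.
OPTION_B_PERSON = "https://www.familysearch.org/tree/person/details/"

# ABCD-123 is a placeholder, not a real person id.
url_a = exid_url(OPTION_A_PERSON, "ABCD-123")
url_b = exid_url(OPTION_B_PERSON, "ABCD-123")
print(url_a)
print(url_b)
```

Whether either resulting URL resolves, and for how long, is exactly the stability question raised above.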

My preference is for option A for consistency, since I don't think option B has URLs for all of the IDs in the table, only some of them.

GEDCOM Steering Committee discussion 3/15/2022: a hybrid of A and C would be to define new URIs/URLs and, where the web infrastructure allows it, have them redirect to the actual URIs (say, of type A). Russ to check with Jimmy and Gordon.

The only persistent FamilySearch URIs are the ones with an /ark: in the URI.

Tree: https://www.familysearch.org/ark:/61903/4:1:KWJF-Z6P
Record Persona: https://www.familysearch.org/ark:/61903/1:1:QKWL-NFF5
Record Image: https://www.familysearch.org/ark:/61903/3:1:3Q9M-CSGZ-SS5B-X

Notice that the record type is indicated by the prefix before the ID:

  • 4:1 (tree)
  • 1:1 (record persona)
  • 3:1 (record image)

There are other record types (like Genealogies person). Perhaps Randy Wilson would be able to provide a more comprehensive structure.

I read Jimmy's reply to suggest we can create EXID-TYPE repository entries like

{
    "label": "FSTreePerson",
    "type": "https://www.familysearch.org/ark:/61903/4:1:",
    "description": "FamilySearch Family Tree person identifier",
    "contact": "????@familysearch.org",
    "change-controller": "FamilySearch",
    "reference": "????"
}

Tree would belong with INDI.EXID.

Record Persona would belong either with INDI.EXID or INDI.SOUR.EXID?

Record Image would belong with either a SOUR.EXID or SOUR.OBJE.EXID

It is unclear to me what we should recommend for places, users, and memories.
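
As an illustration of the Tree case only, the following Python sketch emits an individual record whose EXID carries the ark-based TYPE from the proposed registry entry. The helper name `indi_with_exid` and the xref `@I1@` are hypothetical; the identifier is the one from the Tree ark example earlier in this thread.

```python
def indi_with_exid(xref: str, exid: str, exid_type: str) -> list[str]:
    """Return GEDCOM 7 lines for an INDI record with one EXID and its TYPE."""
    return [
        f"0 {xref} INDI",
        f"1 EXID {exid}",
        f"2 TYPE {exid_type}",
    ]


# TYPE payload taken from the proposed FSTreePerson registry entry.
FS_TREE_PERSON = "https://www.familysearch.org/ark:/61903/4:1:"

lines = indi_with_exid("@I1@", "KWJF-Z6P", FS_TREE_PERSON)
print("\n".join(lines))

# Appending the EXID payload to the TYPE payload gives the Tree ark URL.
ark_url = FS_TREE_PERSON + "KWJF-Z6P"
```

With this TYPE, appending the EXID payload reproduces the persistent ark URL, which is what the spec recommends when stable URLs exist.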

From Randy Wilson (top FamilySearch engineer):

We do also have persistent IDs (arks) for persons or trees in “Genealogies” (2:...), and for historical records (1:2:...), though we have never had anything link to the record arks.

While it is handy to recognize the prefixes as indicating various kinds of arks, I would strongly advise against writing any code that depends upon them, except within the service itself that is minting those IDs. FamilySearch’s arks are purposely “opaque”, meaning that we don’t want them to communicate what they “mean”, so that we are not tempted to change (break) them later.

For example, we could have had IDs that had collection names embedded in them; or person names as part of them. But then if we split a collection or move a record to a different collection, or edit the person’s name, then it would bother us that the URL is now misleading. So we use meaningless characters so that we are free to update the data, location, service, database, endpoint, etc., without breaking the links.

As one example, if you were to write code that said “if the URL contains ark:/61903/3:1:, then...” in order to handle images, you would have a bug. Some of our images actually start with “3:2:”. DAS is the system that mints the APIDs that these Arks are dependent upon, so it needs to know the difference. Everyone else should know when they are asking for something that should be a digital artifact URL, and then just use it, without parsing it to understand its pieces.

So I wouldn’t be telling developers what the prefixes mean in FamilySearch. What they really are for is to make it so that whichever system is creating new IDs for things can be sure its IDs are globally unique. If you see “/ark:/” in the URL, then you should understand that the organization is committed to trying really hard not to break those links. Other than that, you should already know what type of thing you have a URL for, and then just store and use that URL without trying to understand its syntax.
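
A rough Python sketch of the advice above: the caller already knows what kind of thing a URL refers to and stores the URL verbatim, and the only syntax check is for the `/ark:/` marker. The names `is_persistent_ark` and `store_link` are hypothetical and not part of any FamilySearch API.

```python
from urllib.parse import urlsplit


def is_persistent_ark(url: str) -> bool:
    """True if the URL path contains '/ark:/'; nothing else is inspected."""
    return "/ark:/" in urlsplit(url).path


def store_link(store: dict, kind: str, url: str) -> None:
    """Record a URL under a kind supplied by the caller.

    The kind is never derived from the URL, so prefixes such as
    3:1 versus 3:2 make no difference here.
    """
    store.setdefault(kind, []).append(url)


links: dict = {}
image_url = "https://www.familysearch.org/ark:/61903/3:1:3Q9M-CSGZ-SS5B-X"
store_link(links, "record image", image_url)
print(is_persistent_ark(image_url))
```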

Discussion 17 MAY 2022: we can host the description on gedcom.io, as is done for AFN. We probably want to include what we know above in the description text. Then, assuming approval and coordination with the team that owns the API URLs in the table at the top (Gregg to follow up with them), we could create a redirect for each category so that an actual URI with a suffix goes to the actual API content.

Technical question to investigate: Jekyll can redirect within its own site; can it also do external redirects, or do we need another tool?

Approved by the FS API team.

Discussion 14 JUN 2022:

  • Need to explore how to do redirects

Example:
A request for https://gedcom.io/exid-type/FamilySearch-PersonId/ABCD-123 should invoke a script that handles redirects for everything underneath https://gedcom.io/exid-type/FamilySearch-PersonId/; this particular request might initially redirect to, say, https://familysearch.org/platform/tree/persons/ABCD-123
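
The mapping such a script would perform can be sketched in Python as below. This is only an illustration of the prefix rewrite, not how gedcom.io implements it; the `REDIRECTS` table and the function `redirect_target` are hypothetical, and the target URL is the tentative one from the example.

```python
from typing import Optional

GEDCOM_IO_PREFIX = "https://gedcom.io/exid-type/"

# Hypothetical table: one entry per EXID-TYPE category.
REDIRECTS = {
    "FamilySearch-PersonId": "https://familysearch.org/platform/tree/persons/",
}


def redirect_target(url: str) -> Optional[str]:
    """Map a gedcom.io exid-type URL to its redirect target, if any."""
    if not url.startswith(GEDCOM_IO_PREFIX):
        return None
    # Split "<category>/<identifier>" at the first slash.
    category, _, identifier = url[len(GEDCOM_IO_PREFIX):].partition("/")
    base = REDIRECTS.get(category)
    if base is None or not identifier:
        return None
    return base + identifier


target = redirect_target("https://gedcom.io/exid-type/FamilySearch-PersonId/ABCD-123")
print(target)
```

Keeping the targets in one table means that if FamilySearch changes its API URIs later, only the table changes and the gedcom.io TYPE values stay stable.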

@clarkegj to check with Jimmy to see how to do a script to redirect in Jekyll

I think this issue has been resolved by #176 and FamilySearch/GEDCOM.io#43; I'm moving the redirect issue to FamilySearch/GEDCOM.io#55, as it is no longer about the spec or correctness but just about the gedcom.io website.