inspirehep/inspire-dojson

dojson: more robust external_system_identifiers


The current implementation of external_system_identifier has some shortcomings:

  • It currently considers only the first 035__a and 035__9, losing any additional values.
  • If the value is not set, it still adds an entry containing only the schema, producing an invalid record.
  • When exporting back to MARC, it assigns values to $a or $z based only on a whitelist of schemas.

For example, record https://inspirehep.net/record/700376 contains:

<datafield tag="035" ind1=" " ind2=" ">
<subfield code="9">OSTI</subfield>
<subfield code="a">892532</subfield>
</datafield>
<datafield tag="035" ind1=" " ind2=" ">
<subfield code="9">OSTI</subfield>
<subfield code="z">897192</subfield>
</datafield>

The second OSTI identifier is currently lost.

We discussed this recently during the content standup. We decided to preserve the hidden identifiers (those with a $z instead of an $a), but not display them. So we could follow the same strategy as for texkeys: the first identifier with a given schema is displayed, the others are present but not shown.
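
A minimal sketch of one way to read this strategy, not the actual inspire-dojson rules. It assumes each 035 field arrives as a plain dict mapping subfield codes to a value or a list of values; the function names (`identifiers_from_035`, `displayed`, `identifiers_to_035`) are hypothetical. All $a values are placed before all $z values, so that the first entry per schema is the one to display, and entries without a schema or a value are skipped.

```python
def _as_list(value):
    """Normalize a subfield value (missing, scalar, or sequence) to a list."""
    if value is None:
        return []
    if isinstance(value, (list, tuple)):
        return list(value)
    return [value]


def identifiers_from_035(fields):
    """Build the identifier list from 035 fields (assumed to be plain dicts).

    Every $a and $z value is kept, not only the first one. Fields that lack
    a schema ($9) or a value produce no entry at all.
    """
    visible, hidden = [], []
    for field in fields:
        for schema in _as_list(field.get('9')):
            for code, bucket in (('a', visible), ('z', hidden)):
                for value in _as_list(field.get(code)):
                    if not value:
                        continue  # never emit a schema-only entry
                    entry = {'schema': schema, 'value': value}
                    if entry not in visible and entry not in hidden:
                        bucket.append(entry)
    # Visible identifiers first, so the first one per schema is displayed.
    return visible + hidden


def displayed(identifiers):
    """Return only the first identifier of each schema."""
    shown, seen = [], set()
    for entry in identifiers:
        if entry['schema'] not in seen:
            seen.add(entry['schema'])
            shown.append(entry)
    return shown


def identifiers_to_035(identifiers):
    """Export without a schema whitelist: first per schema to $a, rest to $z."""
    fields, seen = [], set()
    for entry in identifiers:
        code = 'z' if entry['schema'] in seen else 'a'
        seen.add(entry['schema'])
        fields.append({'9': entry['schema'], code: entry['value']})
    return fields


# The two OSTI fields from record 700376, written as plain dicts.
fields = [{'9': 'OSTI', 'a': '892532'}, {'9': 'OSTI', 'z': '897192'}]
identifiers = identifiers_from_035(fields)
```

With this ordering both OSTI values survive, only 892532 is displayed, and exporting reproduces the original $a/$z split. One known limitation of the sketch: a schema that has only $z values would have its first value displayed and exported as $a, so a real implementation might need an explicit marker for hidden identifiers.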