ubleipzig/intermediateschema

add property for original record identifier

zazi opened this issue · 1 comment

zazi commented

Currently, the intermediate schema has only one field for a record identifier, "finc.record_id". span-export maps this property to the field "id" in the finc Solr schema (https://github.com/miku/span/blob/a05d018c9174fd939cbbd9dacb36f80273ea04dd/formats/finc/solr.go#L86). As a result, we cannot store the original record identifier in "finc.record_id", because the field "id" in the finc Solr schema must follow a fixed pattern: [PREFIX]-[SOURCE-ID]-[BASE64-ENCODED-ORIGINAL-RECORD-ID]. An additional property for the original record identifier is therefore required, so that span-export can map it to the field "record_id" in the finc Solr schema.
I propose using the field "finc.record_id" in the intermediate schema for the original record identifier, and adding a new field, e.g. "id" or "finc.id", for the processed record identifier, which follows the pattern [PREFIX]-[SOURCE-ID]-[BASE64-ENCODED-ORIGINAL-RECORD-ID]. span-export can then map this new property to the field "id" in the finc Solr schema, and "finc.record_id" to the field "record_id". That way the properties would be consistent across both schemata (intermediate schema and finc Solr schema).
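
To illustrate the proposal, here is a minimal Go sketch of how the two identifiers could sit side by side. It is not taken from span: the `Record` struct, the `processedID` helper, the "finc.source_id" tag, the "ai" prefix, the sample values and the choice of unpadded URL-safe Base64 are all assumptions made for the example.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// Record is a hypothetical, trimmed-down intermediate schema document.
// Only the identifier-related fields discussed in this issue are shown.
type Record struct {
	SourceID string `json:"finc.source_id"` // assumed field name
	RecordID string `json:"finc.record_id"` // original record identifier (proposed)
	ID       string `json:"finc.id"`        // processed record identifier (proposed)
}

// processedID builds [PREFIX]-[SOURCE-ID]-[BASE64-ENCODED-ORIGINAL-RECORD-ID].
// The Base64 variant (URL-safe, no padding) is an assumption.
func processedID(prefix, sourceID, originalID string) string {
	encoded := base64.RawURLEncoding.EncodeToString([]byte(originalID))
	return fmt.Sprintf("%s-%s-%s", prefix, sourceID, encoded)
}

func main() {
	// Sample values are made up for illustration.
	r := Record{SourceID: "49", RecordID: "10.1000/example.123"}
	r.ID = processedID("ai", r.SourceID, r.RecordID)

	fmt.Println("finc.id        ->", r.ID)       // would map to Solr "id"
	fmt.Println("finc.record_id ->", r.RecordID) // would map to Solr "record_id"
}
```

With this split, the export step would no longer need to derive the Solr "id" from the original identifier; it could copy both values through unchanged.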

zazi commented

see #8