metafacture/metafacture-core

Counted leader elements in marc when encoding to marc

TobiasNx opened this issue · 11 comments

@maipet hinted that encode-marc21 or encode-marcxml cannot create the leader correctly since the elements are not counted.
Could you elaborate on the problem?

Is this related to #454?

I am not sure. We are transforming the OERSI JSON data to MARC, and @maipet told me that the transformation produces invalid results because the leader lacks the counted elements that state, e.g., the length of a record.

But @maipet could clarify.

You can set the leader field, but shouldn't the leader's "Character Positions 00-04 - Record length" and "Pos. 12-16 - Base address of data" be generated automatically? We discussed with @dr0i that we should first check whether the MARC records from OERSI are 'valid' even without the correct information in the leader (these positions are currently filled with zeros).
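To illustrate what "generated automatically" would involve, here is a minimal sketch of how the two counted positions follow from the binary MARC 21 (ISO 2709) record layout. The function name and the way fields are passed in are assumptions made for illustration; this is not Metafacture's implementation.

```python
from typing import Sequence

LEADER_LENGTH = 24           # a MARC 21 leader is always 24 characters
DIRECTORY_ENTRY_LENGTH = 12  # tag (3) + field length (4) + start position (5)
FIELD_TERMINATOR = b"\x1e"
RECORD_TERMINATOR = b"\x1d"


def fill_counted_positions(leader: str, fields: Sequence[bytes]) -> str:
    """Return the leader with positions 00-04 and 12-16 recomputed.

    Each item in fields is the encoded content of one field (indicators
    and subfields included) WITHOUT its trailing field terminator.
    Hypothetical helper, not part of Metafacture.
    """
    if len(leader) != LEADER_LENGTH:
        raise ValueError("leader must be exactly 24 characters long")
    # Leader + one directory entry per field + the terminator closing the directory.
    base_address = (LEADER_LENGTH
                    + DIRECTORY_ENTRY_LENGTH * len(fields)
                    + len(FIELD_TERMINATOR))
    # Every field is followed by exactly one field terminator.
    data_length = sum(len(field) + len(FIELD_TERMINATOR) for field in fields)
    record_length = base_address + data_length + len(RECORD_TERMINATOR)
    if record_length > 99999:
        raise ValueError("record length does not fit into five digits")
    # Keep positions 05-11 and 17-23, replace only the counted positions.
    return f"{record_length:05d}{leader[5:12]}{base_address:05d}{leader[17:]}"


# Two made-up fields: a control field and a data field with one subfield.
example_fields = [b"123456", b"  \x1faTitle"]
example_leader = fill_counted_positions("00000naa a2200000uc 4500", example_fields)
```

Note that the lengths are byte counts, so the fields have to be encoded (e.g. as UTF-8) before counting.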

While looking into a workaround for #454, I saw that the marc21 encoder seems to have a mechanism for that:

https://metafacture.org/playground/?flux=%22https%3A//d-nb.info/1106253078/about/marcxml%22%0A%7C+open-http%28accept%3D%22application/xml%22%29%0A%7C+decode-xml%0A%7C+handle-marcxml%0A%7C+fix%28transformationFile%29%0A%7C+encode-marc21%0A%7C+decode-marc21%28emitLeaderAsWhole%3D%22true%22%29%0A%7C+encode-yaml%0A%7C+print%0A%3B&transformation=copy_field%28%22leader%22%2C%22@leader.status%22%29%0Acopy_field%28%22leader%22%2C%22@leader.type%22%29%0Acopy_field%28%22leader%22%2C%22@leader.bibliographicLevel%22%29%0Acopy_field%28%22leader%22%2C%22@leader.typeOfControl%22%29%0Acopy_field%28%22leader%22%2C%22@leader.characterCodingScheme%22%29%0Acopy_field%28%22leader%22%2C%22@leader.encodingLevel%22%29%0Acopy_field%28%22leader%22%2C%22@leader.catalogingForm%22%29%0Acopy_field%28%22leader%22%2C%22@leader.multipartLevel%22%29%0A%0Asubstring%28%22@leader.status%22%2C%225%22%2C%221%22%29%0Asubstring%28%22@leader.type%22%2C%226%22%2C%221%22%29%0Asubstring%28%22@leader.bibliographicLevel%22%2C%227%22%2C%221%22%29%0Asubstring%28%22@leader.typeOfControl%22%2C%228%22%2C%221%22%29%0Asubstring%28%22@leader.characterCodingScheme%22%2C%229%22%2C%221%22%29%0Asubstring%28%22@leader.encodingLevel%22%2C%2217%22%2C%221%22%29%0Asubstring%28%22@leader.catalogingForm%22%2C%2218%22%2C%221%22%29%0Asubstring%28%22@leader.multipartLevel%22%2C%2219%22%2C%221%22%29%0A%0Amove_field%28%22@leader%22%2C%22leader%22%29

Someone more experienced should have a look to confirm. We could probably reuse parts of encode-marc21 for encode-marcxml.

dr0i commented

The construction of the leader (the byte-counting magic, including indicators etc.) is done by invoking Marc21Decoder.java (which calls Record.java). Code can be reused for encode-marc21, although it's ugly from a performance point of view (the whole record has to be turned into type Record once the record has been parsed).
This will be done in my PR treating #454.

Code can be reused for encode-marc21

@dr0i: Isn't encode-marc21 already doing this? See: #524 (comment)

dr0i commented

Functional review: @TobiasNx and @maipet.
Deployed to test-Playground metafacture-framework feature-454-allowMarc21EncoderToGetLeaderAsOneString-SNAPSHOT.

Note that the generated leader is 02934naa a2200649uc 4500 while the original input was
<leader>00000naa a2200000uc 4500</leader>. So the leader seems to be correct: the record size and the other counted parts are filled in, while the type etc. are preserved.
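As a quick plausibility check of the two leaders quoted above, the counted positions can be read back out. The sketch below is an illustration only; the function name is made up, and it assumes the standard ISO 2709 layout (24-character leader, 12-character directory entries, one terminator after the directory).

```python
from typing import Optional, Tuple


def counted_positions(leader: str) -> Tuple[int, int, Optional[int]]:
    """Return (record length, base address, number of directory entries).

    The entry count is None when the base address cannot belong to a
    well-formed directory, e.g. when it is still zero-filled.
    Hypothetical helper, not part of Metafacture.
    """
    if len(leader) != 24:
        raise ValueError("leader must be exactly 24 characters long")
    record_length = int(leader[0:5])   # positions 00-04
    base_address = int(leader[12:17])  # positions 12-16
    # Base address = 24 (leader) + 12 per directory entry + 1 (terminator).
    entries, remainder = divmod(base_address - 25, 12)
    if base_address < 25 or remainder != 0:
        return record_length, base_address, None
    return record_length, base_address, entries


generated = counted_positions("02934naa a2200649uc 4500")
original = counted_positions("00000naa a2200000uc 4500")
```

For the generated leader this gives a record length of 2934 bytes, a base address of 649 and 52 directory entries, which is at least internally consistent; the zero-filled input leader yields no usable entry count.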

Added my review here: #526 (comment)

One scenario is still not working; otherwise this seems to work for me. But @maipet has more knowledge about the leader.

It seems that this is not solved for encode-marcxml. The leader positions at the beginning (00-04) and in the middle (12-16) are still 00000.

Ahhh, I now see what the problem here is: encode-marcxml still lacks the ability to generate the counted leader info. I did not review this properly, sorry.

We decided with @maipet and @dr0i that MARCXML does not need to count. Instead, encode-marcxml should do one of two things: if the leader is provided as a whole, use the provided leader info (even if the record itself has changed); if the leader is only provided as separate elements, as emitted by decode-marc21, set leader positions 00-04 and 12-16 to zero.
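The agreed behaviour can be sketched roughly as follows. This is an illustration under assumptions, not the encoder's actual code: the function name is made up, the element names are taken from the Fix in the playground link above, and missing elements are assumed to default to a blank.

```python
from typing import Mapping, Optional


def marcxml_leader(whole: Optional[str] = None,
                   elements: Optional[Mapping[str, str]] = None) -> str:
    """Build the leader for MARCXML output without counting anything.

    Hypothetical helper, not part of Metafacture.
    """
    if whole is not None:
        # Leader provided as a whole: pass it through unchanged,
        # even if the record itself was modified.
        return whole
    parts = elements or {}

    def element(name: str) -> str:
        # Assumption: a missing element is written as a blank.
        return parts.get(name, " ")

    return (
        "00000"                             # 00-04 record length, not counted
        + element("status")                 # 05
        + element("type")                   # 06
        + element("bibliographicLevel")     # 07
        + element("typeOfControl")          # 08
        + element("characterCodingScheme")  # 09
        + "22"                              # 10-11 fixed in MARC 21
        + "00000"                           # 12-16 base address, not counted
        + element("encodingLevel")          # 17
        + element("catalogingForm")         # 18
        + element("multipartLevel")         # 19
        + "4500"                            # 20-23 fixed in MARC 21
    )


from_elements = marcxml_leader(elements={
    "status": "n", "type": "a", "bibliographicLevel": "a",
    "characterCodingScheme": "a", "encodingLevel": "u", "catalogingForm": "c",
})
from_whole = marcxml_leader(whole="02934naa a2200649uc 4500")
```

With the separate elements of the record discussed in this thread, the result is the zero-filled leader 00000naa a2200000uc 4500, while a leader passed as a whole comes back untouched.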

For further info see:
#527 (comment)