keeps/commons-ip

CSIP88 Requirement Failure in Commons IP When No Preservation Metadata Provided

JohannesKarlsen99 opened this issue · 6 comments

Hello,

I am currently using Commons IP to generate E-ARK packages and have run into an issue with the CSIP88 requirement: validation fails when I provide only metadata/other and no preservation metadata. Is this behavior expected?

{
    "specification" : "CSIP-2.1.0",
    "id" : "CSIP88",
    "name" : "Metadata division",
    "location" : "mets/structMap[@LABEL='CSIP']/div/div[@LABEL='Metadata']",
    "description" : "The metadata referenced in the administrative and/or descriptive metadata section is described in the structural map with one sub division.When the transfer consists of only administrative and/or descriptive metadata this is the only sub division that occurs.",
    "cardinality" : "1..1",
    "level" : "MUST",
    "testing" : {
      "outcome" : "FAILED",
      "issues" : [ "You have metadata files, must add mets/structMap[@LABEL='CSIP']/div/div[@LABEL='Metadata'] in TEST/representations/rep-20240301t024341/METS.xml", "You have metadata files, must add mets/structMap[@LABEL='CSIP']/div/div[@LABEL='Metadata'] in TEST/representations/rep-20240301t023302/METS.xml" ],
      "warnings" : [ ],
      "notes" : [ ]
    }
}
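For anyone triaging a longer report, the failing entries can be pulled out with a few lines of Python. This is only a minimal sketch: it assumes the requirement entries have been collected into a JSON list (the overall layout of the report file is not shown here), and `failed_requirements` is a hypothetical helper, not part of commons-ip.

```python
import json


def failed_requirements(entries):
    """Return (id, issues) pairs for every entry whose outcome is FAILED."""
    return [
        (entry["id"], entry["testing"]["issues"])
        for entry in entries
        if entry.get("testing", {}).get("outcome") == "FAILED"
    ]


# Trimmed copy of the CSIP88 entry above, wrapped in a list for illustration.
# The list wrapper is an assumption about how entries are grouped.
report_text = """
[
  {
    "id": "CSIP88",
    "level": "MUST",
    "testing": {
      "outcome": "FAILED",
      "issues": [
        "You have metadata files, must add mets/structMap[@LABEL='CSIP']/div/div[@LABEL='Metadata'] in TEST/representations/rep-20240301t024341/METS.xml"
      ]
    }
  }
]
"""

failures = failed_requirements(json.loads(report_text))
for requirement_id, issues in failures:
    print(requirement_id)
    for issue in issues:
        print("  -", issue)
```

Each issue string names the METS.xml file it refers to, so the output shows which files to inspect.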

Hello, please send an example where this issue is reproducible.

Here is an example of the output from commons-ip. This package fails the CSIP88 validation.

TESTEARK
├── METS.xml
├── metadata
│   └── descriptive
│       └── MODS.xml
├── representations
│   ├── rep-20240313t080742
│   │   ├── METS.xml
│   │   ├── data
│   │   │   └── WAVEFILE.wav
│   │   └── metadata
│   │       └── other
│   │           ├── Mediainfo
│   │           │   ├── MEDIAINFO_WAVEFILE.wav.json
│   │           │   └── MEDIAINFO_WAVEFILE.wav.xml
│   │           └── Siegfried
│   │               └── SIEGFRIED_WAVEFILE.wav.json
│   └── rep-20240313t080830
│       ├── METS.xml
│       ├── data
│       │   └── WAVFILE2.wav
│       └── metadata
│           └── other
│               ├── Mediainfo
│               │   ├── MEDIAINFO_WAVEFILE2.wav.json
│               │   └── MEDIAINFO_WAVEFILE2.wav.xml
│               └── Siegfried
│                   └── SIEGFRIED_WAVEFILE2.wav.json
└── schemas
    ├── DILCISExtensionMETS.xsd
    ├── DILCISExtensionSIPMETS.xsd
    ├── mediainfo_2_0.xsd
    ├── mets1_12.xsd
    ├── mods-v3-8.xsd
    └── xlink.xsd

Hi @JohannesKarlsen99, the validation report points to an issue in the METS.xml itself: metadata files were found, but they are not referenced in the structMap of the manifest. How did you construct the SIP and METS.xml? Are you sure the METS file is complete?

I used commons-ip to generate the package.

Here is the root METS.xml:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<mets xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:sip="https://DILCIS.eu/XML/METS/SIPExtensionMETS" xmlns="http://www.loc.gov/METS/" xmlns:csip="https://DILCIS.eu/XML/METS/CSIPExtensionMETS" xmlns:xlink="http://www.w3.org/1999/xlink" OBJID="TESTEARK" LABEL="test description" TYPE="Textual works – Print" csip:CONTENTINFORMATIONTYPE="OTHER" csip:OTHERCONTENTINFORMATIONTYPE="test" PROFILE="https://earksip.dilcis.eu/profile/E-ARK-SIP.xml" xsi:schemaLocation="http://www.loc.gov/METS/ schemas/mets1_12.xsd http://www.w3.org/1999/xlink schemas/xlink.xsd https://dilcis.eu/XML/METS/CSIPExtensionMETS schemas/DILCISExtensionMETS.xsd https://dilcis.eu/XML/METS/SIPExtensionMETS schemas/DILCISExtensionSIPMETS.xsd">
    <metsHdr CREATEDATE="2024-03-19T10:36:45.386+01:00" LASTMODDATE="2024-03-19T10:36:45.386+01:00" RECORDSTATUS="NEW" csip:OAISPACKAGETYPE="SIP">
        <agent ROLE="CREATOR" TYPE="OTHER" OTHERTYPE="SOFTWARE">
            <name>COMMONS</name>
            <note csip:NOTETYPE="SOFTWARE VERSION">1.0.0</note>
        </agent>
        <agent ROLE="CREATOR" TYPE="ORGANIZATION">
            <name>TEST</name>
            <note csip:NOTETYPE="IDENTIFICATIONCODE"/>
        </agent>
    </metsHdr>
    <dmdSec ID="uuid-B5DB3BBE-1682-4F94-AD82-84A118CC8FBE" CREATED="2024-03-19T10:36:45.388+01:00" STATUS="CURRENT">
        <mdRef ID="ID-uuid-952943F2-3AB8-415E-87BA-D4E96536C45A" LOCTYPE="URL" MDTYPE="MODS" xlink:type="simple" xlink:href="metadata/descriptive/MODS.xml" MIMETYPE="application/xml" SIZE="1610" CREATED="2024-03-19T10:36:45.388+01:00" CHECKSUM="2D50BD85F1244B0C80097EB544922E3C" CHECKSUMTYPE="MD5"/>
    </dmdSec>
    <amdSec ID="uuid-FBFFB79F-F62F-447E-B59E-B3BC3883028F"/>
    <fileSec ID="uuid-F6038BCA-B2B9-4BAB-9C46-066A774661FD">
        <fileGrp ID="uuid-987A9354-260A-424D-85A7-BF4C124A7380" USE="Schemas">
            <file ID="ID-BD62CE9A-AE99-4545-B7E8-3E18FAF41FEB" MIMETYPE="application/octet-stream" SIZE="60075" CREATED="2024-03-19T10:36:45.393+01:00" CHECKSUM="78EB41B2D073B9DE9B797FC1B20D529B" CHECKSUMTYPE="MD5">
                <FLocat xlink:type="simple" xlink:href="schemas/mods-v3-8.xsd" LOCTYPE="URL"/>
            </file>
            <file ID="ID-566B12E0-747B-473C-AE75-3694137DA699" MIMETYPE="application/octet-stream" SIZE="79352" CREATED="2024-03-19T10:36:45.393+01:00" CHECKSUM="834D01BC6666FBF2ACDD703628E69386" CHECKSUMTYPE="MD5">
                <FLocat xlink:type="simple" xlink:href="schemas/mediainfo_2_0.xsd" LOCTYPE="URL"/>
            </file>
            <file ID="ID-8422359C-E9AD-4426-848F-335BD256E370" MIMETYPE="application/octet-stream" SIZE="2038" CREATED="2024-03-19T10:36:45.393+01:00" CHECKSUM="EB72EF8AB5B1C93801DFACBFE6AA8E27" CHECKSUMTYPE="MD5">
                <FLocat xlink:type="simple" xlink:href="schemas/DILCISExtensionMETS.xsd" LOCTYPE="URL"/>
            </file>
            <file ID="ID-B90D124E-8DDE-43FB-8894-E503323503BD" MIMETYPE="application/octet-stream" SIZE="499" CREATED="2024-03-19T10:36:45.393+01:00" CHECKSUM="83DA1FF6F35ADEECE3CCCFB5E2E9F83A" CHECKSUMTYPE="MD5">
                <FLocat xlink:type="simple" xlink:href="schemas/DILCISExtensionSIPMETS.xsd" LOCTYPE="URL"/>
            </file>
            <file ID="ID-69402447-3395-482F-893D-5CACE5D6F757" MIMETYPE="application/octet-stream" SIZE="137125" CREATED="2024-03-19T10:36:45.393+01:00" CHECKSUM="0504DEDC1251E87D7E85F9FF2DBADC0D" CHECKSUMTYPE="MD5">
                <FLocat xlink:type="simple" xlink:href="schemas/mets1_12.xsd" LOCTYPE="URL"/>
            </file>
            <file ID="ID-9787B263-A9B2-437C-903C-900B9B6C9558" MIMETYPE="application/octet-stream" SIZE="3180" CREATED="2024-03-19T10:36:45.393+01:00" CHECKSUM="6BDC7F9459A502964F889D70A335CECE" CHECKSUMTYPE="MD5">
                <FLocat xlink:type="simple" xlink:href="schemas/xlink.xsd" LOCTYPE="URL"/>
            </file>
        </fileGrp>
        <fileGrp ID="uuid-85DDE613-325C-4209-9AE6-3A7382B4CD91" USE="Representations/rep-20240313t080742">
            <file ID="ID-05974261-3D23-4A4E-B43F-20E4D9C437E5" MIMETYPE="application/xml" SIZE="5076" CREATED="2024-03-19T10:36:49.617+01:00" CHECKSUM="22E124325CF7525AE5E13DE0315C2726" CHECKSUMTYPE="MD5">
                <FLocat xlink:type="simple" xlink:href="representations/rep-20240313t080742/METS.xml" LOCTYPE="URL"/>
            </file>
        </fileGrp>
        <fileGrp ID="uuid-3BA80BBA-8981-464E-AC0F-070CD8C047E3" USE="Representations/rep-20240313t080830">
            <file ID="ID-0A97CB41-59EC-4B4D-9978-A67442A30EC0" MIMETYPE="application/xml" SIZE="5624" CREATED="2024-03-19T10:36:53.837+01:00" CHECKSUM="B4E0D130898212384AD146F174F3068A" CHECKSUMTYPE="MD5">
                <FLocat xlink:type="simple" xlink:href="representations/rep-20240313t080830/METS.xml" LOCTYPE="URL"/>
            </file>
        </fileGrp>
    </fileSec>
    <structMap ID="uuid-33AC893C-E59E-4AAD-84BE-BAE36C4CBFE7" TYPE="PHYSICAL" LABEL="CSIP">
        <div ID="uuid-4FC98CEE-F9C2-4F89-94A0-69850F8CCEE9" LABEL="TESTEARK">
            <div ID="uuid-D48E13DF-F41A-4108-BA93-EEC665EC947C" DMDID="uuid-B5DB3BBE-1682-4F94-AD82-84A118CC8FBE" LABEL="Metadata"/>
            <div ID="uuid-A32DBB93-88CA-4089-A633-EFB94CC20979" LABEL="Schemas">
                <fptr FILEID="uuid-987A9354-260A-424D-85A7-BF4C124A7380"/>
            </div>
            <div ID="uuid-7F26BF7E-9B02-44E2-B1A7-EA7BC1F88D50" LABEL="Representations/rep-20240313t080742">
                <mptr xlink:type="simple" xlink:href="representations/rep-20240313t080742/METS.xml" xlink:title="uuid-85DDE613-325C-4209-9AE6-3A7382B4CD91" LOCTYPE="URL"/>
            </div>
            <div ID="uuid-D69274AC-47B2-4537-80AA-6F666693799B" LABEL="Representations/rep-20240313t080830">
                <mptr xlink:type="simple" xlink:href="representations/rep-20240313t080830/METS.xml" xlink:title="uuid-3BA80BBA-8981-464E-AC0F-070CD8C047E3" LOCTYPE="URL"/>
            </div>
        </div>
    </structMap>
</mets>

And here is one of the representation METS.xml files:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<mets xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:sip="https://DILCIS.eu/XML/METS/SIPExtensionMETS" xmlns="http://www.loc.gov/METS/" xmlns:csip="https://DILCIS.eu/XML/METS/CSIPExtensionMETS" xmlns:xlink="http://www.w3.org/1999/xlink" OBJID="rep-20240313t080742" LABEL="" TYPE="Textual works – Print" csip:CONTENTINFORMATIONTYPE="OTHER" csip:OTHERCONTENTINFORMATIONTYPE="test" PROFILE="https://earksip.dilcis.eu/profile/E-ARK-SIP.xml" xsi:schemaLocation="http://www.loc.gov/METS/ ../../schemas/mets1_12.xsd http://www.w3.org/1999/xlink ../../schemas/xlink.xsd https://dilcis.eu/XML/METS/CSIPExtensionMETS ../../schemas/DILCISExtensionMETS.xsd https://dilcis.eu/XML/METS/SIPExtensionMETS ../../schemas/DILCISExtensionSIPMETS.xsd">
    <metsHdr CREATEDATE="2024-03-19T10:36:45.389+01:00" LASTMODDATE="2024-03-19T10:36:45.389+01:00" RECORDSTATUS="NEW" csip:OAISPACKAGETYPE="SIP"/>
    <dmdSec ID="uuid-611F6C02-13E5-46D1-9E7A-DEF169AE72E4" CREATED="2024-03-19T10:36:45.390+01:00" STATUS="CURRENT">
        <mdRef ID="ID-uuid-C480B412-3658-4D43-8A78-1AE1DC31BD44" LOCTYPE="URL" MDTYPE="OTHER" OTHERMDTYPE="uuid-C480B412-3658-4D43-8A78-1AE1DC31BD44" xlink:type="simple" xlink:href="metadata/other/Mediainfo/MEDIAINFO_WAVFILE.wav.json" MIMETYPE="application/json" SIZE="4751" CREATED="2024-03-19T10:36:45.390+01:00" CHECKSUM="9608C5D50F94DCBFC197C5C1566F86F7" CHECKSUMTYPE="MD5"/>
    </dmdSec>
    <dmdSec ID="uuid-B2A9556B-CF60-490A-BFE9-4A229710F8C8" CREATED="2024-03-19T10:36:45.390+01:00" STATUS="CURRENT">
        <mdRef ID="ID-uuid-0ED0954E-B3D3-4C82-A3DC-FC18BC3CB0C8" LOCTYPE="URL" MDTYPE="OTHER" OTHERMDTYPE="uuid-0ED0954E-B3D3-4C82-A3DC-FC18BC3CB0C8" xlink:type="simple" xlink:href="metadata/other/Mediainfo/MEDIAINFO_WAVFILE.wav.xml" MIMETYPE="application/xml" SIZE="5999" CREATED="2024-03-19T10:36:45.390+01:00" CHECKSUM="FF23B81185D89A8D761E69764E961190" CHECKSUMTYPE="MD5"/>
    </dmdSec>
    <dmdSec ID="uuid-6ACE9065-2D31-4B27-AC97-2C1BCB18D4A5" CREATED="2024-03-19T10:36:45.391+01:00" STATUS="CURRENT">
        <mdRef ID="ID-uuid-EC55083A-E004-41F9-AD14-AD685C976853" LOCTYPE="URL" MDTYPE="OTHER" OTHERMDTYPE="uuid-EC55083A-E004-41F9-AD14-AD685C976853" xlink:type="simple" xlink:href="metadata/other/Siegfried/SIEGFRIED_WAVFILE.wav.json" MIMETYPE="application/json" SIZE="726" CREATED="2024-03-19T10:36:45.391+01:00" CHECKSUM="F8939FE35F521940ABAD1E59768EE625" CHECKSUMTYPE="MD5"/>
    </dmdSec>
    <amdSec ID="uuid-E1B575CF-9901-424F-8761-605CAA014487"/>
    <fileSec ID="uuid-EFE25DCF-665D-434E-8258-65A2C3176CE2">
        <fileGrp ID="uuid-1927EC8C-E96E-45E5-AB39-E62DC1782B9E" USE="Data">
            <file ID="ID-9D669E45-2D74-4D2B-A607-85067750C8E3" MIMETYPE="application/octet-stream" SIZE="173987064" CREATED="2024-03-19T10:36:45.389+01:00" CHECKSUM="9F1A9015219A0758A18E1E45215AA7AB" CHECKSUMTYPE="MD5">
                <FLocat xlink:type="simple" xlink:href="data/WAVFILE.wav" LOCTYPE="URL"/>
            </file>
        </fileGrp>
    </fileSec>
    <structMap ID="uuid-C9508F07-5A72-4964-8773-EF41843FC02D" TYPE="PHYSICAL" LABEL="CSIP">
        <div ID="uuid-0A8F5EE2-6FAC-49D5-97C4-78F6B8C3FCD9" TYPE="ORIGINAL" LABEL="rep-20240313t080742">
            <div ID="uuid-36FC3E9A-09BE-4D8B-89CA-100C73BB0BD3" DMDID="uuid-A3823D3C-E6AB-4AF0-835E-70DCF4A37E47 uuid-611F6C02-13E5-46D1-9E7A-DEF169AE72E4 uuid-B2A9556B-CF60-490A-BFE9-4A229710F8C8 uuid-9929E83C-8C9C-4E9E-8552-47260A8FE4A9 uuid-6ACE9065-2D31-4B27-AC97-2C1BCB18D4A5" LABEL="Metadata/Other"/>
            <div ID="uuid-F183F686-C2F1-491C-A4BF-223FADCBE44F" LABEL="Data">
                <fptr FILEID="uuid-1927EC8C-E96E-45E5-AB39-E62DC1782B9E"/>
            </div>
        </div>
    </structMap>
</mets>
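The location named in the report can be tested directly against each METS.xml. Below is a minimal sketch using only Python's standard library. The two inline documents are trimmed copies of the structMap sections posted above (IDs and other attributes dropped), and `has_metadata_division` is a hypothetical helper, not part of commons-ip. It evaluates only the path from the report and may not reflect everything the validator checks.

```python
import xml.etree.ElementTree as ET

# METS default namespace, as declared in the files above.
NS = {"mets": "http://www.loc.gov/METS/"}

# Path from the CSIP88 report, relative to the <mets> root element.
METADATA_DIV = "mets:structMap[@LABEL='CSIP']/mets:div/mets:div[@LABEL='Metadata']"


def has_metadata_division(mets_xml: str) -> bool:
    """True if the CSIP structMap has a sub-division labelled exactly 'Metadata'."""
    root = ET.fromstring(mets_xml)
    return root.find(METADATA_DIV, NS) is not None


# Trimmed from the root METS.xml above.
ROOT_METS = """<mets xmlns="http://www.loc.gov/METS/">
  <structMap TYPE="PHYSICAL" LABEL="CSIP">
    <div LABEL="TESTEARK">
      <div LABEL="Metadata"/>
      <div LABEL="Schemas"/>
    </div>
  </structMap>
</mets>"""

# Trimmed from the representation METS.xml above.
REPRESENTATION_METS = """<mets xmlns="http://www.loc.gov/METS/">
  <structMap TYPE="PHYSICAL" LABEL="CSIP">
    <div TYPE="ORIGINAL" LABEL="rep-20240313t080742">
      <div LABEL="Metadata/Other"/>
      <div LABEL="Data"/>
    </div>
  </structMap>
</mets>"""

root_ok = has_metadata_division(ROOT_METS)
representation_ok = has_metadata_division(REPRESENTATION_METS)
print("root METS.xml:", root_ok)
print("representation METS.xml:", representation_ok)
```

With these inputs the root structMap matches, while the representation structMap, whose division is labelled `Metadata/Other`, does not. That is consistent with the report naming only the representation METS.xml files, although it does not by itself confirm the cause.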

Thank you. To make our job easier, could you provide SIPs (with mocked data) that can be used to test the issue?

I have uploaded a SIP with mocked data that can be used to reproduce the issue. You can find it at the following link: TESTEARK.zip
Please let me know if there's anything else you need from my end or if further clarification is required regarding the SIP.