fordmadox/schematrons

unitdate can also be child of unittitle

Opened this issue · 3 comments

sdm7g commented

Most of our guides have unitdate as a child of unittitle.
This is both valid EAD and accepted/ingested by ArchivesSpace, but your schematron rules are flagging it as missing unitdate.
I'm just learning schematron, but I tried just adding that case to this test and it seems to work properly so far.

<assert
test="boolean(ead:unitdate//text()[normalize-space()][1]) or ead:unitdate/@normal or boolean(ead:unittitle/ead:unitdate//text()[normalize-space()][1]) or ead:unittitle/ead:unitdate/@normal"
>You must supply a date at the resource level</assert>

Or more concisely, this also seems to work:

 <assert
test="boolean((.|ead:unittitle)/ead:unitdate//text()[normalize-space()][1]) or (.|ead:unittitle)/ead:unitdate/@normal  "
 >You must supply a date at the resource level</assert>
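
As a rough illustration, these are the two did shapes that test is meant to accept. The title and date values are made up, and the fragments assume the ead prefix in the test is bound to the EAD 2002 namespace (urn:isbn:1-931666-22-9).

```xml
<!-- Shape 1: unitdate is a direct child of did, next to unittitle -->
<did xmlns="urn:isbn:1-931666-22-9">
  <unittitle>Correspondence</unittitle>
  <unitdate normal="1900/1910">1900-1910</unitdate>
</did>

<!-- Shape 2: unitdate is nested inside unittitle -->
<did xmlns="urn:isbn:1-931666-22-9">
  <unittitle>Correspondence, <unitdate normal="1900/1910">1900-1910</unitdate></unittitle>
</did>
```

With the did as the context node, (.|ead:unittitle)/ead:unitdate selects the unitdate in both shapes.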

[P.S. Did you ever figure out why smart quotes are giving you errors?]

I've since updated how the test works for unittitle and/or unitdate. I think that the following should work when the context is the "did" element of any component in the "dsc": https://github.com/fordmadox/schematrons/blob/master/ArchivesSpace-EAD-validator.sch#L80-L83

Also, I never did figure out why smart quotes were throwing errors, but I'd like to get to the bottom of that at some point! For now, I've just taken that rule out of the schematron file. I still get the error on our local instances of ASpace, but when testing on a new installation of version 1.3 on my laptop, I don't have any problems.

Have you found anything else that's caused trouble for importing EAD? Most recently, I discovered that the importer doesn't like EAD extent encoding that looks like this: <extent unit="extent_value">extent_number</extent>, so I added a few tests for that.
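
A minimal sketch of what such a test could look like, for extent elements that keep the unit in an attribute and only a number in their text. This is a hypothetical rule, not the one in the repository, and it assumes ISO Schematron with the EAD 2002 namespace.

```xml
<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <!-- Assumption: the EAD files use the EAD 2002 namespace -->
  <ns prefix="ead" uri="urn:isbn:1-931666-22-9"/>
  <pattern>
    <!-- Hypothetical rule: only looks at extents that carry a unit attribute -->
    <rule context="ead:extent[@unit]">
      <!-- number(x) = number(x) is true only when x parses as a number,
           because NaN never equals itself in XPath 1.0 -->
      <report test="number(normalize-space(.)) = number(normalize-space(.))">
        This extent keeps its unit in @unit and only a number in its text; the importer has been seen to reject this encoding.
      </report>
    </rule>
  </pattern>
</schema>
```

The report fires only when the text is purely numeric, so an extent whose text already spells out the unit is left alone.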

It sounds like I should still check how those unitdates are tested, though. Hopefully I can get to that tomorrow.

sdm7g commented

The latest validation issue I found:
We must have been using an EAD template at some time that had empty paragraphs to be filled in later:

<scopecontent><p /></scopecontent>

The AS import error message is:

 The following errors were found:      notes/0/subnotes/0/content : Property is required but was missing

I've added that case to my AS prep stylesheet:

<!-- ArchivesSpace seems to have trouble with empty anythings   -->    
<xsl:template match="@*[normalize-space()='']" />  <!-- don't copy null attributes -->
<xsl:template match="ead:unitdate[normalize-space()='']" />  <!-- don't copy empty unitdates -->  
<xsl:template match="ead:physloc[normalize-space()='']" />  <!-- don't copy empty physloc -->
<xsl:template match="ead:scopecontent[normalize-space()='']" />  <!-- don't copy empty scopecontent -->
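
For readers new to this pattern: empty-bodied templates like these only remove nodes when the rest of the stylesheet copies everything else through. A minimal sketch of that arrangement, assuming XSLT 1.0, an identity template, and the EAD 2002 namespace (the full prep stylesheet is not shown in this thread):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ead="urn:isbn:1-931666-22-9">

  <!-- Identity template: by default, copy every node and attribute unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Patterns with a predicate have a higher default priority than the
       identity template, so these empty templates win and the matched
       node is dropped from the output -->
  <xsl:template match="@*[normalize-space()='']"/>
  <xsl:template match="ead:scopecontent[normalize-space()='']"/>

</xsl:stylesheet>
```

Matching happens against the input tree, so a scopecontent whose only child is an empty p is removed as a whole.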

Although, since I wrote the initial version of that stylesheet, I believe they have fixed SOME of the "AS doesn't like empty elements" problems in the importer. I haven't gone back and retested them, or tried generalizing the rule beyond the elements where I found problems. In fact, on this latest one, I'm not sure whether the problem is specifically the empty <p/> element within the scopecontent or the absence of any text in the scopecontent. I'll try to test both.
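
One way to separate the two possibilities is to import a few small variants and see which ones trigger the error. The fragments below are only suggested test inputs; the head text is made up, and no import results are implied.

```xml
<!-- Case 1: the reported input, with no text anywhere in scopecontent -->
<scopecontent><p/></scopecontent>

<!-- Case 2: scopecontent contains text (in head), but the paragraph is still empty -->
<scopecontent><head>Scope and Contents</head><p/></scopecontent>

<!-- Case 3: the paragraph holds only whitespace -->
<scopecontent><p>   </p></scopecontent>
```

Note that the ead:scopecontent[normalize-space()=''] template removes cases 1 and 3 but leaves case 2 in place, since the head gives that scopecontent a non-empty string value.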