[BUG] datahub validation fails for v1.0-pre2
Describe the bug
The DataHUB validation pipeline fails for arcs created with https://github.com/nfdi4plants/ARCCommander/releases/download/v1.0.0-preview.2/arc_osx-x64
This is before running "metadata tests".
To Reproduce
arc_osx-x64 init
arc_osx-x64 assay add -s v1pre2-Study -a v1pre2-Assay
arc_osx-x64 i person register --lastname LastName --firstname FirstName --email email@nfdi4plants.org --affiliation DataPLANT
arc_osx-x64 investigation update -i v1pre2 --description "Description v1pre2" --title "Title v1pre2"
arc_osx-x64 sync -f -r https://git.nfdi4plants.org/<userName>/v1pre2 -m "v1pre2 test"
This does not happen with ARC commander v0.0.5 using the same commands above.
Ah okay I see. Then it is probably a mismatch between ARCCommander and validation pipeline version, as @omaus suggested.
The tests failed in such a way that no XML file was created. Maybe we could still add some mechanism for retrieving the reason for this in future cases, e.g. by wrapping the complete pipeline call in a try .. with?
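A minimal sketch of that idea, written in Python purely for illustration (arc-validate itself is F#, where the matching construct is try .. with). The names run_with_fallback and write_failure_report are hypothetical, and the JUnit layout is only an assumption about what a downstream step could read:

```python
import traceback
import xml.etree.ElementTree as ET


def write_failure_report(xml_path, message, details):
    """Write a one-test JUnit file that records an internal error (assumed layout)."""
    suite = ET.Element(
        "testsuite", name="arc-validate", tests="1", errors="1", failures="0"
    )
    case = ET.SubElement(suite, "testcase", name="internal error")
    error = ET.SubElement(case, "error", message=message)
    error.text = details  # full traceback, so the reason is not lost
    ET.ElementTree(suite).write(xml_path, encoding="utf-8", xml_declaration=True)


def run_with_fallback(pipeline, xml_path):
    """Run pipeline(); if it raises, store the reason in the results file."""
    try:
        pipeline()
        return 0
    except Exception as exc:
        write_failure_report(xml_path, str(exc), traceback.format_exc())
        return 1
```

With a wrapper like this, the results file exists even when the pipeline crashes, and the error text travels with it instead of staying in the job log.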
Here is the relevant output of the arc-validate command:
$ bash /opt/arc-validate/arc-validate.sh; ret=$?
+ arc-validate
Internal Error:
Cannot modify readonly container
" at System.IO.Packaging.Package.ThrowIfReadOnly()
at System.IO.Packaging.Package.CreatePart(Uri partUri, String contentType, CompressionOption compressionOption)
at DocumentFormat.OpenXml.Packaging.OpenXmlPackage.CreateMetroPart(Uri partUri, String contentType)
at DocumentFormat.OpenXml.Packaging.OpenXmlPart.CreateInternal(OpenXmlPackage openXmlPackage, OpenXmlPart parent, String contentType, String targetExt)
at DocumentFormat.OpenXml.Packaging.OpenXmlPartContainer.InitPart[T](T newPart, String contentType, String id)
at DocumentFormat.OpenXml.Packaging.OpenXmlPartContainer.InitPart[T](T newPart, String contentType)
at DocumentFormat.OpenXml.Packaging.OpenXmlPartContainer.AddNewPartInternal[T]()
at DocumentFormat.OpenXml.Packaging.OpenXmlPartContainer.AddNewPart[T]()
at FsSpreadsheet.ExcelIO.Spreadsheet.getOrInitSharedStringTablePart(SpreadsheetDocument spreadsheetDocument)
at FsSpreadsheet.ExcelIO.Spreadsheet.getCellsBySheet(Sheet sheet, SpreadsheetDocument spreadsheetDocument)
at FsSpreadsheet.ExcelIO.Spreadsheet.getCellsBySheetID(String sheetID, SpreadsheetDocument spreadsheetDocument)
at FsSpreadsheet.ExcelIO.FsExtensions.sheets@182.Invoke(Sheet xlsxSheet)
at Microsoft.FSharp.Collections.Internal.IEnumerator.map@99.DoMoveNext(b& curr) in D:\a\_work\1\s\src\FSharp.Core\seq.fs:line 102
at Microsoft.FSharp.Collections.Internal.IEnumerator.MapEnumerator`1.System.Collections.IEnumerator.MoveNext() in D:\a\_work\1\s\src\FSharp.Core\seq.fs:line 84
at Microsoft.FSharp.Collections.SeqModule.Fold[T,TState](FSharpFunc`2 folder, TState state, IEnumerable`1 source) in D:\a\_work\1\s\src\FSharp.Core\seq.fs:line 872
at FsSpreadsheet.ExcelIO.FsExtensions.FsWorkbook.fromXlsxFile.Static(String filePath)
at ArcValidation.Configs.ArcConfig.get_InvestigationStudies() in /opt/arc-validate/src/ArcValidation/Configs/ArcConfig.fs:line 27
at ArcValidation.Configs.ArcConfig.get_StudyPathsAndIds() in /opt/arc-validate/src/ArcValidation/Configs/ArcConfig.fs:line 33
at ArcValidation.TestGeneration.Critical.Arc.FileSystem.generateArcFileSystemTests(ArcConfig arcConfig) in /opt/arc-validate/src/ArcValidation/TestGeneration/Critical/ArcFileSystem.fs:line 18
at ARCValidate.main(String[] argv) in /opt/arc-validate/src/arc-validate/Program.fs:line 29"
This results in another error later on, since arc-validate did not create the arc-validate-results.xml:
$ /opt/arc-validate/create-badge.py
Traceback (most recent call last):
File "/opt/arc-validate/create-badge.py", line 9, in <module>
xml = JUnitXml.fromfile(xml_path)
File "/usr/local/lib/python3.9/dist-packages/junitparser/junitparser.py", line 751, in fromfile
tree = etree.parse(filepath) # nosec
File "/usr/lib/python3.9/xml/etree/ElementTree.py", line 1229, in parse
tree.parse(source, parser)
File "/usr/lib/python3.9/xml/etree/ElementTree.py", line 569, in parse
source = open(source, "rb")
FileNotFoundError: [Errno 2] No such file or directory: './arc-validate-results.xml'
I thought the arc-validate tool is supposed to always create that XML file in all cases, isn't it?
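Independently of that, the badge step could tolerate a missing results file. A rough sketch using only the Python standard library; this is not the actual create-badge.py (which uses junitparser), and load_results, badge_label and the label texts are hypothetical:

```python
import os
import xml.etree.ElementTree as ET


def load_results(xml_path):
    """Return the parsed results root, or None if the file is missing or unparsable."""
    if not os.path.isfile(xml_path):
        return None
    try:
        return ET.parse(xml_path).getroot()
    except ET.ParseError:
        return None


def badge_label(root):
    """Map a results root (or None) to a short badge text."""
    if root is None:
        return "validation error"
    # The root may be a single <testsuite> or a <testsuites> wrapper.
    suites = [root] if root.tag == "testsuite" else root.findall("testsuite")
    total = sum(int(s.get("tests", 0)) for s in suites)
    failed = sum(int(s.get("failures", 0)) + int(s.get("errors", 0)) for s in suites)
    return f"{total - failed}/{total} passed"
```

This way a crash in the validation step shows up as a distinct badge state rather than as a second traceback.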
Looks to me like the Investigation file is read-only. Could you check this, @Brilator? It might also be any of the Study files...
I thought the arc-validate tool is supposed to always create that XML file in all cases, isn't it?
'xactly.
If read-only is the cause of this, I'll keep it in mind for arc-validate V2.
@omaus can you do me a favor and check this with latest arc commander on windows using the commands above?
If it is read-only, that’s still an arc commander bug.
Not read-only @ Windows.
Then that's not the reason. Or did validation work?
I created a test repo in GitLab with the commands from above. None of the XLSX files were read-only, yet the pipeline did not work and printed the same error as above.
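For anyone who wants to repeat the read-only check on a local clone, here is a small sketch in Python. The function name find_readonly_xlsx is hypothetical, and it only inspects the owner-write bit of each file:

```python
import stat
from pathlib import Path


def find_readonly_xlsx(arc_root):
    """Return the .xlsx files under arc_root whose owner-write bit is cleared."""
    return sorted(
        path
        for path in Path(arc_root).rglob("*.xlsx")
        if not path.stat().st_mode & stat.S_IWUSR
    )
```

An empty result only rules out the file-system flag. The exception in the trace is raised inside System.IO.Packaging, so it may instead come from how the workbook is opened rather than from the file's permissions.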
@HLWeil Any ideas? It might be related to a newer FsSpreadsheet version and some changes in how XLSX files are read.
Still relevant for ARC Commander v1
Run the following to create a minimal ARC that should be valid for invenio.
mkdir arc-v1-test; cd arc-v1-test
arc init
arc assay add -s v1-test-Study -a v1-test-Assay
arc i person register --lastname TestLastName --firstname TestFirstName --email testmail@nfdi4plants.org --affiliation DataPLANT
arc i update -i v1-test --description "Description v1-test" --title "Title v1-test"
arc export
arc a list
arc s list
arc sync -f -r https://git.nfdi4plants.org/<>/v1-test -m "v1-test"
This fails during the "validate ARC" job with:
Running with gitlab-runner 16.2.1 (674e0e29)
on dataplant-runner-0 iAYwqpK5, system ID: r_RntxNI6dNOlh
Preparing the "docker" executor
00:02
Using Docker executor with image ghcr.io/nfdi4plants/arc-validate:main ...
Pulling docker image ghcr.io/nfdi4plants/arc-validate:main ...
Using docker image sha256:31c612d8a4cbd25d26e1ca5263e9699ecb41495a7b9014d96da9c176136b2f0f for ghcr.io/nfdi4plants/arc-validate:main with digest ghcr.io/nfdi4plants/arc-validate@sha256:56352f8074174962e89e6b6367e74901e705092bbe9322057c5772d6d5fca1bf ...
Preparing environment
00:00
Running on runner-iaywqpk5-project-1044-concurrent-0 via 8764d0667e17...
Getting source from Git repository
00:01
Fetching changes with git depth set to 20...
Reinitialized existing Git repository in /builds/brilator/v1-test/.git/
Checking out b21e357a as detached HEAD (ref is main)...
Removing arc-summary.md
Removing arc.json
Skipping Git submodules setup
Downloading artifacts
00:02
Downloading artifacts for create ARC JSON (3489)...
Downloading artifacts from coordinator... ok host=s3.bwsfs.uni-freiburg.de id=3489 responseStatus=200 OK token=64_tMrK_
Executing "step_script" stage of the job script
00:00
Using docker image sha256:31c612d8a4cbd25d26e1ca5263e9699ecb41495a7b9014d96da9c176136b2f0f for ghcr.io/nfdi4plants/arc-validate:main with digest ghcr.io/nfdi4plants/arc-validate@sha256:56352f8074174962e89e6b6367e74901e705092bbe9322057c5772d6d5fca1bf ...
$ echo "Running unit tests... "
Running unit tests...
$ set +e
$ bash /opt/arc-validate/arc-validate.sh; ret=$?
+ arc-validate
arc-validate failed due to an internal error.
This error did likely NOT occur due to user input.
An empty test result file will be created to reflect this and prevent the validation pipeline from failing.
Run arc-validate with --verbose to see the full error message.
[11:30:14 ERR] arc-validate.arc-validate failed in 00:00:00.0050000.
arc-validate failed due to an internal error
This error did likely NOT occur due to user input.
An empty test result file will be created to reflect this and prevent the subsequent validation pipeline from failing.
. Actual value was true but had expected it to be false.
at ARCValidate.createInternalFailDummyTestResults@13.Invoke(Unit _arg1) in /opt/arc-validate/src/arc-validate/Program.fs:line 14
at Expecto.Impl.execTestAsync@569-1.Invoke(Unit unitVar)
at Microsoft.FSharp.Control.AsyncPrimitives.CallThenInvoke[T,TResult](AsyncActivation`1 ctxt, TResult result1, FSharpFunc`2 part2) in D:\a\_work\1\s\src\FSharp.Core\async.fs:line 508
at Microsoft.FSharp.Control.Trampoline.Execute(FSharpFunc`2 firstAction) in D:\a\_work\1\s\src\FSharp.Core\async.fs:line 112 <Expecto>
$ echo "$ret"
3
$ set -e
$ /opt/arc-validate/create-badge.py
$ exit "$ret"
Uploading artifacts for failed job
00:09
Uploading artifacts...
arc-validate-results.xml: found 1 matching artifact files and directories
arc-quality.svg: found 1 matching artifact files and directories
Uploading artifacts as "archive" to coordinator... 201 Created id=3490 responseStatus=201 Created token=64_tMrK_
Uploading artifacts...
arc-validate-results.xml: found 1 matching artifact files and directories
Uploading artifacts as "junit" to coordinator... 201 Created id=3490 responseStatus=201 Created token=64_tMrK_
Cleaning up project directory and file based variables
00:00
ERROR: Job failed: exit code 3
Hmm not sure whether the validation pipeline is already rolled out for ARC v1.x.x.
The new package-based validation pipelines are not rolled out yet. I think it is easiest to just ignore these errors until we can move forward next week.