Error reading mzML file: "Failed to resolve reference"
jorainer opened this issue · 1 comments
jorainer commented
I stumbled across a problem reading mzML files from MassIVE:
library(curl)
url <- "ftp://massive.ucsd.edu/MSV000087155/ccms_peak/New_mzMLFinal/20160603151123624-1576262 Batch5_SHP77_2a.mzML"
fl <- paste0(tempdir(), "/test.mzML")
curl_download(sub(" ", "%20", url, fixed = TRUE), fl)
Now, mzR has an issue reading this file:
library(mzR)
o <- openMSfile(fl)
Error: Can not open file /tmp/RtmpC3pdCi/test.mzML! Original error was: Error: [References::resolve()] Failed to resolve reference.
object type: N4pwiz6msdata23InstrumentConfigurationE
reference id: IC1
referent list: 0
The issue is that the instrument configuration "IC1" referenced by "defaultInstrumentConfigurationRef" (line 39 below) is not defined in the file, so the referent list is empty:
> readLines(fl, n = 40)
[1] "<?xml version=\"1.0\" encoding=\"utf-8\"?>"
[2] "<indexedmzML xmlns=\"https://www.psidev.info/mzML\""
[3] "xmlns:xsi=\"https://www.w3.org/2001/XMLSchema-instance\""
[4] "xsi:schemaLocation=\"http://www.psidev.info/mzML http://psidev.info/files/ms/mzML/xsd/mzML1.1.2_idx.xsd\">"
[5] "<mzML xmlns=\"http://www.psidev.info/mzML\""
[6] "xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\""
[7] "id=\"20160603151123624-1576262 Batch5_SHP77_2\""
[8] "version=\"1.1.0\""
[9] "xsi:schemaLocation=\"http://www.psidev.info/mzML http://psidev.info/files/ms/mzML/xsd/mzML1.1.0.xsd\">"
[10] "<cvList count=\"1\">"
[11] "<cv URI=\"http://psidev.cvs.sourceforge.net/*checkout*/psidev/psi/psi-ms/mzML/controlledVocabulary/psi-ms.obo\""
[12] "fullName=\"Proteomics Standards Initiative Mass Spectrometry Ontology\""
[13] "id=\"MS\""
[14] "version=\"3.79.0\"/>"
[15] "</cvList>"
[16] "<fileDescription>"
[17] "<fileContent>"
[18] "<cvParam accession=\"MS:1000579\" cvRef=\"MS\" name=\"MS1 spectrum\" value=\"\"/>"
[19] "<cvParam accession=\"MS:1000128\" cvRef=\"MS\" name=\"profile spectrum\" value=\"\"/>"
[20] "</fileContent>"
[21] "</fileDescription>"
[22] "<referenceableParamGroupList count=\"1\">"
[23] "<referenceableParamGroup id=\"CommonInstrumentParams\">"
[24] "<cvParam accession=\"MS:1000490\" cvRef=\"MS\" name=\"Agilent instrument model\" value=\"\"/>"
[25] "<userParam name=\"instrument model\" value=\"QTOF\"/>"
[26] "</referenceableParamGroup>"
[27] "</referenceableParamGroupList>"
[28] "<softwareList count=\"3\">"
[29] "<software id=\"MassHunter\" version=\"2.2\">"
[30] "<cvParam accession=\"MS:1000678\" cvRef=\"MS\" name=\"MassHunter Data Acquisition\" value=\"\"/>"
[31] "</software>"
[32] "<software id=\"pwiz\" version=\"3.0.9248\">"
[33] "<cvParam accession=\"MS:1000615\" cvRef=\"MS\" name=\"ProteoWizard software\" value=\"\"/>"
[34] "</software>"
[35] "<software id=\"fiaMiner\" version=\"1819\">"
[36] "<cvParam accession=\"MS:1000531\" cvRef=\"MS\" name=\"software\" value=\"\"/>"
[37] "</software>"
[38] "</softwareList>"
[39] "<run defaultInstrumentConfigurationRef=\"IC1\" defaultSourceFileRef=\"MSScan.bin\""
[40] "id=\"20160603151123624-1576262 Batch5_SHP77_2\">"
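To check other files for the same problem before handing them to mzR, a minimal sketch in base R could compare the ids referenced by "defaultInstrumentConfigurationRef" with the ids of the defined instrumentConfiguration elements. The helper name `unresolved_ic_refs` is hypothetical (it is not part of mzR), and the regex-based matching is an assumption that only holds for straightforwardly formatted headers like the one above; a real XML parser would be more robust.

```r
# Hypothetical helper (not part of mzR): return the instrument configuration
# ids that a <run> element references but that are never defined.
unresolved_ic_refs <- function(lines) {
    # Collapse to one string so attributes split across lines are still found.
    txt <- paste(lines, collapse = " ")
    # Extract the first capture group of every match of `pattern` in `txt`.
    attr_values <- function(pattern) {
        hits <- regmatches(txt, gregexpr(pattern, txt, perl = TRUE))[[1]]
        unique(sub(pattern, "\\1", hits, perl = TRUE))
    }
    # ids referenced via defaultInstrumentConfigurationRef="..."
    refs <- attr_values("defaultInstrumentConfigurationRef=\"([^\"]*)\"")
    # ids defined via <instrumentConfiguration ... id="..."> (the trailing \\s
    # keeps <instrumentConfigurationList ...> from matching)
    defs <- attr_values("<instrumentConfiguration\\s[^>]*?\\bid=\"([^\"]*)\"")
    setdiff(refs, defs)
}
```

Called as `unresolved_ic_refs(readLines(fl, n = 40))` on the header shown above, this should return "IC1". Reading only the header is enough here because the mzML schema places the instrumentConfigurationList before the run element.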
What would be the best way to handle such files? Add a parameter that disables reference checking?
jorainer commented
I had a look into the ProteoWizard code and it seems there is no option to disable reference checks during file reading.
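Without such an option, one possible workaround would be to patch a copy of the file so that the reference resolves. The sketch below inserts a stub instrumentConfigurationList directly in front of the run element. The helper name `add_stub_instrument_config` is hypothetical, and several things are assumptions rather than verified facts: that the file has no instrumentConfigurationList at all, that "<run" starts its own line (as in the header above), and that mzR/ProteoWizard accepts the patched copy. Inserting lines also makes the byte offsets in the index of an indexedmzML file stale, which may or may not be tolerated by the reader.

```r
# Hypothetical helper (not part of mzR): add a stub instrumentConfigurationList
# in front of the <run> element, defining the ids passed in `ids`.
add_stub_instrument_config <- function(lines, ids) {
    # First line that opens the <run> element; NA if there is none.
    run_at <- grep("<run[ >]", lines)[1]
    if (is.na(run_at) || length(ids) == 0) return(lines)
    # Minimal stub: one empty instrumentConfiguration per missing id.
    stub <- c(
        sprintf("<instrumentConfigurationList count=\"%d\">", length(ids)),
        sprintf("<instrumentConfiguration id=\"%s\"/>", ids),
        "</instrumentConfigurationList>"
    )
    # Insert the stub immediately before the <run> line.
    append(lines, stub, after = run_at - 1)
}
```

A possible use would be `writeLines(add_stub_instrument_config(readLines(fl), "IC1"), fl_fixed)`, with `fl_fixed` being a new file path chosen by the user, followed by trying `openMSfile(fl_fixed)`.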