dfermin/lucXor

Filtered mzML crashes due to scan numbers?

Opened this issue · 6 comments

Hi, I have a filtered mzML file (Thermo) for testing, which contains scan numbers 10000-10500. When I run luciphor, it crashes quickly and complains as follows:

  spectra.mzML:  Exception in thread "main" java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at com.simontuffs.onejar.Boot.run(Boot.java:340)
        at com.simontuffs.onejar.Boot.main(Boot.java:166)
  Caused by: umich.ms.fileio.exceptions.FileParsingException: No such scan number found in the index [10001]
        at umich.ms.fileio.filetypes.xmlbased.AbstractXMLBasedDataSource.parseScan(AbstractXMLBasedDataSource.java:671)
        at lucxor.globals.read_mzML(globals.java:685)
        at lucxor.globals.read_in_spectra(globals.java:655)
        at lucxor.LucXor.main(LucXor.java:70)
        ... 6 more

I suspect it has something to do with the scans being filtered (msconvert --filter "scanNumber 10000-10500" ... ), but I don't really know if this is the problem. If anyone has a hunch, I'd like to hear it. :)
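A quick way to see which scan numbers actually ended up in the filtered file is to pull them out of the spectrum IDs. This is only a rough sketch in Python, assuming Thermo-style native IDs of the form `controllerType=0 controllerNumber=1 scan=N` on each `<spectrum>` element; the function names `scan_numbers` and `missing_scans` are made up for illustration and are not part of luciphor or any mzML library.

```python
import re

# Assumption: Thermo-style native IDs, e.g.
#   <spectrum index="0" id="controllerType=0 controllerNumber=1 scan=10000" ...>
SPECTRUM_ID = re.compile(r'<spectrum\b[^>]*\bid="[^"]*\bscan=(\d+)"')


def scan_numbers(mzml_text):
    """Return the scan numbers found in <spectrum id="... scan=N"> tags, in file order."""
    return [int(m.group(1)) for m in SPECTRUM_ID.finditer(mzml_text)]


def missing_scans(numbers):
    """Return scan numbers absent between the smallest and largest value seen."""
    if not numbers:
        return []
    present = set(numbers)
    return [n for n in range(min(numbers), max(numbers) + 1) if n not in present]
```

Reading the file with `open("spectra.mzML").read()` and passing the text to `scan_numbers` would show whether scan 10001 from the error message is present at all, and `missing_scans` would list any gaps in the range.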

That would be my first guess looking at that error.
What happens if you don't filter the mzML file? Does it run as expected?

I tested with a file filtered to the first 500 scans (which are crap), and then it doesn't crash when reading the spectra. (I can run the full file, but that'll take a while.)

It might be that it's barfing because the new mzML file starts at 10000 instead of 1. See if you can run it with the first 1000 or 5000 scans.

Yeah, it's just that I'm trying to keep the test data small (the first 1000 scans are also giving really bad PSMs), so I think I'll try to re-number the scans instead, if I manage to do so by regex or something.
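The regex re-numbering could look roughly like the sketch below, in Python. It assumes every scan number in the file appears as a `scan=N` field (in spectrum IDs, precursor references, and index entries), and the function name `renumber_scans` is hypothetical. Two caveats: if the file is an indexed mzML, the byte offsets stored in its index no longer match once the numbers get shorter, and the PSM input would need the same shift so its scan numbers still line up with the spectra. Whether the reader tolerates stale offsets is not something this thread establishes.

```python
import re

# Assumption: scan numbers only occur as "scan=N" fields inside native IDs.
SCAN_FIELD = re.compile(r'\bscan=(\d+)')


def renumber_scans(mzml_text, first_scan):
    """Shift every 'scan=N' field so that first_scan becomes scan=1."""
    shift = first_scan - 1

    def replace(match):
        new_number = int(match.group(1)) - shift
        if new_number < 1:
            # A scan below first_scan would collide with or precede scan 1.
            raise ValueError(f"scan {match.group(1)} is below first_scan={first_scan}")
        return f"scan={new_number}"

    return SCAN_FIELD.sub(replace, mzml_text)
```

For the file described above, `renumber_scans(text, 10000)` would turn scans 10000-10500 into 1-501.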

Could this be an issue with the underlying mzML reading library then?

That's what I suspect based on the error message and what you're telling me here.

Ok, thanks for the quick reply! Will try to re-number.