Tanimoto Score Calculation Index out of Bound Error
I have a problem with calculating similarity matrices, specifically the Tanimoto score.
I am currently working with SIRIUS 5.8.6 and have a working CLI command for the annotation step.
A few days ago I encountered an out-of-bounds error in my similarity calculation, and I have not been able to fix it.
I use this command to get the annotations I want, and it works fine: all files are written to the SIRIUS directory I chose as the output directory.
"C:/Program Files/sirius/sirius.exe" -i "//test_directory/data/feature-data.mgf" -o "//test_directory/data/SIRIUS" config --IsotopeSettings.filter=true --FormulaSearchDB= --Timeout.secondsPerTree=0 --FormulaSettings.enforced=HCNOP --Timeout.secondsPerInstance=0 --AdductSettings.detectable=[[M-H2O+H]+,[M+K]+,[M-H]-,[M+Cl]-,[M+Na]+,[M+H3N+H]+,[M+H]+,[M+Br]-,[M-H2O-H]-,[M-H4O2+H]+] --UseHeuristic.mzToUseHeuristicOnly=650 --AlgorithmProfile=orbitrap --IsotopeMs2Settings=IGNORE --MS2MassDeviation.allowedMassDeviation=5.0ppm --NumberOfCandidatesPerIon=1 --UseHeuristic.mzToUseHeuristic=300 --FormulaSettings.detectable=B,Cl,Br,Se,S --NumberOfCandidates=10 --AdductSettings.enforced=, --AdductSettings.fallback=[[M+K]+,[M+Cl]-,[M-H]-,[M+Na]+,[M+H]+,[M+Br]-] --FormulaResultThreshold=true --InjectElGordoCompounds=true --StructureSearchDB=BIO --RecomputeResults=false formula fingerprint structure canopus write-summaries
For the similarity calculation I use this command:
"C:/Program Files/sirius/sirius.exe" -i "//test_directory/data/SIRIUS" similarity --numpy --tanimoto --tanimoto-canopus -d "//test_directory/data/similarity"
I then get this error, which repeats several times with different job numbers, and no output is produced (as expected after these errors):
Jul 22, 2024 5:47:13 PM de.unijena.bioinf.jjobs.JJob lambda$logError$2
SEVERE: <27>[JJob-27] Failed!
java.lang.ArrayIndexOutOfBoundsException: Index 3878 out of bounds for length 3878
at de.unijena.bioinf.ChemistryBase.fp.ProbabilityFingerprint$PairwiseIterator.getRightProbability(ProbabilityFingerprint.java:295)
at de.unijena.bioinf.ms.frontend.subtools.similarity.SimilarityMatrixWorkflow.fpcos(SimilarityMatrixWorkflow.java:359)
at de.unijena.bioinf.ms.frontend.subtools.similarity.SimilarityMatrixWorkflow.lambda$tanimoto$9(SimilarityMatrixWorkflow.java:147)
at de.unijena.bioinf.ChemistryBase.math.MatrixUtils$1$1.compute(MatrixUtils.java:537)
at de.unijena.bioinf.jjobs.BasicJJob.call(BasicJJob.java:117)
at de.unijena.bioinf.jjobs.BasicMasterJJob$1.compute(BasicMasterJJob.java:101)
at java.base/java.util.concurrent.RecursiveTask.exec(Unknown Source)
at java.base/java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Unknown Source)
at java.base/java.util.concurrent.ForkJoinPool.scan(Unknown Source)
at java.base/java.util.concurrent.ForkJoinPool.runWorker(Unknown Source)
at java.base/java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source)
I have already tried several things. I regenerated my .mgf file (it contains non-merged MS/MS data from mzMine, which processes my raw data) and recomputed my SIRIUS project. I also tried several variations of the command, and I even tried version 6.0.0, but I could not get it running with my configuration the way I wanted, so I switched back to the previous version.
Do you have any suggestions as to where the problem might be?
Hey, it is unlikely that we will be able to provide further bug fixes for SIRIUS 5, so in general I would recommend switching to SIRIUS 6. As of v6.0.4, a lot of the initial bugs and hiccups have been fixed, so it is likely that the problems you ran into with 6.0.0 have been resolved.
Unfortunately, the similarity tool has not yet been ported to SIRIUS 6. However, we are working on it, and it will be available in one of the upcoming minor releases.
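For readers hitting the same stack trace: an index equal to the array length ("Index 3878 out of bounds for length 3878") is the kind of failure that can occur when two fingerprints are iterated pairwise but do not have the same number of entries. Whether that is the actual cause here is an assumption, not something confirmed in this thread. The Python sketch below is a generic illustration and is not SIRIUS code: it uses one common definition of a Tanimoto score for probabilistic fingerprints (expected intersection over expected union) and checks the lengths before comparing. The function names `soft_tanimoto` and `tanimoto_matrix` are hypothetical, and the fingerprints are assumed to be already loaded as lists of probabilities.

```python
from typing import List, Sequence


def soft_tanimoto(p: Sequence[float], q: Sequence[float]) -> float:
    """Tanimoto score for two probabilistic fingerprints (illustrative only).

    Uses sum(p_i * q_i) / sum(p_i + q_i - p_i * q_i). This is one common
    definition and is an assumption, not necessarily what SIRIUS computes.
    """
    # Fail early with a clear message instead of running past the end
    # of the shorter fingerprint.
    if len(p) != len(q):
        raise ValueError(f"fingerprint lengths differ: {len(p)} vs {len(q)}")
    intersection = sum(a * b for a, b in zip(p, q))
    union = sum(a + b - a * b for a, b in zip(p, q))
    return intersection / union if union > 0 else 0.0


def tanimoto_matrix(fingerprints: Sequence[Sequence[float]]) -> List[List[float]]:
    """Symmetric pairwise similarity matrix for a list of fingerprints."""
    n = len(fingerprints)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            score = soft_tanimoto(fingerprints[i], fingerprints[j])
            matrix[i][j] = score
            matrix[j][i] = score
    return matrix
```

If a check like this raises on exported fingerprints, it would suggest that the project contains fingerprints of different lengths, which would be worth mentioning in a bug report.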