SAGE error with modification not supported
### Description of the bug

```
25CPTAC_LUAD_W_BI_20180901_KR_f04.mzML 25CPTAC_LUAD_W_BI_20180901_KR_f05.mzML 25CPTAC_LUAD_W_BI_20180901_KR_f14.mzML 25CPTAC_LUAD_W_BI_20180901_KR_f20.mzML 25CPTAC_LUAD_W_BI_20180901_KR_f22.mzML 25CPTAC_LUAD_W_BI_20180901_KR_f23.mzML \
-out out_0_sage.idXML \
-threads 6 \
-database "Homo-sapiens-uniprot-reviewed-contaminants-decoy-202210.fasta" \
-decoy_prefix DECOY_ \
-min_len 6 \
-max_len 40 \
-min_matched_peaks 1 \
-min_peaks 1 \
-max_peaks 500 \
-missed_cleavages 2 \
-report_psms 1 \
-enzyme "Trypsin" \
-precursor_tol_left -20.0 \
-precursor_tol_right 20.0 \
-precursor_tol_unit ppm \
-fragment_tol_left -20.0 \
-fragment_tol_right 20.0 \
-fragment_tol_unit ppm \
-fixed_modifications 'Carbamidomethyl (C)' 'Carbamidomethyl (U)' 'TMT6plex (N-term)' 'TMT6plex (K)' \
-variable_modifications 'Acetyl (Protein N-term)' 'Deamidated (N)' 'Gln->pyro-Glu (N-term Q)' 'Oxidation (M)' 'Pyro-carbamidomethyl (N-term C)' \
-max_variable_mods 3 \
-isotope_error_range 0,1 \
-PeptideIndexing:IL_equivalent \
-PeptideIndexing:unmatched_action warn \
-debug 0 \
\
2>&1 | tee out_0_sage.log
if [[ 625 -ge 2 ]]; then
IDRipper -in out_0_sage.idXML -out . -split_ident_runs
rm out_0_sage.idXML
for f in *.idXML
do
mv "$f" "${f%.*}_sage.idXML"
done
fi
cat <<-END_VERSIONS > versions.yml
"NFCORE_QUANTMS:QUANTMS:TMT:ID:DATABASESEARCHENGINES:SEARCHENGINESAGE":
SageAdapter: $(SageAdapter 2>&1 | grep -E '^Version(.*)' | sed 's/Version: //g' | cut -d ' ' -f 1)
sage: $(sage 2>&1 | grep -E 'Version [0-9]+\.[0-9]+\.[0-9]+')
END_VERSIONS
Command exit status:
8
Command output:
Found Sage version string: Version 0.13.4
Error: Unexpected internal error (the value 'Pyro-carbamidomethyl (N-term C)' was used but is not valid; Modification not found: )
Command wrapper:
Found Sage version string: Version 0.13.4
Error: Unexpected internal error (the value 'Pyro-carbamidomethyl (N-term C)' was used but is not valid; Modification not found: )
Work dir:
/hps/nobackup/juan/pride/reanalysis/differential-expression/tmt/PDC000153/work/39/d70d1337992834b21d39e1913942b5
Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`
-- Check '.nextflow.log' file for details
```
### Command used and terminal output
_No response_
### Relevant files
_No response_
### System information
_No response_
Probably just a bug in the adapter. Should be supported.
I have no idea why the modified peptide generator needs to be used here: https://github.com/OpenMS/OpenMS/blob/ecfd8431856a16f04d21d001191294005ebe7745/src/topp/SageAdapter.cpp#L324C22-L324C22
But it's probably the problem.
I think:
- We can't assign a static Carbamidomethyl (C) and a variable mod on C at the same time.
- Pyro-carbamidomethyl (N-term C): the delta is relative to Cys, not to carbamidomethylated Cys, so we can't use both on the same residue.
- The adapter doesn't use the ModifiedPeptideGenerator itself; it just uses a static helper in its namespace (to get details of the fixed mod).
A temporary solution could be to specify both as variable modifications. This should give the correct delta masses.
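
The residue clash described above can be spotted before the search is launched. The following is only a minimal sketch: `target_residue` and `residue_conflicts` are hypothetical helpers written for this comment, not part of OpenMS or quantms, and they assume modification strings of the form `Name (site)` as used in the command above. Terminus-only sites such as `TMT6plex (N-term)` are deliberately ignored.

```python
import re


def target_residue(mod):
    """Return the single-letter residue a mod string targets, or None
    for terminus-only sites such as 'TMT6plex (N-term)'."""
    match = re.search(r"\(([^()]*)\)\s*$", mod)
    if match is None:
        raise ValueError(f"no site specification in {mod!r}")
    tokens = match.group(1).split()
    if not tokens:
        raise ValueError(f"empty site specification in {mod!r}")
    last = tokens[-1]
    return last if re.fullmatch(r"[A-Z]", last) else None


def residue_conflicts(fixed, variable):
    """List (fixed_mod, variable_mod) pairs that target the same residue."""
    fixed_by_residue = {}
    for mod in fixed:
        residue = target_residue(mod)
        if residue is not None:
            fixed_by_residue.setdefault(residue, []).append(mod)
    conflicts = []
    for mod in variable:
        residue = target_residue(mod)
        if residue in fixed_by_residue:
            conflicts.extend((f, mod) for f in fixed_by_residue[residue])
    return conflicts


# Modification lists copied from the failing command.
fixed = ["Carbamidomethyl (C)", "Carbamidomethyl (U)",
         "TMT6plex (N-term)", "TMT6plex (K)"]
variable = ["Acetyl (Protein N-term)", "Deamidated (N)",
            "Gln->pyro-Glu (N-term Q)", "Oxidation (M)",
            "Pyro-carbamidomethyl (N-term C)"]

for fixed_mod, variable_mod in residue_conflicts(fixed, variable):
    print(f"{fixed_mod} clashes with {variable_mod}")
```

With the lists from this issue, the only pair flagged is Carbamidomethyl (C) against Pyro-carbamidomethyl (N-term C), which is the combination named in the error message.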
On second thought: could it be that the modification they actually want is Ammonia-loss? It seems to lead to Pyro-carbamidomethyl (C)?
<umod:mod title="Ammonia-loss" full_name="Loss of ammonia" username_of_poster="unimod"
<umod:specificity hidden="0" site="C" position="Any N-term" classification="Artefact"
spec_group="3">
<umod:misc_notes>Pyro-carbamidomethyl as a delta from Carbamidomethyl-Cys</umod:misc_notes>
</umod:specificity>
<umod:specificity hidden="1" site="S" position="Protein N-term"
classification="Post-translational"
spec_group="2"/>
<umod:specificity hidden="1" site="T" position="Protein N-term"
classification="Post-translational"
spec_group="1"/>
<umod:specificity hidden="1" site="N" position="Anywhere" classification="Chemical derivative"
spec_group="4">
<umod:misc_notes>N-Succinimide</umod:misc_notes>
</umod:specificity>
Interesting that the same config was passed to CometAdapter and it works.
Correct, that's why I think it's a bug in the "helper" function.
For example, I don't know why one would want to look up by full ID if the residue and terminus are known.
Not sure if that's supported.
https://github.com/OpenMS/OpenMS/blob/ecfd8431856a16f04d21d001191294005ebe7745/src/openms/source/CHEMISTRY/ModifiedPeptideGenerator.cpp#L48C8-L48C8
I will give it a look.
> Interesting that the same config was passed to CometAdapter and it works.
If I'm following correctly, I would assume this is because Comet (and, I think, MSFragger as well) adds the numeric value of a variable mod to that of the fixed mod, i.e. it applies a final delta mass of V+F. Sage applies one and only one modification to a residue (they are not additive), so the full delta mass needs to be specified for every mod.
For example, Comet might expect +57 fixed and -17 variable for CAM/pyro-CAM. Sage would expect +57 static and +40 variable, and those are the values that will appear in the modified peptide sequences.
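
To make the arithmetic concrete, here is a small sketch of the two conventions as described in this comment. The function names are made up for illustration and the behaviour of each engine is taken from the description above, not checked against the Comet or Sage sources; the masses are the monoisotopic deltas listed in Unimod.

```python
# Monoisotopic delta masses in Da, as listed in Unimod.
CARBAMIDOMETHYL = 57.021464   # fixed mod on Cys
AMMONIA_LOSS = -17.026549     # pyro-CAM as a delta from Carbamidomethyl-Cys


def additive_variable_delta(relative_delta):
    """Additive model (as described for Comet): the variable mod is stacked
    on top of the fixed mod, so only the relative delta is specified."""
    return relative_delta


def exclusive_variable_delta(fixed_delta, relative_delta):
    """One-mod-per-residue model (as described for Sage): the variable mod
    replaces the fixed one, so it has to carry the full delta itself."""
    return round(fixed_delta + relative_delta, 6)


print("additive variable delta: ", additive_variable_delta(AMMONIA_LOSS))
print("exclusive variable delta:",
      exclusive_variable_delta(CARBAMIDOMETHYL, AMMONIA_LOSS))
```

The exclusive value comes out at about +39.9949 Da, which is the delta Unimod lists for Pyro-carbamidomethyl relative to unmodified Cys, i.e. the "+40" mentioned above.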
This has been solved in the following OpenMS PR: OpenMS/OpenMS#7080