The model is not known
Describe the bug
AlphaPept model is not available.
To Reproduce
(oktoberfest-env) tobiasko@fgcz-c-072:/scratch/cpanse/PXD028735/oktoberfest$ python -m oktoberfest --config_path specLib_config_UP000000625_AlphaPept_ms2_generic
2024-05-08 16:16:06,061 - INFO - oktoberfest.utils.config::read Reading configuration from specLib_config_UP000000625_AlphaPept_ms2_generic
2024-05-08 16:16:06,061 - INFO - oktoberfest.runner::run_job Oktoberfest version 0.6.2
Copyright 2024, Wilhelmlab at Technical University of Munich
2024-05-08 16:16:06,061 - INFO - oktoberfest.runner::run_job Job executed with the following config:
2024-05-08 16:16:06,061 - INFO - oktoberfest.runner::run_job {
"type": "SpectralLibraryGeneration",
"tag": "",
"models": {
"intensity": "AlphaPept_ms2_generic",
"irt": "AlphaPept_rt_generic"
},
"prediction_server": "koina.wilhelmlab.org:443",
"ssl": true,
"output": "/scratch/cpanse/PXD028735/oktoberfest/SpectralLibraryGeneration/UP000000625/AlphaPept_ms2_generic/",
"inputs": {
"library_input": "/scratch/cpanse/PXD028735/fasta/uniprotkb_proteome_UP000000625_2023_07_04.fasta",
"library_input_type": "fasta"
},
"spectralLibraryOptions": {
"fragmentation": "HCD",
"collisionEnergy": 35,
"precursorCharge": [
2,
3
],
"minIntensity": 0.0005,
"batchsize": 10000,
"format": "msp"
},
"fastaDigestOptions": {
"digestion": "full",
"missedCleavages": 0,
"minLength": 7,
"maxLength": 30,
"enzyme": "trypsin",
"specialAas": "KR",
"db": "target"
}
}
2024-05-08 16:16:06,061 - INFO - oktoberfest.utils.config::read Reading configuration from specLib_config_UP000000625_AlphaPept_ms2_generic
2024-05-08 16:16:07,023 - INFO - oktoberfest.preprocessing.preprocessing::process_and_filter_spectra_data No of sequences before filtering is 123274
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /usr/lib/python3.9/runpy.py:197 in _run_module_as_main │
│ │
│ 194 │ main_globals = sys.modules["__main__"].__dict__ │
│ 195 │ if alter_argv: │
│ 196 │ │ sys.argv[0] = mod_spec.origin │
│ ❱ 197 │ return _run_code(code, main_globals, None, │
│ 198 │ │ │ │ │ "__main__", mod_spec) │
│ 199 │
│ 200 def run_module(mod_name, init_globals=None, │
│ │
│ /usr/lib/python3.9/runpy.py:87 in _run_code │
│ │
│ 84 │ │ │ │ │ __loader__ = loader, │
│ 85 │ │ │ │ │ __package__ = pkg_name, │
│ 86 │ │ │ │ │ __spec__ = mod_spec) │
│ ❱ 87 │ exec(code, run_globals) │
│ 88 │ return run_globals │
│ 89 │
│ 90 def _run_module_code(code, init_globals=None, │
│ │
│ /home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/__main__.py:37 in │
│ <module> │
│ │
│ 34 │
│ 35 if __name__ == "__main__": │
│ 36 │ traceback.install() │
│ ❱ 37 │ main() # pragma: no cover │
│ 38 │
│ │
│ /home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/__main__.py:32 in main │
│ │
│ 29 def main(): │
│ 30 │ """Execution of oktoberfest from terminal.""" │
│ 31 │ args = _parse_args() │
│ ❱ 32 │ runner.run_job(args.config_path) │
│ 33 │
│ 34 │
│ 35 if __name__ == "__main__": │
│ │
│ /home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/runner.py:645 in run_job │
│ │
│ 642 │ │
│ 643 │ try: │
│ 644 │ │ if job_type == "SpectralLibraryGeneration": │
│ ❱ 645 │ │ │ generate_spectral_lib(config_path) │
│ 646 │ │ elif job_type == "CollisionEnergyCalibration": │
│ 647 │ │ │ run_ce_calibration(config_path) │
│ 648 │ │ elif job_type == "Rescoring": │
│ │
│ /home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/runner.py:320 in │
│ generate_spectral_lib │
│ │
│ 317 │ config = Config() │
│ 318 │ config.read(config_path) │
│ 319 │ │
│ ❱ 320 │ spec_library = _speclib_from_digestion(config) │
│ 321 │ │
│ 322 │ server_kwargs = { │
│ 323 │ │ "server_url": config.prediction_server, │
│ │
│ /home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/runner.py:239 in │
│ _speclib_from_digestion │
│ │
│ 236 │ data_dir = config.output / "data" │
│ 237 │ if not pp_and_filter_step.is_done(): │
│ 238 │ │ data_dir.mkdir(exist_ok=True) │
│ ❱ 239 │ │ spec_library = pp.process_and_filter_spectra_data( │
│ 240 │ │ │ library=spec_library, model=config.models["intensity"], tmt_label=config.tag │
│ 241 │ │ ) │
│ 242 │ │ spec_library.write_as_hdf5(data_dir / f"{library_file.stem}_filtered.hdf5").join │
│ │
│ /home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/preprocessing/preprocessi │
│ ng.py:221 in process_and_filter_spectra_data │
│ │
│ 218 │ │
│ 219 │ # filter │
│ 220 │ logger.info(f"No of sequences before filtering is {len(library.spectra_data)}") │
│ ❱ 221 │ library.spectra_data = filter_peptides_for_model(library.spectra_data, model) │
│ 222 │ logger.info(f"No of sequences after filtering is {len(library.spectra_data)}") │
│ 223 │ │
│ 224 │ library.spectra_data["MASS"] = library.spectra_data["MODIFIED_SEQUENCE"].apply(lambd │
│ │
│ /home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/preprocessing/preprocessi │
│ ng.py:157 in filter_peptides_for_model │
│ │
│ 154 │ │ │ "max_charge": 6, │
│ 155 │ │ } │
│ 156 │ else: │
│ ❱ 157 │ │ raise ValueError(f"The model {model} is not known.") │
│ 158 │ │
│ 159 │ return filter_peptides(peptides, **filter_kwargs) │
│ 160 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: The model AlphaPept_ms2_generic is not known.
WARNING:root:WARNING: Temp mmap arrays were written to /tmp/temp_mmap_8bb_osad. Cleanup of this folder is OS dependant, and might need to be triggered manually! Current space: 38,922,334,208
Expected behavior
According to the Koina website, the model is available and running.
System [please complete the following information]:
- OS: Debian
- Language Version: Python 3.9.2
- Virtual environment: venv
Additional context
Same for ms2pip_2021_HCD. Prosit_2020_intensity_HCD works.
(oktoberfest-env) tobiasko@fgcz-c-072:/scratch/cpanse/PXD028735/oktoberfest$ python -m oktoberfest --config_path specLib_config_test
2024-05-08 16:24:19,974 - INFO - oktoberfest.utils.config::read Reading configuration from specLib_config_test
2024-05-08 16:24:19,975 - INFO - oktoberfest.runner::run_job Oktoberfest version 0.6.2
Copyright 2024, Wilhelmlab at Technical University of Munich
2024-05-08 16:24:19,975 - INFO - oktoberfest.runner::run_job Job executed with the following config:
2024-05-08 16:24:19,975 - INFO - oktoberfest.runner::run_job {
"type": "SpectralLibraryGeneration",
"tag": "",
"models": {
"intensity": "Prosit_2020_intensity_HCD",
"irt": "Prosit_2019_irt"
},
"prediction_server": "koina.wilhelmlab.org:443",
"ssl": true,
"output": "./",
"inputs": {
"library_input": "/scratch/cpanse/PXD028735/fasta/uniprotkb_proteome_UP000000625_2023_07_04.fasta",
"library_input_type": "fasta"
},
"spectralLibraryOptions": {
"fragmentation": "HCD",
"collisionEnergy": 35,
"precursorCharge": [
2,
3
],
"minIntensity": 0.0005,
"batchsize": 10000,
"format": "msp"
},
"fastaDigestOptions": {
"digestion": "full",
"missedCleavages": 0,
"minLength": 7,
"maxLength": 30,
"enzyme": "trypsin",
"specialAas": "KR",
"db": "target"
}
}
2024-05-08 16:24:19,975 - INFO - oktoberfest.utils.config::read Reading configuration from specLib_config_test
2024-05-08 16:24:21,571 - INFO - oktoberfest.preprocessing.preprocessing::process_and_filter_spectra_data No of sequences before filtering is 123274
2024-05-08 16:24:21,842 - INFO - oktoberfest.preprocessing.preprocessing::process_and_filter_spectra_data No of sequences after filtering is 122826
2024-05-08 16:24:23,517 - INFO - spectrum_io.file.hdf5::write_dataset Data written to data/prosit_input_filtered.hdf5
2024-05-08 16:24:23,525 - INFO - spectrum_io.file.hdf5::write_dataset Data appended to data/prosit_input_filtered.hdf5
2024-05-08 16:24:23,527 - INFO - spectrum_io.file.hdf5::write_dataset Data appended to data/prosit_input_filtered.hdf5
Getting predictions: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████| 13/13 [01:13<00:00, 5.67s/it, failed=0, successful=13]
Writing library: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████| 13/13 [01:13<00:00, 5.67s/it, missing=0, successful=13]
2024-05-08 16:25:37,354 - INFO - oktoberfest.runner::generate_spectral_lib Finished writing the library to disk
WARNING:root:WARNING: Temp mmap arrays were written to /tmp/temp_mmap_caobqeht. Cleanup of this folder is OS dependant, and might need to be triggered manually! Current space: 38,922,338,304
Hi Tobi, yes we are currently in the last stage of integrating this. We have a branch where we switched to a new underlying data structure that supports different models. It should technically be functional, maybe you wanna try it out: https://github.com/wilhelm-lab/oktoberfest/tree/feature/integrate_AnnData
I will merge this asap once I am back from holiday. I still need to add documentation, and for alphapept there is still an open issue when the instrument type is not supported; in that case it should fall back to the next best instrument type.
I tried that branch and get a KeyError: 'X'
This is caused by an unidentified amino acid 'X' in a peptide sequence after digestion. There was still a bug in filtering those peptides out after digestion, which I have fixed now. Please try again. In my test case, doing an in-silico digest with subsequent filtering, prediction and spectral library generation works now.
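For illustration, the kind of filtering described here could look roughly like the sketch below. This is not the actual oktoberfest code; `drop_noncanonical` and `CANONICAL_AAS` are hypothetical names, and the sketch assumes plain, unmodified one-letter sequences.

```python
# Hypothetical helper, not part of oktoberfest: drop digested peptides that
# contain residues outside the 20 canonical amino acids (e.g. 'X').
CANONICAL_AAS = frozenset("ACDEFGHIKLMNPQRSTVWY")


def drop_noncanonical(peptides):
    """Return only the peptides made up entirely of canonical residues."""
    return [p for p in peptides if set(p) <= CANONICAL_AAS]


peptides = ["PEPTIDEK", "ACDXFGHK", "LLSYVDDEAFIR"]
filtered = drop_noncanonical(peptides)
print(filtered)
```

Running such a filter directly after the digest means that no peptide with an 'X' reaches the prediction step, which is where the KeyError showed up.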
However, we still experience issues with the alphapept model, due to a weird bug when receiving the predictions from koina that is difficult to debug. I will update you once it works.
I have now added the functionality to provide the instrument type via the config file. Simply add "instrument_type": "QE" or any other supported instrument type within the config's inputs section. This will add an additional column "instrument_type" to the metadata information file that is created after digestion. This is how it should look:
"inputs": {
"instrument_type": "QE"
},
...
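If you prefer to patch an existing config programmatically, a minimal sketch could look like this. `with_instrument_type` is a hypothetical helper, not an oktoberfest function, and "proteome.fasta" is just a placeholder path.

```python
import json


def with_instrument_type(config, instrument_type):
    """Return a copy of the config with inputs["instrument_type"] set."""
    updated = dict(config)
    updated["inputs"] = {**config.get("inputs", {}), "instrument_type": instrument_type}
    return updated


# Trimmed-down example config; the file path is a placeholder.
config = {
    "type": "SpectralLibraryGeneration",
    "models": {"intensity": "AlphaPept_ms2_generic", "irt": "AlphaPept_rt_generic"},
    "inputs": {"library_input": "proteome.fasta", "library_input_type": "fasta"},
}
patched = with_instrument_type(config, "QE")
print(json.dumps(patched["inputs"], indent=4))
```

The original dictionary is left untouched, so the same base config can be reused for models that do not need an instrument type.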
If you intend to provide peptides instead of performing an in-silico digest, you need to add additional columns to the peptide input now, which are "instrument_types, peptide_length". This is an example of the documentation for this which will be online once the branch is merged:
ok! Will try.
I just added another bugfix for ms2pip, which was due to an inconsistent shape as it only returns +1 ions in a different order compared to how we store it. So that should work as well now in case you stumbled over an error there.
What are your plans for the next release? I am asking myself whether I should test on the specific branch now or wait for the next release. What is the status of the branch? Stable/pre-release or work-in-progress?
I won't be able to release a new stable version before the end of June since I am attending ASMS, but I merged lots of stuff in spectrum_fundamentals and spectrum_io, released all of that, and already merged everything here onto a release branch. I.e., if you switch to release/0.7.0, you should find a working version, albeit not fully documented and containing some other open issues I need to deal with before releasing.
Why exactly is the instrument_type key part of the inputs section and not the spectralLibraryOptions?
2024-05-29 15:10:21,819 - INFO - oktoberfest.utils.config::read Reading configuration from specLib_config_UP000000625_AlphaPept_ms2_generic
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /usr/lib/python3.9/runpy.py:197 in _run_module_as_main │
│ │
│ 194 │ main_globals = sys.modules["__main__"].__dict__ │
│ 195 │ if alter_argv: │
│ 196 │ │ sys.argv[0] = mod_spec.origin │
│ ❱ 197 │ return _run_code(code, main_globals, None, │
│ 198 │ │ │ │ │ "__main__", mod_spec) │
│ 199 │
│ 200 def run_module(mod_name, init_globals=None, │
│ │
│ /usr/lib/python3.9/runpy.py:87 in _run_code │
│ │
│ 84 │ │ │ │ │ __loader__ = loader, │
│ 85 │ │ │ │ │ __package__ = pkg_name, │
│ 86 │ │ │ │ │ __spec__ = mod_spec) │
│ ❱ 87 │ exec(code, run_globals) │
│ 88 │ return run_globals │
│ 89 │
│ 90 def _run_module_code(code, init_globals=None, │
│ │
│ /home/tobiasko/oktoberfest-release070/lib/python3.9/site-packages/oktoberfest/__main__.py:37 in │
│ <module> │
│ │
│ 34 │
│ 35 if __name__ == "__main__": │
│ 36 │ traceback.install() │
│ ❱ 37 │ main() # pragma: no cover │
│ 38 │
│ │
│ /home/tobiasko/oktoberfest-release070/lib/python3.9/site-packages/oktoberfest/__main__.py:32 in │
│ main │
│ │
│ 29 def main(): │
│ 30 │ """Execution of oktoberfest from terminal.""" │
│ 31 │ args = _parse_args() │
│ ❱ 32 │ runner.run_job(args.config_path) │
│ 33 │
│ 34 │
│ 35 if __name__ == "__main__": │
│ │
│ /home/tobiasko/oktoberfest-release070/lib/python3.9/site-packages/oktoberfest/runner.py:635 in │
│ run_job │
│ │
│ 632 │ """ │
│ 633 │ conf = Config() │
│ 634 │ conf.read(config_path) │
│ ❱ 635 │ conf.check() │
│ 636 │ │
│ 637 │ output_folder = conf.output │
│ 638 │ job_type = conf.job_type │
│ │
│ /home/tobiasko/oktoberfest-release070/lib/python3.9/site-packages/oktoberfest/utils/config.py:32 │
│ 6 in check │
│ │
│ 323 │ │ │ │ │ " Please check and use a TMT model instead." │
│ 324 │ │ │ │ ) │
│ 325 │ │ if self.job_type == "SpectralLibraryGeneration": │
│ ❱ 326 │ │ │ self._check_for_speclib() │
│ 327 │ │ │
│ 328 │ │ if "alphapept" in int_model: │
│ 329 │ │ │ instrument_type = self.instrument_type │
│ │
│ /home/tobiasko/oktoberfest-release070/lib/python3.9/site-packages/oktoberfest/utils/config.py:36 │
│ 4 in _check_for_speclib │
│ │
│ 361 │ │ │ instrument_type = self.instrument_type │
│ 362 │ │ │ valid_alphapept_instrument_types = ["QE", "LUMOS", "TIMSTOF", "SCIEXTOF"] │
│ 363 │ │ │ if instrument_type is None: │
│ ❱ 364 │ │ │ │ raise AssertionError( │
│ 365 │ │ │ │ │ f"The chosen intensity model {self.models['intensity']} requires an │
│ 366 │ │ │ │ │ f"Provide one of {valid_alphapept_instrument_types}." │
│ 367 │ │ │ │ ) │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
AssertionError: The chosen intensity model AlphaPept_ms2_generic requires an instrument type. Provide one of ['QE', 'LUMOS', 'TIMSTOF', 'SCIEXTOF'].
WARNING:root:WARNING: Temp mmap arrays were written to /tmp/temp_mmap_b0xpfs7u. Cleanup of this folder is OS dependant, and might need to be triggered manually! Current space: 38,895,894,528
cat specLib_config_UP000000625_AlphaPept_ms2_generic
{
"type": "SpectralLibraryGeneration",
"tag": "",
"models": {
"intensity": "AlphaPept_ms2_generic",
"irt": "AlphaPept_rt_generic"
},
"prediction_server": "koina.wilhelmlab.org:443",
"ssl": true,
"output": "/scratch/cpanse/PXD028735/oktoberfest/SpectralLibraryGeneration/UP000000625/AlphaPept_ms2_generic/",
"inputs": {
"library_input": "/scratch/cpanse/PXD028735/fasta/uniprotkb_proteome_UP000000625_2023_07_04.fasta",
"library_input_type": "fasta",
"instrument_types": "QE"
},
"spectralLibraryOptions": {
"fragmentation": "HCD",
"collisionEnergy": 35,
"precursorCharge": [2,3],
"minIntensity": 5e-4,
"batchsize": 10000,
"format": "msp"
},
"fastaDigestOptions": {
"digestion": "full",
"missedCleavages": 0,
"minLength": 7,
"maxLength": 30,
"enzyme": "trypsin",
"specialAas": "KR",
"db": "target"
}
}
Looks like your code expects instrument_types to be part of the models section:
python -m oktoberfest --config_path specLib_config_UP000000625_ms2pip_2021_HCD_Deeplc_hela_hf
2024-05-29 16:10:39,694 - INFO - oktoberfest.utils.config::read Reading configuration from specLib_config_UP000000625_ms2pip_2021_HCD_Deeplc_hela_hf
2024-05-29 16:10:39,695 - INFO - oktoberfest.runner::run_job Oktoberfest version 0.7.0
Copyright 2024, Wilhelmlab at Technical University of Munich
2024-05-29 16:10:39,695 - INFO - oktoberfest.runner::run_job Job executed with the following config:
2024-05-29 16:10:39,695 - INFO - oktoberfest.runner::run_job {
"type": "SpectralLibraryGeneration",
"tag": "",
"models": {
"intensity": "ms2pip_2021_HCD",
"irt": "Deeplc_hela_hf",
"instrument_types": "QE"
},
"prediction_server": "koina.wilhelmlab.org:443",
"ssl": true,
"output": "/scratch/cpanse/PXD028735/oktoberfest/SpectralLibraryGeneration/UP000000625/ms2pip_2021_HCD/",
"inputs": {
"library_input": "/scratch/cpanse/PXD028735/fasta/uniprotkb_proteome_UP000000625_2023_07_04.fasta",
"library_input_type": "fasta"
},
"spectralLibraryOptions": {
"fragmentation": "HCD",
"collisionEnergy": 35,
"precursorCharge": [
2,
3
],
"minIntensity": 0.0005,
"batchsize": 10000,
"format": "msp"
},
"fastaDigestOptions": {
"digestion": "full",
"missedCleavages": 0,
"minLength": 7,
"maxLength": 30,
"enzyme": "trypsin",
"specialAas": "KR",
"db": "target"
}
}
2024-05-29 16:10:39,695 - INFO - oktoberfest.utils.config::read Reading configuration from specLib_config_UP000000625_ms2pip_2021_HCD_Deeplc_hela_hf
2024-05-29 16:10:39,695 - INFO - oktoberfest.utils.process_step::is_done Skipping speclib_digested step because /scratch/cpanse/PXD028735/oktoberfest/SpectralLibraryGeneration/UP000000625/ms2pip_2021_HCD/proc/speclib_digested.done was found.
/home/tobiasko/oktoberfest-release070/lib/python3.9/site-packages/anndata/_core/aligned_df.py:67: ImplicitModificationWarning: Transforming to str index.
warnings.warn("Transforming to str index.", ImplicitModificationWarning)
/home/tobiasko/oktoberfest-release070/lib/python3.9/site-packages/oktoberfest/preprocessing/preprocessing.py:247: ImplicitModificationWarning: Trying to modify attribute `.obs` of view, initializing view as actual.
library.obs["MASS"] = library.obs["MODIFIED_SEQUENCE"].apply(lambda x: compute_peptide_mass(x))
Getting predictions: 100%|████████████████████████████████████████████████████████████████████████| 13/13 [00:46<00:00, 3.56s/it, failed=0, successful=13]
Writing library: 100%|███████████████████████████████████████████████████████████████████████████| 13/13 [00:46<00:00, 3.56s/it, missing=0, successful=13]
2024-05-29 16:11:28,309 - INFO - oktoberfest.runner::generate_spectral_lib Finished writing the library to disk
WARNING:root:WARNING: Temp mmap arrays were written to /tmp/temp_mmap_4gaw6fr8. Cleanup of this folder is OS dependant, and might need to be triggered manually! Current space: 38,895,886,336
...or maybe not!?
python -m oktoberfest --config_path specLib_config_UP000000625_AlphaPept_ms2_generic
2024-05-29 16:16:12,623 - INFO - oktoberfest.utils.config::read Reading configuration from specLib_config_UP000000625_AlphaPept_ms2_generic
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /usr/lib/python3.9/runpy.py:197 in _run_module_as_main │
│ │
│ 194 │ main_globals = sys.modules["__main__"].__dict__ │
│ 195 │ if alter_argv: │
│ 196 │ │ sys.argv[0] = mod_spec.origin │
│ ❱ 197 │ return _run_code(code, main_globals, None, │
│ 198 │ │ │ │ │ "__main__", mod_spec) │
│ 199 │
│ 200 def run_module(mod_name, init_globals=None, │
│ │
│ /usr/lib/python3.9/runpy.py:87 in _run_code │
│ │
│ 84 │ │ │ │ │ __loader__ = loader, │
│ 85 │ │ │ │ │ __package__ = pkg_name, │
│ 86 │ │ │ │ │ __spec__ = mod_spec) │
│ ❱ 87 │ exec(code, run_globals) │
│ 88 │ return run_globals │
│ 89 │
│ 90 def _run_module_code(code, init_globals=None, │
│ │
│ /home/tobiasko/oktoberfest-release070/lib/python3.9/site-packages/oktoberfest/__main__.py:37 in │
│ <module> │
│ │
│ 34 │
│ 35 if __name__ == "__main__": │
│ 36 │ traceback.install() │
│ ❱ 37 │ main() # pragma: no cover │
│ 38 │
│ │
│ /home/tobiasko/oktoberfest-release070/lib/python3.9/site-packages/oktoberfest/__main__.py:32 in │
│ main │
│ │
│ 29 def main(): │
│ 30 │ """Execution of oktoberfest from terminal.""" │
│ 31 │ args = _parse_args() │
│ ❱ 32 │ runner.run_job(args.config_path) │
│ 33 │
│ 34 │
│ 35 if __name__ == "__main__": │
│ │
│ /home/tobiasko/oktoberfest-release070/lib/python3.9/site-packages/oktoberfest/runner.py:635 in │
│ run_job │
│ │
│ 632 │ """ │
│ 633 │ conf = Config() │
│ 634 │ conf.read(config_path) │
│ ❱ 635 │ conf.check() │
│ 636 │ │
│ 637 │ output_folder = conf.output │
│ 638 │ job_type = conf.job_type │
│ │
│ /home/tobiasko/oktoberfest-release070/lib/python3.9/site-packages/oktoberfest/utils/config.py:32 │
│ 6 in check │
│ │
│ 323 │ │ │ │ │ " Please check and use a TMT model instead." │
│ 324 │ │ │ │ ) │
│ 325 │ │ if self.job_type == "SpectralLibraryGeneration": │
│ ❱ 326 │ │ │ self._check_for_speclib() │
│ 327 │ │ │
│ 328 │ │ if "alphapept" in int_model: │
│ 329 │ │ │ instrument_type = self.instrument_type │
│ │
│ /home/tobiasko/oktoberfest-release070/lib/python3.9/site-packages/oktoberfest/utils/config.py:36 │
│ 4 in _check_for_speclib │
│ │
│ 361 │ │ │ instrument_type = self.instrument_type │
│ 362 │ │ │ valid_alphapept_instrument_types = ["QE", "LUMOS", "TIMSTOF", "SCIEXTOF"] │
│ 363 │ │ │ if instrument_type is None: │
│ ❱ 364 │ │ │ │ raise AssertionError( │
│ 365 │ │ │ │ │ f"The chosen intensity model {self.models['intensity']} requires an │
│ 366 │ │ │ │ │ f"Provide one of {valid_alphapept_instrument_types}." │
│ 367 │ │ │ │ ) │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
AssertionError: The chosen intensity model AlphaPept_ms2_generic requires an instrument type. Provide one of ['QE', 'LUMOS', 'TIMSTOF', 'SCIEXTOF'].
WARNING:root:WARNING: Temp mmap arrays were written to /tmp/temp_mmap_o3_r3jn1. Cleanup of this folder is OS dependant, and might need to be triggered manually! Current space: 38,896,148,480
It seems you ran into an unfortunate combination of problems:
- key name: It is instrument_type, not instrument_types (note the extra 's'). The key is therefore not found, and since you are doing library generation, the instrument type cannot be read from spectra files either, so the input for alphapept is simply not defined.
- key location: It is correct that the key needs to be in the inputs section. In your first attempt, you moved the key to the models section, which worked because you were running with ms2pip, which doesn't require the instrument type as an input. The moment you switched to alphapept, it could again not find the key, so it failed.
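Both pitfalls can be caught before starting a run with a small pre-flight check on the config. The sketch below is only an illustration under the assumptions stated in this thread (singular key name, key located in the inputs section, required for AlphaPept intensity models); `check_instrument_type` is a hypothetical helper and not part of oktoberfest.

```python
def check_instrument_type(config):
    """Return a list of problems with the instrument type key; empty if none found."""
    problems = []
    for section, content in config.items():
        if not isinstance(content, dict):
            continue
        # The plural spelling is not a recognised config key in any section.
        if "instrument_types" in content:
            problems.append(
                f"'{section}' contains 'instrument_types'; the config key is 'instrument_type'"
            )
        # The singular key is only looked up in the inputs section.
        if section != "inputs" and "instrument_type" in content:
            problems.append(f"'instrument_type' belongs in 'inputs', not '{section}'")
    intensity_model = config.get("models", {}).get("intensity", "")
    if "alphapept" in intensity_model.lower():
        if config.get("inputs", {}).get("instrument_type") is None:
            problems.append("AlphaPept intensity models need inputs['instrument_type']")
    return problems


plural_key = {
    "models": {"intensity": "AlphaPept_ms2_generic"},
    "inputs": {"library_input_type": "fasta", "instrument_types": "QE"},
}
fixed = {
    "models": {"intensity": "AlphaPept_ms2_generic"},
    "inputs": {"library_input_type": "fasta", "instrument_type": "QE"},
}
plural_problems = check_instrument_type(plural_key)
fixed_problems = check_instrument_type(fixed)
for problem in plural_problems:
    print(problem)
```

For the config with the plural key this reports two problems (wrong key name, and the missing key for AlphaPept), while the corrected config passes without any.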
Please check this comment again:
#216 (comment)
To be fair, in the peptides input annotation in the screenshot, there is a plural version of the key name, but in the example, the column name is correct. My bad, this is confusing of course.
Why exactly is the instrument_type key part of the inputs section and not the spectralLibraryOptions?
Because we don't have a spectral library generation section when we do rescoring. In such cases, the key overwrites what oktoberfest reads from the spectra files. This is required, when the spectra are acquired on an unsupported instrument type, because mapping from the spectra file is not trivial... Same goes for the fragmentation method, which is currently mapped from the spectra file but also not necessarily 100% bulletproof.
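The precedence described here can be summarised in a few lines. This is a simplified sketch of the idea, not the actual implementation; `resolve_instrument_type` is a hypothetical name.

```python
def resolve_instrument_type(config_value, spectra_value):
    """Prefer the value from the config; fall back to the one read from the spectra files."""
    return config_value if config_value is not None else spectra_value


# Rescoring: the config key overrides what was mapped from the spectra file.
rescoring = resolve_instrument_type("QE", "LUMOS")
# Rescoring without the key: the mapped value from the spectra file is used.
mapped = resolve_instrument_type(None, "LUMOS")
# Library generation: there are no spectra files, so only the config can provide it.
library = resolve_instrument_type(None, None)
print(rescoring, mapped, library)
```

The last case is the one that fails for alphapept during library generation, because neither source provides a value.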
Is that confusing? Maybe I should allow this key in both locations? I am trying to keep the config as simple as possible...
And your code also asks to "Provide one of {valid_alphapept_instrument_types}.". Pretty sure I started with the singular and changed to types after getting an error. Will try again.
I published a fix for the documentation. It seems I mixed up peptide_length(s) and instrument_type(s). So the correct way of doing this is peptide_length and instrument_types, and this is also documented correctly now.
I added 4 new features:
- there is an nrOx argument now, which creates all possible combinations of up to n variable M(ox) modifications for each provided peptide
- "library_input_type": "peptides" now only requires a list of peptides and an optional list of proteins per peptide, while the current behaviour is now "library_input_type": "internal" to make life easier when generating spectral libraries from a custom list of peptides without having to manually create all possible combinations of precursor charge, collision energy and M(ox).
- dlib is now tested and supported
- proteinIds are now written into the library file, as long as they are provided, otherwise "unknown" will be used as a placeholder
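To make the nrOx behaviour concrete, here is a rough sketch of how such combinations can be enumerated. It is not the oktoberfest implementation; `oxidation_variants` is a hypothetical helper, the input is assumed to be a plain, unmodified sequence, and writing oxidised methionine as M[UNIMOD:35] is an assumption about the notation.

```python
from itertools import combinations

OX_MET = "M[UNIMOD:35]"  # assumed notation for oxidised methionine


def oxidation_variants(peptide, max_ox):
    """Return the peptide with every combination of 0..max_ox oxidised methionines."""
    met_positions = [i for i, aa in enumerate(peptide) if aa == "M"]
    variants = []
    for n_ox in range(min(max_ox, len(met_positions)) + 1):
        for chosen in combinations(met_positions, n_ox):
            variants.append(
                "".join(OX_MET if i in chosen else aa for i, aa in enumerate(peptide))
            )
    return variants


variants = oxidation_variants("AMKMR", 1)
print(variants)
```

A peptide with two methionines and a maximum of one oxidation yields three variants (unmodified plus one per methionine); raising the maximum to two adds the doubly oxidised form.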
Please check this out in the new documentation for custom in-silico digests
Another change is that the created library is no longer called myPrositLib.<msp|dlib|csv> but rather predicted_library.<msp|dlib|csv>, because we don't necessarily create it with Prosit any longer.
The issue will be closed with the release of v0.7.0.