
DeMix workflow for peptide identification

Primary language: Python. License: MIT.

DeMix: maximizing peptide identification from cofragmentation

I found that general users (people who do not normally work on the command line) can run the whole pipeline much more easily through the graphical interface of OpenMS. I therefore converted my Python script into an external tool that integrates with the TOPPAS platform.

Dependent packages:

Python 2.7
OpenMS 2.0
numpy
pandas
lxml
pymzml http://pymzml.github.io/
pyteomics http://pythonhosted.org//pyteomics/
msconvert (ProteoWizard) http://proteowizard.sourceforge.net/index.shtml

MS-GF+ http://omics.pnl.gov/software/ms-gf
Java runtime environment
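
As a quick way to confirm that the Python packages listed above are installed, a minimal sketch along these lines can help. The helper name find_missing_modules is hypothetical and not part of DeMix; the module names are assumed to be the usual import names of the listed packages.

```python
import importlib

# Import names assumed for the Python packages listed above.
REQUIRED_MODULES = ["numpy", "pandas", "lxml", "pymzml", "pyteomics"]


def find_missing_modules(module_names):
    """Return the names in module_names that cannot be imported."""
    missing = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All required Python packages can be imported.")
```

This only checks that each package can be imported; it does not check versions, OpenMS, msconvert, MS-GF+, or Java.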

Procedures:
  1. Copy the DeMix.ttd config file to the external tools folder under the OpenMS installation path (e.g. C:\Program Files\OpenMS-2.0\share\OpenMS\TOOLS\EXTERNAL).
  2. Open the DeMixTOPP.toppas pipeline and edit the parameters of the two MSGFAdaptor processes and of the DeMix wrapper: set the executable paths of MSGFPlus.jar and the DeMix Python script (feature_ms2_clone_TOPP2.py), and the path to the proteome database in FASTA format.
  3. Load the mzML spectra files as input to the pipeline, then execute the pipeline by pressing F5.
  4. Collect the results from the TOPP output folders: FeatureXML files, feature lists exported as text, precursor-deconvoluted (cloned) MGF spectra, and the database search results (mzID) from MS-GF+.
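
Step 1 can also be scripted. The following is a minimal sketch, assuming the share/OpenMS/TOOLS/EXTERNAL layout shown in the example path above; the function name install_ttd is hypothetical and not part of DeMix or OpenMS.

```python
import os
import shutil


def install_ttd(ttd_path, openms_root):
    """Copy a .ttd file into <openms_root>/share/OpenMS/TOOLS/EXTERNAL.

    Returns the path of the copied file. Raises IOError if the
    external tools folder does not exist under openms_root.
    """
    external_dir = os.path.join(
        openms_root, "share", "OpenMS", "TOOLS", "EXTERNAL")
    if not os.path.isdir(external_dir):
        raise IOError("External tools folder not found: " + external_dir)
    target = os.path.join(external_dir, os.path.basename(ttd_path))
    shutil.copyfile(ttd_path, target)
    return target
```

For the example installation above, the call would be install_ttd("DeMix.ttd", r"C:\Program Files\OpenMS-2.0"). Writing under Program Files usually requires administrator rights.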

Note:

  • The current version works on centroid spectra. If you start with RAW files recorded in profile mode, please pick centroid peaks first, using either the peak_picking_raw.toppas pipeline or msconvert with its built-in peak picking option.

  • Please also make sure that the Java and Python runtimes and all dependent packages are installed. For a quick check, run "python feature_ms2_clone_TOPP2.py" on the command line.

  • The default parameters are optimized for the Thermo Orbitrap Q-Exactive mass spectrometer (high resolution: 70,000 for MS1 and 17,500 for MS2). Change the parameters in TOPP if your data were acquired with different instrument settings.

  • To search for different PTMs, change the modification file under the MSGFPlus path.

  • Caution: many bug reports relate to the database search with MS-GF+. If the program fails while calling the MS-GF+ subprocess, please check the Java settings in your system environment, and manually execute the MS-GF+ command generated by the script.
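
When troubleshooting the MS-GF+ step, it can help to confirm that Java is on the PATH and to assemble a basic MS-GF+ command by hand. The sketch below is an illustration only: the helper names find_on_path and build_msgf_command are hypothetical, the file names are placeholders, and the command that the DeMix script actually generates may contain additional options. It uses only the MS-GF+ options -s (spectrum file), -d (FASTA database), and -o (output mzID file).

```python
import os


def find_on_path(program):
    """Return the full path of program if it is on PATH, else None."""
    # On Windows, PATHEXT lists executable suffixes such as .EXE.
    extensions = [""] + os.environ.get("PATHEXT", "").split(os.pathsep)
    for folder in os.environ.get("PATH", "").split(os.pathsep):
        for ext in extensions:
            candidate = os.path.join(folder, program + ext)
            if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
                return candidate
    return None


def build_msgf_command(jar_path, spectrum_file, fasta_file, output_file,
                       java="java"):
    """Build a basic MS-GF+ command line as a list of arguments."""
    return [java, "-jar", jar_path,
            "-s", spectrum_file,
            "-d", fasta_file,
            "-o", output_file]


if __name__ == "__main__":
    if find_on_path("java") is None:
        print("Java was not found on PATH; check your system environment.")
    # Placeholder file names, for illustration only.
    command = build_msgf_command(
        "MSGFPlus.jar", "spectra.mgf", "proteome.fasta", "result.mzid")
    print(" ".join(command))
```

The printed command can be pasted into a terminal, or passed as a list to subprocess.call, to see the MS-GF+ error output directly.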

Reference

Zhang, B., Pirmoradian, M., Chernobrovkin, A., & Zubarev, R. A. (2014). DeMix Workflow for Efficient Identification of Co-fragmented Peptides in High Resolution Data-dependent Tandem Mass Spectrometry. Molecular & Cellular Proteomics : MCP. doi:10.1074/mcp.O114.038877 http://www.ncbi.nlm.nih.gov/pubmed/25100859