MSFragger

Ultrafast, comprehensive peptide identification for mass spectrometry–based proteomics

MSFragger is an ultrafast database search tool for peptide identification in mass spectrometry-based proteomics. It has demonstrated excellent performance across a wide range of datasets and applications. MSFragger is suitable for standard shotgun proteomics analyses as well as large datasets (including timsTOF PASEF data), enzyme unconstrained searches (e.g., peptidome), open database searches (e.g., precursor mass tolerance set to hundreds of Daltons) for identification of modified peptides, and glycopeptide identification (N-linked and O-linked).
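To make the difference between a standard (narrow-window) search and an open search concrete, here is a minimal Python sketch. It is not MSFragger's implementation: the function names, the 20 ppm narrow window, and the -150 to +500 Da open window are illustrative assumptions, and the peptide mass is made up. The only fixed fact used is the monoisotopic mass shift of phosphorylation (about 79.9663 Da).

```python
def ppm_to_da(mass, ppm):
    """Convert a relative tolerance in ppm to an absolute tolerance in Da."""
    return mass * ppm / 1e6


def within_precursor_window(observed_mass, theoretical_mass, lower_da, upper_da):
    """True if (observed - theoretical) lies inside [lower_da, upper_da]."""
    delta = observed_mass - theoretical_mass
    return lower_da <= delta <= upper_da


# Hypothetical unmodified peptide mass and an observed precursor that
# carries one phosphorylation (+79.9663 Da).
theoretical = 1000.5000
observed = theoretical + 79.9663

# Narrow search: +/- 20 ppm around the theoretical mass (illustrative value).
narrow = ppm_to_da(theoretical, 20.0)
closed_match = within_precursor_window(observed, theoretical, -narrow, narrow)

# Open search: a window hundreds of Daltons wide (illustrative bounds).
open_match = within_precursor_window(observed, theoretical, -150.0, 500.0)

print(closed_match, open_match)  # False True
```

In the open window the unmodified sequence remains a candidate, and the leftover mass difference hints at the modification.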

MSFragger is implemented in the cross-platform Java programming language and can be used in three different ways:

  1. With the FragPipe user interface
  2. As a standalone Java executable
  3. Through Proteome Discoverer

MSFragger writes peptide-spectrum matches in either tabular or pepXML format, making it fully compatible with downstream data analysis pipelines such as Trans-Proteomic Pipeline, Percolator, and Philosopher. See the complete documentation, including a list of Frequently Asked Questions. Example parameter files can be found here.
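As a rough illustration of what consuming pepXML output can look like, the sketch below pulls the top-ranked hit for each spectrum using only the Python standard library. The embedded sample is hand-written for this example and is not real MSFragger output; the element and attribute names follow the general pepXML schema, and the helper name `top_hits` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by the pepXML schema.
PEPXML_NS = {"pep": "http://regis-web.systemsbiology.net/pepXML"}

# Hand-written, heavily trimmed pepXML fragment (illustrative only).
SAMPLE = """
<msms_pipeline_analysis xmlns="http://regis-web.systemsbiology.net/pepXML">
  <msms_run_summary base_name="sample">
    <spectrum_query spectrum="sample.00001.00001.2" start_scan="1"
                    end_scan="1" precursor_neutral_mass="1000.5"
                    assumed_charge="2" index="1">
      <search_result>
        <search_hit hit_rank="1" peptide="PEPTIDEK" protein="sp|P00000|TEST"
                    calc_neutral_pep_mass="1000.5" massdiff="0.0"/>
      </search_result>
    </spectrum_query>
  </msms_run_summary>
</msms_pipeline_analysis>
"""


def top_hits(pepxml_text):
    """Return (spectrum, peptide, massdiff) for each rank-1 search hit."""
    root = ET.fromstring(pepxml_text.strip())
    results = []
    for query in root.iterfind(".//pep:spectrum_query", PEPXML_NS):
        hit = query.find(
            "pep:search_result/pep:search_hit[@hit_rank='1']", PEPXML_NS
        )
        if hit is not None:
            results.append(
                (query.get("spectrum"), hit.get("peptide"), float(hit.get("massdiff")))
            )
    return results


for spectrum, peptide, massdiff in top_hits(SAMPLE):
    print(spectrum, peptide, massdiff)
```

For real analyses, the downstream tools named above (for example Philosopher or Percolator) are the intended consumers; a parser like this is mainly useful for quick inspection.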

Supported file formats

The following spectral file formats can be searched directly with MSFragger. See the FragPipe homepage for compatibility with workflow components downstream of MSFragger.

  • mzML/mzXML - data from any instrument in mzML/mzXML format can be used

  • Thermo RAW - Thermo raw files (.raw) can be read directly; conversion to mzML is not required. On Linux, Mono needs to be installed.

  • Bruker timsTOF PASEF - MSFragger can read Bruker timsTOF PASEF (DDA) raw files (.d) directly, as well as MGF files converted by the Bruker DataAnalysis program. Please note: timsTOF data requires the Visual C++ Redistributable for Visual Studio 2017 on Windows. If you see an error saying that the Bruker native library cannot be found, try installing the Visual C++ Redistributable.
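When preparing a batch of inputs, it can help to check file extensions against the formats listed above before launching a search. The following Python sketch does only that; the mapping and the function name `spectral_format` are hypothetical helpers for illustration and are not part of MSFragger.

```python
from pathlib import PurePath

# Extensions taken from the list above, mapped to a human-readable label.
SUPPORTED_FORMATS = {
    ".mzml": "mzML",
    ".mzxml": "mzXML",
    ".raw": "Thermo RAW",
    ".d": "Bruker timsTOF PASEF (.d)",
    ".mgf": "MGF",
}


def spectral_format(path):
    """Return a format label for a supported extension, else None."""
    suffix = PurePath(path).suffix.lower()
    return SUPPORTED_FORMATS.get(suffix)


# Hypothetical file names, used only to exercise the helper.
for name in ["sample.mzML", "run01.d", "qc.RAW", "notes.txt"]:
    print(name, "->", spectral_format(name))
```

The check is purely name-based, so it cannot tell whether a file is actually readable or whether a .d folder contains PASEF DDA data.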

License

The entire MSFragger suite of tools (MSFragger-Core, MSFragger-LOS, MSFragger-Glyco, MSFragger-DIA, MSFragger-Labile), collectively known as "MSFragger", is distributed as a single JAR file. It is freely available for academic research, non-commercial, or educational purposes under an academic license.

Other uses require a commercial license after the initial 60-day evaluation period. A commercial license, along with licensing details (e.g., pricing), can be obtained by contacting Drew Bennett (andbenne@umich.edu) at the University of Michigan Office of Tech Transfer. For other questions, please contact Prof. Alexey Nesvizhskii (nesvi@med.umich.edu).

Download MSFragger

Whether you use FragPipe, Proteome Discoverer (PD, Thermo Scientific), or the command line, you will need to download the latest MSFragger JAR file. See instructions for downloading or upgrading MSFragger.

Release Notes

Check here for the full list of MSFragger versions and changes.

Running MSFragger

FragPipe

On Windows or Linux, the easiest way to run MSFragger is through FragPipe, which has a variety of built-in workflows for complete data analysis.

Proteome Discoverer node

MSFragger and Philosopher (PeptideProphet) are also available as processing nodes in Proteome Discoverer (PD, Thermo Scientific). Currently, the MSFragger-PD node can be used in PD versions 2.2, 2.3 and 2.4.

Command line

See Launching MSFragger on the Wiki page.
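Because MSFragger is distributed as a single JAR file, a command-line run is a standard `java -jar` invocation. The Python sketch below assembles such a command without executing it. The argument order (parameter file followed by spectral files), the 8 GB heap size, and all file names are assumptions made for illustration; confirm the exact form against the Launching MSFragger page before use.

```python
import shlex


def build_msfragger_command(jar_path, params_path, spectra, memory_gb=8):
    """Assemble a java -jar command line (assumed form; not executed here)."""
    if not spectra:
        raise ValueError("at least one spectral file is required")
    return [
        "java",
        f"-Xmx{memory_gb}G",  # JVM maximum heap size
        "-jar",
        jar_path,
        params_path,          # assumed: parameter file comes first
        *spectra,             # assumed: followed by one or more spectral files
    ]


# Hypothetical paths, for illustration only.
cmd = build_msfragger_command(
    "MSFragger.jar", "fragger.params", ["sample1.mzML", "sample2.mzML"]
)
print(shlex.join(cmd))
# java -Xmx8G -jar MSFragger.jar fragger.params sample1.mzML sample2.mzML
```

Keeping the command as a list makes it safe to pass to `subprocess.run` later without shell quoting problems.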

Documentation

For technical documentation on MSFragger (hardware requirements, search parameters, etc.), see the MSFragger wiki page.

Questions and Technical Support

See our Frequently Asked Questions (FAQ) page. Please post all questions and bug reports regarding MSFragger itself on the MSFragger GitHub issue page or, if more appropriate, on the FragPipe page or Philosopher page.

Requests for Collaboration

If you would like to propose a new collaboration that can take advantage of MSFragger and related tools, please contact us directly.

Integration

MSFragger is currently integrated into or supported by the following software projects:

How to Cite

  • Kong, A. T., Leprevost, F. V., Avtonomov, D. M., Mellacheruvu, D., & Nesvizhskii, A. I. (2017). MSFragger: ultrafast and comprehensive peptide identification in mass spectrometry–based proteomics. Nature Methods, 14(5), 513-520.
  • Yu, F., Teo, G. C., Kong, A. T., Haynes, S. E., Avtonomov, D. M., Geiszler, D. J., & Nesvizhskii, A. I. (2020). Identification of modified peptides using localization-aware open search. Nature Communications, 11(1), 1-9.
  • Polasky, D. A., Yu, F., Teo, G. C., & Nesvizhskii, A. I. (2020). Fast and Comprehensive N- and O-glycoproteomics analysis with MSFragger-Glyco. Nature Methods, 17(11), 1125-1132.
  • Yu, F., Haynes, S. E., Teo, G. C., Avtonomov, D. M., Polasky, D. A., & Nesvizhskii, A. I. (2020). Fast Quantitative Analysis of timsTOF PASEF Data with MSFragger and IonQuant. Molecular & Cellular Proteomics, 19(9), 1575-1585.
  • Polasky, D., Geiszler, D., Yu, F., Li, K., Teo, G. C., & Nesvizhskii, A. I. (2023). MSFragger-Labile: A flexible method to improve labile PTM analysis in proteomics. Molecular & Cellular Proteomics, 22(5), 100538.
  • Yu, F., Teo, G. C., Kong, A. T., Fröhlich, K., Li, G. X., Demichev, V., & Nesvizhskii, A. I. (2023). Analysis of DIA proteomics data using MSFragger-DIA and FragPipe computational platform. Nature Communications, 14, 4154.

For other tools developed by the Nesvizhskii lab, see our website www.nesvilab.org