
altorf/proteome: analyzing altorf from proteome perspective


What's this?

This is a subproject of altorf that analyzes altorf from a proteome perspective. By running peptide identification with MS-GF+ against various sequence databases, we can find peptides that are missing from the annotation but are actually translated. The PNNL library was used as the MS/MS input data.
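The core idea can be illustrated with a minimal sketch. It assumes the identified peptides and the annotated protein sequences are already loaded into plain Python objects. The function name and the toy sequences are hypothetical and are not taken from this repository.

```python
def peptides_missing_from_annotation(identified_peptides, annotated_proteins):
    """Return identified peptides that occur in no annotated protein sequence.

    identified_peptides: iterable of peptide strings (e.g. parsed from a result file)
    annotated_proteins:  dict mapping protein ID -> amino acid sequence
    """
    annotated = list(annotated_proteins.values())
    return sorted(
        pep
        for pep in set(identified_peptides)
        if not any(pep in protein for protein in annotated)
    )


# Toy data, invented for illustration only.
annotated_proteins = {"protA": "MKTAYIAKQRQISFVK", "protB": "MSHFSRQLEER"}
identified_peptides = ["TAYIAK", "QLEER", "GWTLNSAGYLLGPK"]

novel = peptides_missing_from_annotation(identified_peptides, annotated_proteins)
print(novel)  # ['GWTLNSAGYLLGPK']
```

A real analysis would likely also need to handle isoleucine/leucine ambiguity and score thresholds, which this sketch ignores.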

Workflow

Follow the steps below from top to bottom. For further information, refer to the README in each subdirectory.

  1. createcatalog

    • Organizes scattered information about the 112 species in the PNNL library.
  2. pickdatasets

    • Picks datasets to analyze, considering MS/MS type and the number of identified peptides.
  3. downloadpnnl

    • Downloads the picked .mzML, .mzid & .fasta files from the PNNL library FTP server.
  4. extractparam

    • Extracts the MS-GF+ configuration used for peptide identification from each .mzid file.
  5. createsequence

    • Creates various sequence databases for peptide identification.
  6. exec

    • Executes peptide identification using MS-GF+.
  7. processresult
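
As a rough illustration of step 6 (exec), the sketch below assembles an MS-GF+ command line with its spectrum (-s), database (-d) and output (-o) options. The function name, file names, jar location and heap size are assumptions made for this example. The scripts in the exec directory may invoke MS-GF+ differently, for instance with the additional parameters recovered in step 4.

```python
def build_msgfplus_command(spectra, database, output,
                           jar="MSGFPlus.jar", heap="2G"):
    """Build the argument list for one MS-GF+ search.

    spectra:  path to the .mzML input file
    database: path to the .fasta sequence database
    output:   path of the .mzid result file to write
    jar:      path to MSGFPlus.jar (assumed location)
    heap:     Java heap size passed via -Xmx (assumed value)
    """
    return [
        "java", f"-Xmx{heap}", "-jar", str(jar),
        "-s", str(spectra),
        "-d", str(database),
        "-o", str(output),
    ]


# Hypothetical file names, for illustration only.
cmd = build_msgfplus_command("sample.mzML", "sequences.fasta", "sample.mzid")
print(" ".join(cmd))

# To actually run the search (requires Java and MSGFPlus.jar):
# import subprocess
# subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems when file paths contain spaces.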

Prerequisites

  • MS-GF+
    • performs peptide identification
    • requires JRE >= 1.6 and main memory >= 2 GB
  • Anaconda (ver 3.X)
  • BioPython
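
One way to check the Java requirement before running the workflow is sketched below. It assumes that `java -version` prints a line containing `version "<number>"`, which is the usual format but is not guaranteed for every Java distribution. The helper names are hypothetical.

```python
import re
import shutil
import subprocess


def parse_java_version(text):
    """Extract (major, minor) from `java -version` output, or None if absent."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2) or 0)


def java_is_recent_enough(text, minimum=(1, 6)):
    """True if the reported version is at least `minimum` (1.6 for MS-GF+)."""
    version = parse_java_version(text)
    return version is not None and version >= minimum


if shutil.which("java") is None:
    print("java not found on PATH")
else:
    # `java -version` writes its report to stderr, not stdout.
    report = subprocess.run(["java", "-version"],
                            capture_output=True, text=True).stderr
    print("Java OK" if java_is_recent_enough(report) else "Java too old or unknown")
```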