NetphosParser

Parse NetPhos 3.1 results and filter against phospho-peptides

Primary language: Python. License: GNU General Public License v3.0 (GPL-3.0).

Usage

Create an instance:

npp = NetphosParser()

Parse NetPhos 3.1 GFF results:

npp.parse('netphos_results.txt')

Example NetPhos 3.1 GFF results format:

##Type Protein A0A161I596
##Protein A0A161I596
##MAKPLTDQEKRKQISIRGIVGVENVAELKKGFNRHLHFTLVKDRNVATTRDYYFALAHTV
##RDHLVGRWIRTQQYYYEKDPKRTYYLSLEFYMGRTLQNTMINLGLQNACDEAIYQIGLDI
##EELEEMEEDAGLGNGGLGRLAACFLDSMATLGLAAYGYGIRYEYGIFNQKIKDGWQVEEA
##DDWLRHGNPWEKDRPEYMLPIHFYGRVEHHKTGVRWVDTQVVLAMPYDTPVPGYMNNTVN
##TMRLWSARAPNDFNLQDFNVGDYIEAVLDRNLAENISRVLYPNDNFFEGKELRLKQEYFV
##VAASLQDIIRRFKASGLGFKDRIRTGFDSFPEKVAIQLNDTHPALGIPELMRIFLDIEKL
##PWEKAWEITKKTFAYTNHTVLPEALERWPVDLVEKLLPRHLAIIYEINQRHLDRIAALYP
##KDLDRARRMSLIEEDGIKRINMAHLCIVGSHAVNGVAKIHSDIVKNQVFKDFNDMEPDKF
##QNKTNGITPRRWLLLCNPGLAELIAEKIGETYVKDLSQLTKLKKFVDDDVFIRDVSKVKE
##ENKLKFIQYLEKEYKMKLNPASMFDVHVKRIHEYKRQLLNCLHIITMYNRIRENPTKEFV
##PRTVIIGGKAAPGYHMAKMIIKVITAVGDIVNNDPLVGNKLKVIYLENYRVSLAEKVIPA
##TDLSEQISTAGTEASGTGNMKFMLNGALTIGTMDGANVEMAEEAGEENLFIFGMRVEEVA
##EMDKKGYNARDYYEKLPELKKAMDQIQNGFFSPTKPDLFKDIVNMLFNYDRFKVFADYEA
##YVKSQEKVSALYKNPKEWTKVVIKNIAASGMFSSDRTIKEYARDIWGVEPTDLKIAPPNE
##PRNVVDVKAAAPAAKG                                            
##end-Protein
# seqname            source        feature      start   end   score  N/A   ?
# ---------------------------------------------------------------------------
A0A161I596           netphos-3.1b  phos-CKII        6     6   0.551  . .  YES
A0A161I596           netphos-3.1b  phos-unsp       15    15   0.981  . .  YES
...
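
To make the layout above concrete, here is a minimal sketch of reading the prediction rows with the standard library only. It is not how NetphosParser itself parses the file; the `Prediction` type and the `parse_netphos_gff` function are hypothetical names introduced for illustration, and treating a trailing `YES` as "predicted site" is an assumption based on the example rows.

```python
from typing import Iterable, List, NamedTuple


class Prediction(NamedTuple):
    """One NetPhos prediction row (hypothetical helper type)."""
    seqname: str
    source: str
    feature: str   # kinase label, e.g. "phos-CKII"
    start: int     # residue position as printed in the file
    end: int
    score: float
    is_site: bool  # assumption: True when the last column reads "YES"


def parse_netphos_gff(lines: Iterable[str]) -> List[Prediction]:
    """Collect prediction rows, skipping comments, sequence lines and blanks."""
    predictions = []
    for line in lines:
        line = line.strip()
        # '#' covers both the '##' sequence block and the '# ...' table header.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 6:
            continue  # not a data row, e.g. a literal "..." placeholder
        seqname, source, feature, start, end, score = fields[:6]
        predictions.append(
            Prediction(seqname, source, feature,
                       int(start), int(end), float(score),
                       fields[-1] == "YES")
        )
    return predictions
```

In the example, positions 6 and 15 correspond to the T and S in `MAKPLTDQEKRKQIS...` when counting from 1, so `start` can be read as a 1-based residue index.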

Read in the phospho-peptides of interest and filter out NetPhos results that do not match any peptide:

import pandas as pd

peptides = pd.read_csv('peptides.csv')
npp.filter(peptides)

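The peptide file must have the column headers uniprot and peptide. In the peptide column, # marks phosphorylation of the preceding residue. The example below also carries an orig_peptide column.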

Example peptide format:

uniprot,orig_peptide,peptide
A0A2G9S9U7,_SSS[+80]VGS[+80]SSSVTASPAGR_.2,SSS#VGS#SSSVTASPAGR
Q98TT3,_SS[+80]IHNFM[+16]THPEFR_.3,SS#IHNFMTHPEFR
Q98TT3,_S[+80]SIHNFM[+16]THPEFR_.3,S#SIHNFMTHPEFR
A0A2G9S9U7,_SS[+80]S[+80]VGSSSSVTASPAGR_.2,SS#S#VGSSSSVTASPAGR
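
The relationship between the `orig_peptide` and `peptide` columns, and the meaning of `#`, can be sketched as follows. This is an illustration inferred from the example rows, not code from NetphosParser: the function names are hypothetical, and it assumes that `[+80]` marks phosphorylation, that other bracketed modifications such as `[+16]` are dropped, and that a peptide is located at its first occurrence in the protein sequence.

```python
import re
from typing import List


def to_hash_notation(orig_peptide: str) -> str:
    """Convert e.g. '_SS[+80]IHNFM[+16]THPEFR_.3' to 'SS#IHNFMTHPEFR'."""
    # Drop the trailing charge suffix (".3") and the flanking underscores.
    core = re.sub(r"\.\d+$", "", orig_peptide).strip("_")
    # Assumption: [+80] is phosphorylation; any other bracketed tag is removed.
    core = core.replace("[+80]", "#")
    return re.sub(r"\[[^\]]*\]", "", core)


def phospho_offsets(peptide: str) -> List[int]:
    """0-based offsets, in the bare peptide, of residues followed by '#'."""
    offsets = []
    residue_count = 0
    for char in peptide:
        if char == "#":
            offsets.append(residue_count - 1)  # '#' refers to the previous residue
        else:
            residue_count += 1
    return offsets


def protein_positions(sequence: str, peptide: str) -> List[int]:
    """1-based protein positions of phosphorylated residues ([] if no match)."""
    bare = peptide.replace("#", "")
    start = sequence.find(bare)  # assumption: first occurrence only
    if start == -1:
        return []
    return [start + offset + 1 for offset in phospho_offsets(peptide)]
```

For instance, a made-up peptide `QIS#IR` placed on the start of the example protein `MAKPLTDQEKRKQISIRGIV...` maps to position 15, the same residue as the `phos-unsp` prediction in the NetPhos example above. Positions obtained this way could then be compared with the `start` column of the predictions.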

Output the filtered results as a pandas DataFrame:

df = npp.to_df()

Example output:

uniprot,orig_peptide,residue,residue_index,context,kinase,matching_peptides,score
A0A023UDJ8,_VINDNFGIVEGLMTT[+80]VHAYTATQK_.3,T,124,IVEGLMTTVHAYTAT,phos-PKC,VINDNFGIVEGLMTT#VHAYTATQK,0.683
A0A023UDJ8,_VINDNFGIVEGLMTT[+80]VHAYTATQK_.3,T,124,IVEGLMTTVHAYTAT,phos-cdc2,VINDNFGIVEGLMTT#VHAYTATQK,0.453
A0A023UDJ8,_VINDNFGIVEGLMTT[+80]VHAYTATQK_.3,T,124,IVEGLMTTVHAYTAT,phos-CaM-II,VINDNFGIVEGLMTT#VHAYTATQK,0.45
A0A023UDJ8,_VINDNFGIVEGLMTT[+80]VHAYTATQK_.3,T,124,IVEGLMTTVHAYTAT,phos-GSK3,VINDNFGIVEGLMTT#VHAYTATQK,0.438
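
Because one phosphosite can appear once per predicted kinase, a common follow-up is to keep only the highest-scoring kinase for each site. The sketch below does this on plain dictionaries using the standard library; `best_kinase_per_site` is a hypothetical helper, not part of NetphosParser, and it assumes the output has been saved as CSV with the columns shown above. With a DataFrame in hand, `df.to_dict("records")` would yield rows of the same shape.

```python
import csv
import io
from typing import Dict, Iterable, List


def best_kinase_per_site(rows: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep, for each (uniprot, residue_index), the row with the highest score."""
    best: Dict[tuple, Dict[str, str]] = {}
    for row in rows:
        key = (row["uniprot"], row["residue_index"])
        if key not in best or float(row["score"]) > float(best[key]["score"]):
            best[key] = row
    return list(best.values())


# Two rows copied from the example output above.
example = (
    "uniprot,orig_peptide,residue,residue_index,context,kinase,matching_peptides,score\n"
    "A0A023UDJ8,_VINDNFGIVEGLMTT[+80]VHAYTATQK_.3,T,124,IVEGLMTTVHAYTAT,"
    "phos-PKC,VINDNFGIVEGLMTT#VHAYTATQK,0.683\n"
    "A0A023UDJ8,_VINDNFGIVEGLMTT[+80]VHAYTATQK_.3,T,124,IVEGLMTTVHAYTAT,"
    "phos-cdc2,VINDNFGIVEGLMTT#VHAYTATQK,0.453\n"
)
rows = list(csv.DictReader(io.StringIO(example)))
top = best_kinase_per_site(rows)
```

For the two example rows, `top` holds the single `phos-PKC` row, since 0.683 is the higher score for residue 124.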