
Improved prediction of fungal effector proteins from secretomes with EffectorP 2.0


What is EffectorP?

Fungal plant pathogens secrete effector proteins that modulate the host cell to facilitate infection. Computational identification of effector candidates and their subsequent functional characterization deliver valuable insights into plant-pathogen interactions. However, effector prediction in fungi has been challenging due to a lack of unifying sequence features such as conserved N-terminal sequence motifs. Fungal effectors are commonly predicted from secretomes based on criteria such as small size and high cysteine content, an approach that suffers from poor accuracy.

EffectorP is a machine learning method for fungal effector prediction in secretomes and has been trained to distinguish secreted proteins from secreted effectors in plant-pathogenic fungi. EffectorP improves fungal effector prediction from secretomes based on a robust signal of sequence-derived properties. EffectorP 2.0 achieves an accuracy of 89%, compared with 82% for EffectorP 1.0 and 59.8% for a small size classifier.

What is EffectorP not?

EffectorP is not a tool for secretome prediction.

EffectorP has been trained to find fungal effectors in secretomes, so please run it on a FASTA file of secreted fungal proteins to test whether they are predicted effectors. We recommend first using tools such as SignalP or Phobius to predict whether a protein is likely to be secreted. Alternatively, experimentally determined secretomes can be submitted to EffectorP instead of computationally predicted ones.
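As an illustration of preparing a secretome-only input, here is a minimal sketch that keeps only the FASTA records whose identifier appears in a set of proteins already predicted to be secreted. The function names and the idea of supplying the secreted identifiers as a Python set are assumptions made for this example. They are not part of EffectorP, SignalP or Phobius, and the sketch does not parse the output of those tools.

```python
def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None and line.strip():
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def filter_secreted(records, secreted_ids):
    """Keep records whose first header token is in secreted_ids.

    secreted_ids is a hypothetical set of identifiers that the user has
    collected from a secretion predictor or an experimental secretome.
    """
    for header, seq in records:
        tokens = header.split()
        if tokens and tokens[0] in secreted_ids:
            yield header, seq


def write_fasta(records, handle, width=60):
    """Write (header, sequence) pairs as FASTA, wrapping sequence lines."""
    for header, seq in records:
        handle.write(f">{header}\n")
        for start in range(0, len(seq), width):
            handle.write(seq[start:start + width] + "\n")
```

The filtered file written by `write_fasta` can then be passed to EffectorP with the `-i` option.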

Running EffectorP

You can submit secreted fungal proteins to the webserver at http://effectorp.csiro.au/.

Alternatively, you can install EffectorP on your machine to run it locally.

All training and evaluation data can be found here.

Installing EffectorP

EffectorP is written in Python and uses pepstats from the EMBOSS software and the WEKA 3.8.1 software. To get EffectorP to work on your local machine, you need to install EMBOSS and WEKA from source. Both are already provided in the EffectorP distribution to ensure that compatible versions are used. EffectorP from version 2.0.1 onwards uses Python 3.

  1. Download the latest release from this GitHub repo (alternatively, clone the GitHub repo instead of downloading a release).

  2. Unpack EffectorP in your desired location and make sure it has permission to execute.

tar xvf EffectorP-2.0.1.tar.gz
chmod -R 755 EffectorP-2.0.1/
cd EffectorP-2.0.1

  3. For the EMBOSS installation, switch to the Scripts directory, then unpack, configure and make. Alternatively, if you are on a computer cluster and EMBOSS is already installed, you can change the variable PEPSTATS_PATH in the EffectorP.py script to the EMBOSS directory that contains pepstats on the machine you are using.

cd Scripts
tar xvf emboss-latest.tar.gz
cd EMBOSS-6.5.7/
./configure
make
cd ../

  4. For WEKA, simply unzip the file weka-3-8-1.zip.

unzip weka-3-8-1.zip

If you are having trouble installing EMBOSS, please see here for help. If you are having trouble installing WEKA, please see here for help.

  5. Test if EffectorP is working.

python EffectorP.py -i Effector_Testing.fasta
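
If you would rather call EffectorP from another Python script, the same command can be wrapped with the standard `subprocess` module. This is only a sketch: the helper names are made up for illustration, only the `-i` option shown above is used, and it assumes that EffectorP prints its report to standard output and that the current interpreter is Python 3.

```python
import subprocess
import sys


def build_effectorp_command(fasta_path, script_path="EffectorP.py"):
    """Return the argument list for 'python EffectorP.py -i <fasta>'."""
    # sys.executable is the interpreter running this script (assumed Python 3).
    return [sys.executable, script_path, "-i", fasta_path]


def run_effectorp(fasta_path, script_path="EffectorP.py"):
    """Run EffectorP and return whatever it wrote to standard output.

    Assumes the report is printed to stdout; raises CalledProcessError
    if EffectorP exits with a non-zero status.
    """
    result = subprocess.run(
        build_effectorp_command(fasta_path, script_path),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

For example, `run_effectorp("Effector_Testing.fasta")` would be called from the directory that contains EffectorP.py.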

EffectorP output format

Run this to get a feel for the output format:

python EffectorP.py -i Effector_Testing.fasta

# Identifier     Prediction      Probability
CSEP-07 virulence       Effector        0.688
CSEP_09 virulence       Effector        0.842
Mg3LysM Non-effector    0.561
BEC1054 CSEP0064 Blumeria graminis      Effector        0.869
BEC1011 CSEP0264 Blumeria graminis      Effector        0.947
SAD1 jgi|Spore1|2735|sr10077m.01 https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4677912/, Sporisorium reilianum   Effector        0.621
SIS1 Rhizophagus irregularis    Effector        0.611
tr|N1JHK8|N1JHK8_BLUG1 CSEP0055 putative effector protein OS=Blumeria graminis f. sp. hordei (strain DH14) GN=BGHDH14_bgh02653 PE=4 SV=1        Effector        0.732
BEC1019 tr|A0A059UDR8|A0A059UDR8_BLUGH Protease-like effector BEC1019 OS=Blumeria graminis f. sp. hordei PE=4 SV=1      Non-effector    0.551
Blumeria Bcg1 EPQ67126.1 hypothetical protein BGT96224_BCG1 [Blumeria graminis f. sp. tritici 96224]    Effector        0.896
CSEP0105        Non-effector    0.595
CSEP0162        Effector        0.693
tr|V5TFR9|V5TFR9_LEPMC Avirulence protein LmJ1 OS=Leptosphaeria maculans GN=AvrLmJ1 PE=4 SV=1   Effector        0.727
tr|A0A0A0S3X0|A0A0A0S3X0_LEPMC Avirulence protein OS=Leptosphaeria maculans GN=AvrLm2 PE=4 SV=1 Effector        0.578
tr|A0A0U2R974|A0A0U2R974_LEPMC AvrLm3 OS=Leptosphaeria maculans GN=AvrLm3 PE=4 SV=1     Effector        0.91
FGSG_10999      Effector        0.865
Ecp7    Effector        0.96

-----------------
Predicted effectors:

CSEP-07 virulence| Effector probability:0.688
CSEP_09 virulence| Effector probability:0.842
BEC1054 CSEP0064 Blumeria graminis| Effector probability:0.869
BEC1011 CSEP0264 Blumeria graminis| Effector probability:0.947
SAD1 jgi|Spore1|2735|sr10077m.01 https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4677912/, Sporisorium reilianum| Effector probability:0.621
SIS1 Rhizophagus irregularis| Effector probability:0.611
tr|N1JHK8|N1JHK8_BLUG1 CSEP0055 putative effector protein OS=Blumeria graminis f. sp. hordei (strain DH14) GN=BGHDH14_bgh02653 PE=4 SV=1| Effector probability:0.732
Blumeria Bcg1 EPQ67126.1 hypothetical protein BGT96224_BCG1 [Blumeria graminis f. sp. tritici 96224]| Effector probability:0.896
CSEP0162| Effector probability:0.693
tr|V5TFR9|V5TFR9_LEPMC Avirulence protein LmJ1 OS=Leptosphaeria maculans GN=AvrLmJ1 PE=4 SV=1| Effector probability:0.727
tr|A0A0A0S3X0|A0A0A0S3X0_LEPMC Avirulence protein OS=Leptosphaeria maculans GN=AvrLm2 PE=4 SV=1| Effector probability:0.578
tr|A0A0U2R974|A0A0U2R974_LEPMC AvrLm3 OS=Leptosphaeria maculans GN=AvrLm3 PE=4 SV=1| Effector probability:0.91
FGSG_10999| Effector probability:0.865
Ecp7| Effector probability:0.96
-----------------

Number of proteins that were tested: 17
Number of predicted effectors: 14

-----------------
82.4 percent are predicted to be effectors.
-----------------

EffectorP returns output in the format shown in the example above: a summary table listing the prediction (effector or non-effector) and probability for each submitted protein, followed by a list of the predicted effectors and the overall counts.
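
For downstream scripting, the summary table can be read back into Python. The sketch below is not part of EffectorP. It assumes the report layout shown in the example above: a header line starting with `# Identifier`, one row per protein ending in the prediction and the probability, and a dashed line closing the table. Because identifiers may themselves contain spaces, each row is split from the right.

```python
def parse_summary_table(report_text):
    """Return a list of (identifier, prediction, probability) tuples.

    Assumes the table starts after a line beginning with '# Identifier'
    and ends at the first line of dashes, as in the example output.
    """
    rows = []
    in_table = False
    for line in report_text.splitlines():
        if line.startswith("# Identifier"):
            in_table = True
            continue
        if not in_table:
            continue
        if line.startswith("-----"):
            break  # end of the summary table
        if not line.strip():
            continue
        # Split from the right: the last two fields are prediction and probability.
        parts = line.rsplit(None, 2)
        if len(parts) != 3 or parts[1] not in ("Effector", "Non-effector"):
            continue
        identifier, prediction, probability = parts
        try:
            rows.append((identifier, prediction, float(probability)))
        except ValueError:
            continue  # skip rows whose last field is not a number
    return rows
```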

As a probabilistic classifier (Naive Bayes), EffectorP returns a probability that a tested instance will belong to either the effector or non-effector class and these probabilities are included in the web server output as additional information to researchers. However, in Naive Bayes classification these are known to be only rough estimations and should therefore not be overinterpreted.

We deliberately did not recommend a probability threshold over which a protein would be classified as an effector candidate, as we believe it should remain up to the individual user to interpret their results in the context of additional resources available. For example, a researcher might like to predict the full effector candidate complement using EffectorP and overlay this with in planta expression data to prioritize candidates, whereas in other situations without additional information a list of high-priority candidates as determined by the EffectorP probabilities might be more appropriate.
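
The two scenarios described above could be sketched as follows. The function name and the `expressed_ids` set (identifiers with supporting in planta expression evidence) are hypothetical and supplied by the user. The sketch ranks candidates by probability instead of applying a cutoff, in keeping with the absence of a recommended threshold.

```python
def prioritize_candidates(predictions, expressed_ids=None):
    """Rank predicted effectors by probability, highest first.

    predictions: iterable of (identifier, prediction, probability) tuples.
    expressed_ids: optional, hypothetical set of protein identifiers with
        in planta expression support; if given, only predicted effectors
        whose first identifier token is in this set are kept.
    """
    effectors = [
        (identifier, probability)
        for identifier, prediction, probability in predictions
        if prediction == "Effector"
    ]
    if expressed_ids is not None:
        effectors = [
            (identifier, probability)
            for identifier, probability in effectors
            if identifier.split() and identifier.split()[0] in expressed_ids
        ]
    return sorted(effectors, key=lambda item: item[1], reverse=True)
```

Calling `prioritize_candidates(rows)` gives the probability-ranked list for the case without additional information, while passing `expressed_ids` overlays the predictions with expression evidence.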

Citation for EffectorP

Sperschneider J, Dodds PN, Gardiner DM, Singh KB, Taylor JM (2018) Improved prediction of fungal effector proteins from secretomes with EffectorP 2.0. Molecular Plant Pathology.