-
using the Sample_dir provided by Prof.Haiqing Zhao to caculate the 6 metrics of selected 100 bacteria PDB dimers, (50 from bacteria PDB heterodimers, and 50 from bacteria PDB homodimers),coding
- mutual information (MI),
- conservation (Con) and
- direct coupling (DCA).
- average product correction (APC) of MI
- average product correction (APC) of Con
- average product correction (APC) of DCA
-
Test the caculation on CAPRI benchmark decoys-score_set
-
Caculate interface residues : saved in xxx.ifc_seq file
‘The iRMSD is the Cα RMSD of interface residues with respect to the native structure. Interface residues in a complex are defined as all the residues within 10.0 Å from any residues of the other subunit.’——Wang, X., Flannery, S. T., & Kihara, D. (2021). Protein Docking Model Evaluation by Graph Neural Networks. Frontiers in Molecular Biosciences, 8, 647915. https://doi.org/10.3389/fmolb.2021.647915
-
The msa file-paired based on the shared common species.:save in xxx.msa
To avoid biased sequence sampling due to over-studied model species, we carried out homolog sequence search on 5,090 representative proteomes that were carefully curated and selected in EggNog 5.0.
/home/zjlab/score_set/e5.proteomes.faa:
http://eggnog5.embl.de/download/eggnog_5.0/e5.proteomes.faa/home/zjlab/score_set/Q03774.fasta :
https://www.ebi.ac.uk/pdbe/pdbe-kb/proteins/Q03774jackhmmer -N 5 --incE 0.001 --tblout output.tblout /home/zjlab/score_set/Q03774.fasta /home/zjlab/score_set/e5.proteomes.faa > output.sto
-
Use the two above to caculate 6 metrics
-
compare the caculated 6 metrics to the ZIPPI paper -SI Table S2. Detailed AUROC performances of ZIPPI metrics for each CAPRI Target
-