phbradley/TCRdock

Documentation of output generated by run_prediction.py

Florian411 opened this issue · 2 comments

Hey,

Thanks for providing this tool at this early stage!
I am trying to understand the summary tsv file generated by run_prediction.py.

For the PAE values, there are several columns for different chains, if I understand the code correctly. What is the correct interpretation of the different predictors, and which parts of the protein do chain 1, chain 2, etc. refer to?

Thanks in advance for clarifying.

Hi there,
Thanks for trying things out! The 'pae_i_j' values are average inter-chain PAE values. The chains are numbered 0, 1, 2, ... with each chain corresponding to one of the '/'-separated sequences in the 'target_chainseq' column. So for an MHC class I ternary complex, chain 0=MHC, chain 1=peptide, chain 2=TCRalpha, chain 3=TCRbeta. Since the AlphaFold PAE matrix is not symmetric, pae_i_j != pae_j_i in general.
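To make the chain numbering concrete, here is a minimal sketch that splits a 'target_chainseq' value on '/' and pairs each chain index with a label. The helper name `label_chains` is hypothetical (it is not part of TCRdock), and treating any four-chain entry as a class I ternary complex is an assumption made only for illustration; other chain counts just get generic labels.

```python
# Labels follow the class I ternary complex ordering described above.
CLASS_I_LABELS = ["MHC", "peptide", "TCRalpha", "TCRbeta"]


def label_chains(target_chainseq):
    """Return (chain_index, label, sequence_length) for each '/'-separated chain.

    Hypothetical helper: assumes a four-chain entry is a class I ternary
    complex; anything else is labelled generically as chain0, chain1, ...
    """
    sequences = target_chainseq.split("/")
    if len(sequences) == len(CLASS_I_LABELS):
        labels = CLASS_I_LABELS
    else:
        labels = [f"chain{i}" for i in range(len(sequences))]
    return [
        (i, label, len(seq))
        for i, (label, seq) in enumerate(zip(labels, sequences))
    ]
```

With this numbering, pae_1_2 would be the peptide/TCRalpha value and pae_2_1 the value for the same pair in the opposite direction.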

If you look at issue #2 below you can see a function that reads these inter-chain PAE values and calculates the average PAE between the pMHC and the TCR:

https://github.com/phbradley/TCRdock/issues/2
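For readers who want a rough idea of what such a calculation could look like without following the link, here is a sketch. It is not the function from issue #2. It assumes that each row of the summary tsv has been read into a dict, that the inter-chain columns have names ending in 'pae_i_j' (possibly with a prefix), and that the TCR is the last two chains in 'target_chainseq'. The names `interchain_pae` and `pmhc_tcr_pae` are made up for this example.

```python
import re
from statistics import mean

# Assumption: inter-chain PAE columns end in "pae_<i>_<j>", with or without a prefix.
PAE_COLUMN = re.compile(r"(?:^|_)pae_(\d+)_(\d+)$")


def interchain_pae(row):
    """Collect {(i, j): value} from the columns whose names end in pae_i_j."""
    values = {}
    for name, value in row.items():
        match = PAE_COLUMN.search(name)
        if match and value not in ("", None):
            values[(int(match.group(1)), int(match.group(2)))] = float(value)
    return values


def pmhc_tcr_pae(row):
    """Average the pae_i_j values that link a pMHC chain and a TCR chain.

    Both directions (pMHC->TCR and TCR->pMHC) are included because the
    values are not symmetric. This is an unweighted mean over chain pairs,
    so it does not account for chain lengths.
    """
    num_chains = len(row["target_chainseq"].split("/"))
    tcr_chains = {num_chains - 2, num_chains - 1}  # assumed: TCR is the last two chains
    cross = [
        value
        for (i, j), value in interchain_pae(row).items()
        if (i in tcr_chains) != (j in tcr_chains)  # exactly one side is a TCR chain
    ]
    return mean(cross)  # raises StatisticsError if no matching columns are present
```

Rows in this shape can be obtained with `csv.DictReader(handle, delimiter="\t")` from the standard library, which yields string values that the sketch converts with `float`.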

Thank you very much! This is super helpful!