bjornwallner/DockQ

IndexError: list index out of range

Closed this issue · 4 comments

I’m attempting to compute the DockQ score of a model of CAPRI target #50 from the score_set dataset. The model is named Target50_0000.pdb and the correct crystal structure is named Target50_3r2x.pdb. Both are attached here (with a .txt extension added, since GitHub doesn't allow .pdb files to be uploaded):

Target50_0000.pdb.txt
Target50_3r2x.pdb.txt

Running

scripts/fix_numbering.pl /path/to/Target50_0000.pdb /path/to/Target50_3r2x.pdb

works fine, but running

python3 DockQ.py /path/to/Target50_0000.pdb.fixed /path/to/Target50_3r2x.pdb -native_chain1 A B -native_chain2 C -model_chain1 A B -model_chain2 C

results in the following error:

Traceback (most recent call last):
  File "/dartfs/rc/lab/G/Grigoryanlab/home/coy/DockQ/DockQ.py", line 732, in <module>
    main()    
  File "/dartfs/rc/lab/G/Grigoryanlab/home/coy/DockQ/DockQ.py", line 510, in main
    model_chains=get_pdb_chains(model)
  File "/dartfs/rc/lab/G/Grigoryanlab/home/coy/DockQ/DockQ.py", line 387, in get_pdb_chains
    pdb_struct = pdb_parser.get_structure("reference", pdb)[0]
  File "/dartfs-hpc/rc/home/4/f002v94/.conda/envs/myenv/lib/python3.9/site-packages/Bio/PDB/PDBParser.py", line 100, in get_structure
    self._parse(lines)
  File "/dartfs-hpc/rc/home/4/f002v94/.conda/envs/myenv/lib/python3.9/site-packages/Bio/PDB/PDBParser.py", line 123, in _parse
    self.trailer = self._parse_coordinates(coords_trailer)
  File "/dartfs-hpc/rc/home/4/f002v94/.conda/envs/myenv/lib/python3.9/site-packages/Bio/PDB/PDBParser.py", line 198, in _parse_coordinates
    resseq = int(line[22:26].split()[0])  # sequence identifier
IndexError: list index out of range

I got the same error when comparing the native and predicted structures:

python3 DockQ.py ./complex.1.pdb ./5j13_native.pdb -native_chain1 A B -model_chain1 A B -native_chain2 C -model_chain2 C > decoy1.log

Traceback (most recent call last):
  File "/home/randd/Desktop/Desktop_Office/October2023/ThirdWeek/dockq/DockQ/DockQ.py", line 732, in <module>
    main()    
  File "/home/path/to/dockq/DockQ/DockQ.py", line 648, in main
    info=calc_DockQ(model_fixed,native,use_CA_only)
  File "/home/path/to/dockq/DockQ/DockQ.py", line 137, in calc_DockQ
    sample_structure = pdb_parser.get_structure("model", model)
  File "/home/#####/.local/lib/python3.8/site-packages/Bio/PDB/PDBParser.py", line 100, in get_structure
    self._parse(lines)
  File "/home/#####/.local/lib/python3.8/site-packages/Bio/PDB/PDBParser.py", line 123, in _parse
    self.trailer = self._parse_coordinates(coords_trailer)
  File "/home/#####/.local/lib/python3.8/site-packages/Bio/PDB/PDBParser.py", line 198, in _parse_coordinates
    resseq = int(line[22:26].split()[0])  # sequence identifier
IndexError: list index out of range

I'm also experiencing this issue. The problem seems to be the file written by the renumbering step, which Biopython is unable to parse. For me the file looks like this:

ATOM   8808  HZ  PHE B9017X     34.056  40.265  41.472  1.00  0.00           H  
ATOM   8809  N   THR B          37.287  41.477  35.884  1.00  0.00           N  

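The second line has no resseq entry (the residue sequence number in columns 23-26), so line[22:26].split() returns an empty list in Biopython's parser and indexing it with [0] raises the IndexError.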
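To check whether a file is affected before handing it to DockQ, something like the following sketch can help. It scans ATOM/HETATM records for an empty residue-number field, using the same line[22:26] slice that appears in the traceback. The function name find_blank_resseq is hypothetical (it is not part of DockQ or Biopython), and the sample records are the two lines quoted above.

```python
def find_blank_resseq(lines):
    """Return (line_number, record) pairs for ATOM/HETATM records
    whose residue-number field (line[22:26]) is empty."""
    bad = []
    for number, line in enumerate(lines, start=1):
        # Only coordinate records carry a residue number.
        if line.startswith(("ATOM", "HETATM")) and not line[22:26].strip():
            bad.append((number, line.rstrip("\n")))
    return bad


# The two records quoted in this comment.
sample = [
    "ATOM   8808  HZ  PHE B9017X     34.056  40.265  41.472  1.00  0.00           H  ",
    "ATOM   8809  N   THR B          37.287  41.477  35.884  1.00  0.00           N  ",
]

bad_records = find_blank_resseq(sample)
for number, record in bad_records:
    print(f"line {number}: empty resseq field -> {record}")
```

For a real file, pass an open file handle instead of the list, for example `find_blank_resseq(open(path))` with `path` pointing at the renumbered model.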

Hi, check the newly released version of DockQ (v2.0). It works for me with your attached files. Note that I had to use the new --allowed_mismatches flag because the two structures don't have identical sequences:

DockQ ~/Downloads/Target50_0000.pdb.txt ~/Downloads/Target50_3r2x.pdb.txt --allowed_mismatches 4
****************************************************************
*                       DockQ                                  *
*   Scoring function for protein-protein docking models        *
*   Statistics on CAPRI data:                                  *
*    0.00 <= DockQ <  0.23 - Incorrect                         *
*    0.23 <= DockQ <  0.49 - Acceptable quality                *
*    0.49 <= DockQ <  0.80 - Medium quality                    *
*            DockQ >= 0.80 - High quality                      *
*   Ref: S. Basu and B. Wallner, DockQ: A quality measure for  *
*   protein-protein docking models                             *
*                            doi:10.1371/journal.pone.0161879  *
*   For comments, please email: bjorn.wallner@.liu.se          *
****************************************************************
Model  : /home/claudio/Downloads/Target50_0000.pdb.txt
Native : /home/claudio/Downloads/Target50_3r2x.pdb.txt
Total DockQ over 3 native interfaces: 0.977
Native chains: A, B
	Model chains: A, B
	DockQ_F1: 0.937
	DockQ: 0.950
	irms: 0.520
	Lrms: 0.883
	fnat: 0.969
Native chains: A, C
	Model chains: A, C
	DockQ_F1: 0.014
	DockQ: 0.014
	irms: 14.961
	Lrms: 47.828
	fnat: 0.000
Native chains: B, C
	Model chains: B, C
	DockQ_F1: 0.013
	DockQ: 0.013
	irms: 16.792
	Lrms: 47.373
	fnat: 0.000

Thanks so much! This update looks great; I appreciate the ability to match the sequence differences now!