Patch fix_partial_contigs when residue numbering in PDB has a gap
data2code opened this issue · 2 comments
data2code commented
In rf/utils.py, around line 78, should the three lines marked below be added to handle the case where the residue numbering in the original PDB file has a gap? Thanks!
if L > 0:
    new_contig.append(f"{L}-{L}")
    unseen = []
### in case residue numbering jumps
elif len(seen)>0 and seen[-1][1]!=i-1:
    new_contig.append(f"{seen[0][0]}{seen[0][1]}-{seen[-1][1]}")
    seen = []
###
seen.append([c,i])
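To show the idea in isolation, here is a minimal standalone sketch of gap-aware grouping. It is not the code from rf/utils.py: `group_runs` is a hypothetical helper, and the residue list is made up to mimic a chain A that is missing residues 8 and 9.

```python
def group_runs(pdb_idx):
    """Group (chain, resnum) pairs into contig strings.

    A run is closed whenever the chain changes or the residue
    number is not exactly one more than the previous residue.
    """
    contigs = []
    run = []
    for chain, num in pdb_idx:
        # close the current run on a chain change or a numbering jump
        if run and (run[-1][0] != chain or run[-1][1] != num - 1):
            contigs.append(f"{run[0][0]}{run[0][1]}-{run[-1][1]}")
            run = []
        run.append((chain, num))
    # flush the final run
    if run:
        contigs.append(f"{run[0][0]}{run[0][1]}-{run[-1][1]}")
    return contigs


# hypothetical residue list: chain A lacks residues 8 and 9, chain E has 45-46
pdb_idx = (
    [("A", i) for i in range(1, 8)]
    + [("A", i) for i in range(10, 45)]
    + [("E", 45), ("E", 46)]
)
print(group_runs(pdb_idx))  # ['A1-7', 'A10-44', 'E45-46']
```

The check `run[-1][1] != num - 1` plays the same role as `seen[-1][1]!=i-1` in the proposed patch: without it, residues 7 and 10 would end up in the same range.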
sokrypton commented
Thanks! Do you have an example input where this change is needed?
data2code commented
With the toy example below, fix_partial_contigs returns A1-44 for chain A, even though the input contig is A1-7/A10-44. This in turn leads to incorrect output from fix_pdb: chain E should have been renamed to B, but it is renamed to A by mistake.
If you change fix_partial_contigs to fix_contigs, the behavior is correct.
from inference.utils import parse_pdb
from colabdesign.rf.utils import fix_contigs, fix_partial_contigs, fix_pdb

parsed_pdb = parse_pdb("1crn.pdb")
with open("1crn.pdb") as f:
    pdb_str = f.read()

# chain A is given as two segments (1-7 and 10-44); chain E is given whole
contigs = fix_partial_contigs(["A1-7/A10-44", "E"], parsed_pdb)
print(contigs)
# show the last six lines of the fixed PDB
print("\n".join(fix_pdb(pdb_str, contigs).split("\n")[-6:]))
output:
['A1-44', 'E45-46']
ATOM 323 CB ASN A 44 12.266 4.769 13.501 1.00 7.27 A C
ATOM 324 CG ASN A 44 12.538 4.304 14.922 1.00 7.98 A C
ATOM 325 ND2 ASN A 44 13.407 3.298 15.015 1.00 10.32 A N
ATOM 326 OD1 ASN A 44 11.982 4.849 15.886 1.00 11.00 A O
ATOM 327 OXT ASN A 44 12.703 4.973 10.746 1.00 7.86 A O1-
TER
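One way to see why the merged contig is a problem is to compare the number of positions it implies with the number of residues actually present. The sketch below assumes that residues 8 and 9 are absent from chain A in the toy file (as the input contig suggests). `contig_span` is a hypothetical helper, and the sketch makes no claim about how fix_pdb uses these lengths internally.

```python
def contig_span(segment):
    """Number of positions implied by a segment such as 'A1-44'."""
    start, end = segment[1:].split("-")
    return int(end) - int(start) + 1


# assumed: chain A contains residues 1-7 and 10-44 only
present = [i for i in range(1, 45) if i not in (8, 9)]

print(contig_span("A1-44"))                          # 44
print(len(present))                                  # 42
print(contig_span("A1-7") + contig_span("A10-44"))   # 42
```

Under that assumption, A1-44 claims two more positions than chain A has, which would be consistent with the two chain E residues being labelled as chain A in the output above.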