read_mol2 fails on normal-looking file
Closed this issue · 2 comments
Describe the bug
Steps/Code to Reproduce
from biopandas.mol2 import PandasMol2
PandasMol2().read_mol2('file3_mod.mol2')
Expected Results
I expect biopandas to properly interpret the following mol2 file (uploaded with .txt extension for compatibility with markdown):
Actual Results
See screenshots of the traceback:
Versions
biopandas 0.4.1
Linux-5.15.109+-x86_64-with-glibc2.35
Python 3.10.6 (main, May 29 2023, 11:10:38) [GCC 11.3.0]
Scikit-learn 1.3.0
NumPy 1.22.4
SciPy 1.11.1
Hi @RafiBrent, sorry to hear you’re experiencing a bug. Unfortunately, I don’t believe we have the capacity as maintainers to resolve this right now. I hope either you or someone from the community can contribute a fix.
I think I found the problem. There is an empty line between your ATOM and BOND blocks. In the code:
for idx, s in enumerate(mol2_lst):
    if s.startswith("@<TRIPOS>ATOM"):
        first_idx = idx + 1
        started = True
    elif started and s.startswith("@<TRIPOS>"):
        last_idx_plus1 = idx
        break
The empty line is also counted as an ATOM line. If you have a single file, just deleting it should work. If you have many files, you can either try my pdbx2df and use the read_mol2 function or wait for my PR for this repo.
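In the meantime, here is a minimal sketch of the idea for anyone with many files. It repeats the loop quoted above in a standalone helper and then drops blank lines only inside the ATOM block, leaving the rest of the file alone. The names atom_block_bounds and strip_blank_atom_lines and the sample records are hypothetical, made up for illustration. This is not the biopandas implementation and not necessarily what the PR will do.

```python
from typing import List, Tuple


def atom_block_bounds(mol2_lst: List[str]) -> Tuple[int, int]:
    """Return (first_idx, last_idx_plus1) of the ATOM block.

    Same scan as the loop quoted above, wrapped in a hypothetical helper.
    """
    first_idx, last_idx_plus1 = 0, len(mol2_lst)
    started = False
    for idx, s in enumerate(mol2_lst):
        if s.startswith("@<TRIPOS>ATOM"):
            first_idx = idx + 1
            started = True
        elif started and s.startswith("@<TRIPOS>"):
            last_idx_plus1 = idx
            break
    return first_idx, last_idx_plus1


def strip_blank_atom_lines(mol2_lst: List[str]) -> List[str]:
    """Drop whitespace-only lines that fall inside the ATOM block.

    Lines outside the ATOM block are returned unchanged.
    """
    first, last = atom_block_bounds(mol2_lst)
    atoms = [s for s in mol2_lst[first:last] if s.strip()]
    return mol2_lst[:first] + atoms + mol2_lst[last:]


# Made-up sample with a blank line between the ATOM and BOND blocks.
sample = [
    "@<TRIPOS>MOLECULE\n",
    "example\n",
    "@<TRIPOS>ATOM\n",
    "      1 C1   0.0 0.0 0.0 C.3 1 LIG 0.0\n",
    "      2 C2   1.5 0.0 0.0 C.3 1 LIG 0.0\n",
    "\n",
    "@<TRIPOS>BOND\n",
    "     1 1 2 1\n",
]

# Before cleaning, the blank line ends up in the ATOM slice.
first, last = atom_block_bounds(sample)
raw_atoms = sample[first:last]

# After cleaning, only the two atom records remain in the slice.
cleaned = strip_blank_atom_lines(sample)
first, last = atom_block_bounds(cleaned)
clean_atoms = cleaned[first:last]
```

To apply this to real files, you could read each file with readlines(), pass the list through strip_blank_atom_lines, and write the result to a new file before calling read_mol2 on it.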