read_mol2 fails on normal-looking file

Question

Closed this issue a year ago · 2 comments

Describe the bug

Steps/Code to Reproduce

from biopandas.mol2 import PandasMol2
PandasMol2().read_mol2('file3_mod.mol2')

Expected Results

I expect biopandas to properly interpret the following mol2 file (uploaded with .txt extension for compatibility with markdown):

file3_mod.txt

Actual Results

See screenshots of the traceback:

Versions

biopandas 0.4.1
Linux-5.15.109+-x86_64-with-glibc2.35
Python 3.10.6 (main, May 29 2023, 11:10:38) [GCC 11.3.0]
Scikit-learn 1.3.0
NumPy 1.22.4
SciPy 1.11.1

Answer 1 · 2023-07-30T07:22:22.000Z

Hi @RafiBrent, sorry to hear you’re experiencing a bug. Unfortunately, I don’t believe we have the capacity as maintainers to resolve this right now. I hope that either you or someone from the community can contribute a fix on this occasion.

Answer 2 · 2023-09-19T15:39:12.000Z

I think I found the problem. There is an empty line between your ATOM and BOND blocks. In the code:

for idx, s in enumerate(mol2_lst):
    if s.startswith("@<TRIPOS>ATOM"):
        first_idx = idx + 1
        started = True
    elif started and s.startswith("@<TRIPOS>"):
        last_idx_plus1 = idx
        break

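This loop treats every line between the @<TRIPOS>ATOM header and the next @<TRIPOS> record as an atom line, so the empty line is also counted as an ATOM line. If you have a single file, just deleting that line should work. If you have many files, you can either try my pdbx2df and use the read_mol2 function, or wait for my PR for this repo.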
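For anyone who wants to see the idea in isolation, here is a minimal standalone sketch of the same loop with blank lines filtered out afterwards. It is not the biopandas implementation or the contents of the pending PR; the function name get_atom_section and the sample records are hypothetical and only illustrate the approach.

```python
def get_atom_section(mol2_lst):
    """Return the non-blank lines between @<TRIPOS>ATOM and the next record.

    Hypothetical helper for illustration; not part of biopandas.
    """
    started = False
    first_idx = None
    # Fall back to the end of the input if no later @<TRIPOS> record exists.
    last_idx_plus1 = len(mol2_lst)
    for idx, s in enumerate(mol2_lst):
        if s.startswith("@<TRIPOS>ATOM"):
            first_idx = idx + 1
            started = True
        elif started and s.startswith("@<TRIPOS>"):
            last_idx_plus1 = idx
            break
    if first_idx is None:
        raise ValueError("No @<TRIPOS>ATOM record found")
    # Drop empty or whitespace-only lines so they are not parsed as atoms.
    return [s for s in mol2_lst[first_idx:last_idx_plus1] if s.strip()]


# Made-up sample with a blank line between the ATOM and BOND blocks.
example = [
    "@<TRIPOS>MOLECULE\n",
    "example\n",
    "@<TRIPOS>ATOM\n",
    "1 C1 0.0 0.0 0.0 C.3 1 LIG 0.0\n",
    "2 C2 1.5 0.0 0.0 C.3 1 LIG 0.0\n",
    "\n",
    "@<TRIPOS>BOND\n",
    "1 1 2 1\n",
]
atom_lines = get_atom_section(example)
print(len(atom_lines))
```

Filtering after slicing keeps the original index logic untouched, so the only behavioural change is that whitespace-only lines never reach the atom parser.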