Error handling when reading wrong file formats
dominiquesydow opened this issue · 3 comments
Describe the workflow you want to enable
Thanks again for your work on biopandas!
I have a small comment on the error handling when loading pdb files with read_mol2 (or mol2 files with read_pdb).
The current behavior looks like this:
mol2 module
from biopandas.mol2 import PandasMol2
pmol = PandasMol2()
pmol.read_mol2("xxxx.pdb")
Example output:
UnboundLocalError: local variable 'first_idx' referenced before assignment
(might look different depending on the input file and file format).
pdb module
from biopandas.pdb import PandasPdb
ppdb = PandasPdb()
ppdb.read_pdb("xxxx.mol2")
Example output: All data is loaded into the dict key "OTHER" (might look different depending on the input file and file format).
Describe your proposed solution
Would you consider adding a check for the correct input and throwing a descriptive error message?
I am using a ValueError at the moment, but I am sure there are nicer ways to handle this:
https://github.com/volkamerlab/opencadd/blob/912d4e98e89edf38707249fd4f034cea136e1932/opencadd/io/dataframe.py#L202
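As a rough illustration of the kind of check meant here, the sketch below sniffs the input text before it is handed to a reader. It is not biopandas code: the function names `sniff_structure_format` and `require_format` are hypothetical, and the line-prefix heuristic (`@<TRIPOS>` for mol2, `ATOM`/`HETATM` for pdb) is an assumption that will not cover every valid file.

```python
def sniff_structure_format(text):
    """Guess 'mol2' or 'pdb' from simple line prefixes; return None if unsure.

    Heuristic only (an assumption for this sketch): mol2 files contain
    '@<TRIPOS>' record markers, pdb files contain ATOM/HETATM records.
    """
    for line in text.splitlines():
        if line.startswith("@<TRIPOS>"):
            return "mol2"
        if line.startswith(("ATOM", "HETATM")):
            return "pdb"
    return None


def require_format(text, expected):
    """Raise a descriptive ValueError if the text does not look like `expected`."""
    found = sniff_structure_format(text)
    if found != expected:
        raise ValueError(
            f"Expected {expected} input, but the text looks like "
            f"{found or 'an unknown format'}."
        )
```

A caller could run `require_format(text, "mol2")` before `read_mol2`, so that a pdb file passed by mistake fails early with a readable message.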
This issue is not urgent at all.
It would simply make it easier and less verbose to use biopandas in other packages where we try to catch common user mistakes.
Thank you again for your time and work!
Describe alternatives you've considered, if relevant
None.
Additional context
None.
Thanks for the feedback! I hadn't thought of user errors like that (yet), and I like the suggestion of raising a more descriptive ValueError, e.g. "No structural data could be loaded. Is the input text in mol2 format?". I'd appreciate a PR if you have time at some point.
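For anyone picking up the PR, here is a minimal sketch of how such a guard might look on the mol2 side. The helper `get_atom_section` is hypothetical and is not taken from the biopandas source. It assumes the atom block sits between an `@<TRIPOS>ATOM` line and the next `@<TRIPOS>` record, and that the reported UnboundLocalError comes from an index that is only assigned once that marker is seen.

```python
def get_atom_section(mol2_lines):
    """Return the lines between '@<TRIPOS>ATOM' and the next '@<TRIPOS>' record.

    Hypothetical helper for illustration; raises ValueError with the message
    suggested in this thread when no atom section is present.
    """
    first_idx = None  # stays None if the ATOM marker is never found
    last_idx = len(mol2_lines)
    for idx, line in enumerate(mol2_lines):
        if line.startswith("@<TRIPOS>ATOM"):
            first_idx = idx + 1
        elif first_idx is not None and line.startswith("@<TRIPOS>"):
            last_idx = idx
            break
    if first_idx is None:
        raise ValueError(
            "No structural data could be loaded. "
            "Is the input text in mol2 format?"
        )
    return mol2_lines[first_idx:last_idx]
```

Initializing `first_idx` to `None` and checking it explicitly turns the confusing UnboundLocalError into an error the user can act on.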