OBO parsing fails with a misleading error message
mukund109 opened this issue · 1 comments
mukund109 commented
When parsing `cellosaurus.obo` downloaded from here, I get this error.
The error is fixed when I manually remove:
- lines starting with `!` (lines 27-48)
- the line with date information, `date: 05:20:2021 12:00` (line 3)
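A minimal sketch of that manual workaround, in case it helps reproduce the issue. The function names, the file paths in the usage note and the UTF-8 encoding are assumptions for illustration, not part of `pronto`. It only drops `!` lines and the header `date:` line, so it does nothing about any other syntax problems the file may have.

```python
def strip_problem_lines(lines):
    """Yield lines, skipping '!' comment lines and the header 'date:' line."""
    in_header = True
    for line in lines:
        stripped = line.lstrip()
        # The first stanza (e.g. "[Term]") marks the end of the header frame.
        if stripped.startswith("["):
            in_header = False
        # Drop lines starting with "!".
        if stripped.startswith("!"):
            continue
        # Drop the date line, but only while still in the header.
        if in_header and stripped.startswith("date:"):
            continue
        yield line


def clean_file(src, dst, encoding="utf-8"):
    """Write a filtered copy of `src` to `dst` (hypothetical helper; encoding is assumed)."""
    with open(src, encoding=encoding) as fin, open(dst, "w", encoding=encoding) as fout:
        fout.writelines(strip_problem_lines(fin))
```

Usage would be something like `clean_file("cellosaurus.obo", "cellosaurus.cleaned.obo")`, then passing the cleaned path to `pronto.Ontology`.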
althonos commented
Hi @mukund109 , three things:
- concerning the top error message, this seems to be an IPython thing, because when I run it with the base Python shell I actually get the location of the syntax error in the OBO file:
>>> import pronto
>>> pronto.Ontology("cellosaurus.obo")
  File "<stdin>", line 3
    date: 05:20:2021 12:00␊
             ^
SyntaxError: expected NaiveMonth

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/althonos/.local/lib/python3.9/site-packages/pronto/ontology.py", line 283, in __init__
    cls(self).parse_from(_handle)  # type: ignore
  File "/home/althonos/.local/lib/python3.9/site-packages/pronto/parsers/obo.py", line 18, in parse_from
    doc = fastobo.iter(handle, ordered=True)
TypeError: expected path or binary file handle
- concerning the error type (the fact that you get a `TypeError` where you would expect the `SyntaxError` to be raised directly), this is a bug in `fastobo-py`; I'm fixing it ATM.
- concerning the `cellosaurus.obo` file, it cannot be parsed because it is not compatible with the OBO format version 1.4: it contains many syntax errors (that `pronto` cannot ignore by design). If you are in contact with the ExPASy people, you can try reporting it to them directly; otherwise I'll send them an email to see what we can do about it. Most of the issues seem to be concentrated in the Xref IDs, so that shouldn't be too hard to fix.