Issue parsing nested equation / scalebox / cases + need for more detailed error messages
tvercaut opened this issue · 1 comments
tvercaut commented
I am trying to extract the title, author and abstract of a number of LaTeX files. TexSoup has already proven very useful for this purpose.
While doing so, however, I stumbled on an issue parsing a complex nested expression involving an equation, a scalebox and a cases environment. Below is a small test case to reproduce it:
#!/usr/bin/env python3
from TexSoup import TexSoup
tex_doc = r"""
\documentclass{article}
\usepackage{graphicx}
\begin{document}
\begin{equation}
\scalebox{2.0}{$x =
\begin{cases}
1, & \text{if } y=1 \\
0, & \text{otherwise}
\end{cases}$}
\end{equation}
\end{document}
"""
soup = TexSoup(tex_doc)
print(list(soup))
It took me a while to find the offending code, as the error message only said:
EOFError: Expecting $. Reached end of file.
For my use case, the following things would have been very useful:
- allow TexSoup to ignore parsing errors and continue (as I would assume the title, abstract and authors should already have been parsed correctly when this error was encountered)
- provide a more detailed error message including, for example, the location of the start of the offending expression
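Until parse errors can be skipped, one possible stopgap for the first point is to fall back to plain string scanning when parsing fails. The sketch below is not part of TexSoup: `extract_braced` is a hypothetical helper that returns the brace-balanced argument of the first `\command{...}` it finds. It assumes no optional `[...]` argument and no `%` comments inside the argument, and it does not handle environments such as `abstract`.

```python
import re


def extract_braced(tex, command):
    r"""Return the brace-balanced argument of the first \command{...}, or None.

    Hypothetical fallback based on plain string scanning, not on TexSoup.
    """
    match = re.search(r"\\" + re.escape(command) + r"\s*\{", tex)
    if match is None:
        return None
    depth = 1            # we are just past the opening brace
    start = match.end()
    i = start
    while i < len(tex):
        ch = tex[i]
        if ch == "\\":
            i += 2       # skip the escaped character, e.g. \{ or \}
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return tex[start:i]
        i += 1
    return None          # braces never balanced


doc = r"\title{A {nested} title}\begin{document}$x$\end{document}"
title = extract_braced(doc, "title")
```

In practice this could sit in the `except EOFError:` branch around the `TexSoup(tex_doc)` call from the test case, so that the title and author are still recovered when the body fails to parse.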
alvinwan commented
Thanks for the feedback! A fault-tolerant flag will be merged soon. I have also amended several of the most common parse errors to be more informative (including line number and offset).
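For readers wondering how a character offset relates to a line number, here is a minimal sketch. It is an illustration only and not TexSoup's actual implementation: `locate` is a hypothetical helper that turns a 0-based offset into a 1-based line and column.

```python
def locate(text, offset):
    """Convert a 0-based character offset into a 1-based (line, column) pair."""
    if not 0 <= offset <= len(text):
        raise ValueError("offset out of range")
    # Lines are counted by the newlines that occur before the offset.
    line = text.count("\n", 0, offset) + 1
    # rfind returns -1 on the first line, which makes the column 1-based.
    last_newline = text.rfind("\n", 0, offset)
    column = offset - last_newline
    return line, column


sample = "ab\ncd$e"
position = locate(sample, sample.index("$"))
```

An error message could then report `position` alongside the text, for example the line and column where an unterminated `$` begins.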