Qix-/better-exceptions

Multiline statements/expressions and context

I find it annoying when a traceback shows only a single line for a frame even though that frame may be executing a multiline expression or statement, so that single line leaves out critical context. As far as I can see, better_exceptions doesn't address this. Here's an example:

import better_exceptions

better_exceptions.hook()


def div(x, y):
    return y / x


a = 1
b = 0
div(
    b,
    a,
)

Output:

Traceback (most recent call last):
  File "/Users/alexhall/Library/Preferences/PyCharm2018.3/scratches/scratch_400.py", line 14, in <module>
    a,
    └ 1
  File "/Users/alexhall/Library/Preferences/PyCharm2018.3/scratches/scratch_400.py", line 7, in div
    return y / x
           │   └ 0
           └ 1
ZeroDivisionError: division by zero

In particular, the top frame shows neither the function div nor the argument b (which causes the exception). For a multiline function call, the traceback generally seems to show the last line containing significant tokens (i.e. not the line holding only the closing parenthesis), which is often not what the user wants.

There was a discussion about solving this by showing all the lines of a multiline statement in the traceback: https://bugs.python.org/issue12458. It doesn't look like that's going to happen anytime soon, but a change has been merged that shows the first line instead of the last, so in Python 3.8 the example above shows div( (I just tested it).

Some tools such as IPython (https://github.com/Qix-/better-exceptions/issues/10#issuecomment-513497696) and Sentry show a few lines of context before and after the line that the traceback points to. This context may be useful in its own right and some users may want something like that. But I don't know how useful it is and it makes tracebacks significantly longer. I don't think that should be the default. And I suspect it's partly done as a crude solution to the multiline statement problem.

It's easy to parse the source file and find the statement node containing a given line. I've done it in other projects. We'd need to think a bit about what to display for compound statements (e.g. if statements) but overall I think it's very doable. Then we can show the whole statement being executed without unwanted additional context. In most cases it would still be just one line.
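To make the idea concrete, here is a minimal sketch of looking up the statement that contains a given line. It assumes Python 3.8+ (for `end_lineno` on AST nodes); the helper names `statement_at_line` and `statement_text` are made up for illustration and are not part of better_exceptions.

```python
import ast


def statement_at_line(source, lineno):
    """Return the innermost statement node whose line span includes lineno, or None."""
    candidates = [
        node
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.stmt) and node.lineno <= lineno <= node.end_lineno
    ]
    if not candidates:
        return None
    # Prefer the statement spanning the fewest lines; on ties, the one starting
    # further right, which is the more deeply nested one (e.g. `if x: y`).
    return min(candidates, key=lambda n: (n.end_lineno - n.lineno, -n.col_offset))


def statement_text(source, lineno):
    """Return the full source text of the statement covering lineno, or None."""
    node = statement_at_line(source, lineno)
    if node is None:
        return None
    return "\n".join(source.splitlines()[node.lineno - 1:node.end_lineno])


SOURCE = """\
def div(x, y):
    return y / x


a = 1
b = 0
div(
    b,
    a,
)
"""

# Line 9 is the `a,` line, i.e. the kind of line the old traceback pointed at.
print(statement_text(SOURCE, 9))
```

For a line inside the body of a compound statement this returns the inner statement, but for the header line of an if or for it returns the whole block, which is exactly the case that still needs a display decision.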

The obvious problem is how to format the display of variables when showing multiple lines of source code. I'd like to brainstorm this issue. Some thoughts:

  1. The current system of labels with vertical lines looks great. I think it should still be used where possible, even if that introduces inconsistency and some frames are displayed differently from others.
  2. For statements with two lines, the current system can still work by showing variables both above and below the source lines, i.e. some label lines point up and others down, and the whole thing would look somewhat symmetric. The problem really arises with 3 or more lines.
  3. Theoretically we could just insert the current variable labels below each line, so the variables and source lines would be interleaved. This is probably a bad idea as it would make the source much less readable.
  4. The simplest option is to just list the variables below the source lines.
    1. https://github.com/cknd/stackprinter enhances this method slightly by highlighting each variable in the source and in the list with a unique colour instead of using syntax highlighting. I don't know if this actually improves readability.
    2. It may be useful to (optionally?) show all the local variables in a frame and not just the ones directly relevant to a line or statement. This is probably the only suitable way to do that.
  5. Variables could be shown at the end of a line. This is what the PyCharm debugger does:
    [Screenshot: PyCharm debugger showing variable values at the end of each line]
    However horizontal space runs out fast.
  6. Similar to point 1, a single frame could have multiple systems for displaying variables, e.g. a mix of 2, 4, and 5. But that would probably be confusing and a bad idea.
  7. One day we may want to display the value of expressions like a[b] where evaluating that is safe, e.g. if type(a) is dict. The current system of labels would not work well for labelling all three of a, b, and a[b].
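As a rough illustration of point 7, the check for whether a[b] is safe to evaluate could look something like this. It is only a sketch under my own assumptions about what counts as "safe" (exact built-in container and key types); `safe_getitem` is a hypothetical helper, not an existing API.

```python
# Exact built-in types only: subclasses may override __getitem__ or __hash__.
_SAFE_DICT_KEYS = (int, str, bytes, bool, float, type(None))
_SAFE_SEQUENCES = (list, tuple, str, bytes)


def safe_getitem(container, key):
    """Return (True, container[key]) if the lookup looks safe, else (False, None)."""
    if type(container) is dict:
        if type(key) not in _SAFE_DICT_KEYS:
            return False, None
    elif type(container) in _SAFE_SEQUENCES:
        if type(key) is not int:
            return False, None
    else:
        # Unknown type: __getitem__ could run arbitrary user code.
        return False, None
    try:
        return True, container[key]
    except (KeyError, IndexError):
        # The expression itself would raise, so there is no value to show.
        return False, None


print(safe_getitem({"k": 1}, "k"))
print(safe_getitem([10, 20], 1))
```

This is not a proof of safety: a plain dict holding exotic key objects can still reach a user-defined `__eq__` on a hash collision, so a real implementation would have to decide how paranoid to be.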

For the terminal I think it is a tough decision. One of the reasons I want to separate generating a data structure that contains the relevant information from rendering it is that in Jupyter Notebook, for example, we use HTML to have 1) collapsible elements and 2) values as a tooltip on hover.
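A small sketch of what that separation might look like: one plain data structure per frame, and two independent renderers on top of it. Everything here (`FrameData`, `render_text`, `render_html`) is hypothetical and only meant to show the shape of the idea, not how IPython or better_exceptions actually do it.

```python
import re
from dataclasses import dataclass
from html import escape

_IDENTIFIER = re.compile(r"([A-Za-z_]\w*)")


@dataclass
class FrameData:
    filename: str
    function: str
    lines: list      # list of (lineno, source text) pairs
    variables: dict  # variable name -> repr string


def render_text(frame):
    """Terminal rendering: source lines, then the variables listed below."""
    out = ['File "%s", in %s' % (frame.filename, frame.function)]
    for lineno, text in frame.lines:
        out.append("%4d | %s" % (lineno, text))
    for name, value in frame.variables.items():
        out.append("    %s = %s" % (name, value))
    return "\n".join(out)


def render_html(frame):
    """HTML rendering: a collapsible frame with values as hover tooltips."""
    def annotate(text):
        pieces = []
        # Naive: this also matches identifiers inside strings and comments.
        for piece in _IDENTIFIER.split(text):
            if piece in frame.variables:
                value = escape(frame.variables[piece])
                pieces.append('<span title="%s">%s</span>' % (value, piece))
            else:
                pieces.append(escape(piece))
        return "".join(pieces)

    body = "\n".join(annotate(text) for _, text in frame.lines)
    summary = escape("%s in %s" % (frame.filename, frame.function))
    return "<details open><summary>%s</summary><pre>%s</pre></details>" % (summary, body)


frame = FrameData("example.py", "div", [(7, "    return y / x")], {"x": "0", "y": "1"})
print(render_text(frame))
print(render_html(frame))
```

The point is only that both renderers consume the same `FrameData`, so the terminal layout question can be decided without affecting the notebook output.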

I also find the context considerably more useful than the values themselves, not only for multiline statements but also for understanding what the author of the code meant without having to open the file.

cknd commented

Hello! I wanted to leave a +1 for multiple lines of code context and for variables set apart from the code. What motivated me to write stackprinter despite knowing about better-exceptions was that I liked IPython's multiline code contexts but also wanted to see all the variables from that context. However, I found it hard to skim or recognize code at a glance when the source lines were interleaved with variable values, especially if the values themselves are multi-line, like pretty-printed dicts or numpy arrays. So I felt the optimum (for terminals) is, after all, to separate the two. Colors may help tie them back together a bit, but in practice it still seems to work well even in plain text. (I think the main usability secret might be to show only the variables that occur in the visible source context, at least by default.)
[Screenshot: stackprinter terminal output]
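The "only variables that occur in the visible source" rule can be sketched roughly as follows. This is not stackprinter's actual code; the function names are invented for the example, and it assumes the visible source tokenizes cleanly.

```python
import io
import keyword
import tokenize


def names_in_source(source):
    """Return the distinct names used in source, in order of first appearance."""
    names = []
    previous = None
    try:
        for tok in tokenize.generate_tokens(io.StringIO(source).readline):
            if (
                tok.type == tokenize.NAME
                and not keyword.iskeyword(tok.string)
                and previous != "."  # skip attribute names such as `obj.attr`
                and tok.string not in names
            ):
                names.append(tok.string)
            previous = tok.string
    except (tokenize.TokenError, SyntaxError):
        # The visible context may be cut off mid-statement; keep what we found.
        pass
    return names


def visible_variables(source, local_vars, global_vars=None):
    """Pick out only those variables whose names occur in the visible source."""
    global_vars = global_vars or {}
    found = {}
    for name in names_in_source(source):
        if name in local_vars:
            found[name] = local_vars[name]
        elif name in global_vars:
            found[name] = global_vars[name]
    return found


visible = "div(\n    b,\n    a,\n)\n"
print(visible_variables(visible, {"a": 1, "b": 0, "unused": 99}))
```

In a real traceback the two dictionaries would come from the frame's `f_locals` and `f_globals`.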

@Carreau Something I had long planned was to move this frame inspection module into its own package, so that arbitrary formatters could sit on top (for terminals, or HTML for notebooks). Would that be interesting at all?

The module takes a frame and returns a tokenized map of the code (one that knows where each variable occurs), together with the variable values (plus some minor niceties like dot lookup). That makes it possible to re-render the original source character by character, but with arbitrary annotations on the variable names. It currently feeds into the terminal color thing in the screenshot above, but of course once you have an annotated version of the source it's also straightforward to e.g. generate a source listing in the browser where names in the code can be clicked or hovered to see their values, etc.
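A stripped-down sketch of that kind of re-rendering, using only the standard tokenize module. It is an illustration of the general technique, not the module's real interface; `annotate_source` and its `mark` callback are names invented for this example.

```python
import io
import tokenize


def annotate_source(source, values, mark):
    """Re-render source, replacing each known name with mark(name, value)."""
    # Token map: line number -> NAME tokens on that line that have a known value.
    tokens_by_line = {}
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and tok.string in values:
            tokens_by_line.setdefault(tok.start[0], []).append(tok)

    rendered = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        pieces = []
        cursor = 0
        for tok in tokens_by_line.get(lineno, []):
            start_col, end_col = tok.start[1], tok.end[1]
            pieces.append(line[cursor:start_col])  # untouched source before the name
            pieces.append(mark(tok.string, values[tok.string]))
            cursor = end_col
        pieces.append(line[cursor:])  # untouched remainder of the line
        rendered.append("".join(pieces))
    return "\n".join(rendered)


def inline(name, value):
    """Plain-text annotation; a terminal formatter could add colour codes here."""
    return "%s[=%r]" % (name, value)


print(annotate_source("div(\n    b,\n    a,\n)", {"a": 1, "b": 0}, inline))
```

Swapping `inline` for a callback that emits a `<span>` with a title attribute would give the hover-for-value listing in a browser, with the tokenizing step unchanged.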

I'd be very happy if anyone could put source-code-with-debug-values into Jupyter. I just wanted to offer this since I have it sitting around and would be happy to help push it along.

@Carreau I've started work on a library https://github.com/alexmojaki/stack_data which focuses just on the data aspect of tracebacks. I'm currently integrating it into stackprinter in cknd/stackprinter#23 because it's an easier place to get started. IPython/Jupyter will be next, followed by better-exceptions if possible.