thierry-martinez/pyml

Enhanced tracebacks with python 3.11

Closed this issue · 6 comments

We are building all Fedora packages with the current beta release of python 3.11, to identify problems before the 3.11 release. The pyml package fails a test:

Starting tests...
Test 'version' ... Python version 3.11.0b4 (main, Jul 22 2022, 00:00:00) [GCC 12.1.1 20220628 (Red Hat 12.1.1-3)]
passed
Test 'library version' ... Python library version 3.11.0b4 (main, Jul 22 2022, 00:00:00) [GCC 12.1.1 20220628 (Red Hat 12.1.1-3)]
passed
Test 'hello world' ... passed
Test 'class' ... passed
Test 'empty tuple' ... passed
Test 'make tuple' ... passed
Test 'module get/set/remove' ... passed
Test 'capsule' ... passed
Test 'capsule-conversion-error' ... passed
Test 'exception' ... passed
Test 'ocaml exception' ... passed
Test 'ocaml exception with traceback' ... Traceback (most recent call last):
  File "<string>", line 6, in <module>
  File "file2.ml", line 2, in func2
  File "file1.ml", line 1, in func1
Exception: Great

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib64/python3.11/traceback.py", line 353, in _walk_tb_with_full_positions
    positions = _get_code_position(tb.tb_frame.f_code, tb.tb_lasti)
  File "/usr/lib64/python3.11/traceback.py", line 367, in _get_code_position
    return next(itertools.islice(positions_gen, instruction_index // 2, None))
StopIteration

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<string>", line 10, in <module>
  File "/usr/lib64/python3.11/traceback.py", line 74, in extract_tb
    return StackSummary._extract_from_extended_frame_gen(
  File "/usr/lib64/python3.11/traceback.py", line 416, in _extract_from_extended_frame_gen
    for f, (lineno, end_lineno, colno, end_colno) in frame_gen:
RuntimeError: generator raised StopIteration
raised an exception: File "pyml_tests.ml", line 181, characters 6-12: Assertion failed
Test 'restore with null' ... passed
Test 'ocaml other exception' ... passed
Test 'run file with filename' ... XXX lineno: 1, opcode: 151

The failure may be due to a change in the traceback format: https://docs.python.org/3.11/whatsnew/3.11.html#enhanced-error-locations-in-tracebacks.

Indeed, changing line 190 of pyml_tests.ml from:

        filenames = [f.filename for f in traceback.extract_tb(err.__traceback__)]

to:

        filenames = [f.filename for f in traceback.StackSummary.extract(traceback.walk_tb(err.__traceback__))]

gets the test to pass.
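For illustration, here is a minimal standalone sketch of the replacement expression, outside the pyml test suite. The helper names `filenames_from_traceback`, `inner`, `outer`, and `demo` are hypothetical and exist only for this example. It does not reproduce the OCaml-generated frames from the failing test, only the way the file names are collected.

```python
import traceback


def filenames_from_traceback(err):
    """Return the file name recorded for each frame in err's traceback."""
    # walk_tb yields (frame, lineno) pairs for the traceback chain;
    # StackSummary.extract turns those pairs into FrameSummary objects.
    frames = traceback.walk_tb(err.__traceback__)
    return [f.filename for f in traceback.StackSummary.extract(frames)]


def inner():
    # Hypothetical stand-in for the function that raises in the test.
    raise Exception("Great")


def outer():
    inner()


def demo():
    try:
        outer()
    except Exception as err:
        # One entry per frame: demo, outer, inner.
        return filenames_from_traceback(err)


if __name__ == "__main__":
    print(demo())
```

Judging from the traceback above, the failure occurs in the position lookup that `extract_tb` performs through `_walk_tb_with_full_positions`. Passing plain `(frame, lineno)` pairs from `walk_tb` appears to sidestep that lookup, which would explain why the test passes with this change.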

Some later tests were failing in weird ways. It turns out that the "ocaml other exception" test is to blame. When the OCaml exception is raised in that test, none of the Python cleanup code is called. We are left with _PyRuntime.gilstate.tstate_current->cframe pointing to a frame on the stack. When the next test runs, it overwrites the frame structure with whatever it pushes onto the stack, leading to very weird failures down the road.

Thank you, @jamesjer, for your report and your analysis, and sorry for the delay! I think I finally fixed this in f682f97: the Python 3.11 interpreter didn't like being interrupted by an OCaml exception.

Great, I will give it a try. Thank you!

That commit works great. I did some comparisons with the python 3.11 header files, and wonder if any of these should be addressed as well:

  • Starting in python 3.8, PyCompilerFlags has a second field, cf_feature_version. That field is not declared in pyml_stubs.h, so the malloc in pyml_unwrap_compilerflags asks for too few bytes. I think that field should be set to version_minor. That should be okay for older Python interpreters, which won't read those bytes.
  • There are two exceptions not mentioned in generate.ml: PyExc_EncodingWarning (added in python 3.10) and PyExc_ResourceWarning (added in python 3.2).
  • PyMarshal_WriteObjectToFile is declared as returning Int in generate.ml, but it actually has return type of void; i.e., Unit. This is the case as far back as python 2.6. I didn't look further back than that. This also makes the assert_int_success on line 2651 of py.ml suspect.
  • PySet_Clear is the other way around: it is declared as returning Unit in generate.ml, but it actually returns Int. This is also the case as far back as python 2.6.
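As a rough cross-check of the second bullet, the following sketch asks the running Python interpreter whether the two warning categories exist. It says nothing about generate.ml itself. The table `EXPECTED` and the helper names are hypothetical, and the version numbers are the ones stated in the list above.

```python
import builtins
import sys

# Warning categories from the list above, mapped to the Python version
# that the list says introduced them.
EXPECTED = {
    "ResourceWarning": (3, 2),
    "EncodingWarning": (3, 10),
}


def available_exceptions():
    """Return the listed names that the running interpreter defines."""
    return sorted(name for name in EXPECTED if hasattr(builtins, name))


def missing_exceptions(version=sys.version_info[:2]):
    """Return names that should exist for `version` but are not defined."""
    return [
        name
        for name, since in EXPECTED.items()
        if version >= since and not hasattr(builtins, name)
    ]


if __name__ == "__main__":
    print("available:", available_exceptions())
    print("missing:", missing_exceptions())
```

A binding generator could use a table like this to decide which `PyExc_*` symbols to look up for a given interpreter version, treating the newer ones as optional on older interpreters.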

Thank you very much for having carefully reviewed this! These differences should be fixed in 7fd6f0c. I hope to have tools for checking these kinds of things more systematically in the near future.