Stranger6667/jsonschema-rs

[python] "OverflowError: int too big to convert" after several failed validation attempts

Closed this issue · 2 comments

After several failed validation attempts (7 in my case), validate() stops raising, and the script then fails with an OverflowError at any subsequent use of strings.

Example JSON: bad_json.txt

import json
import jsonschema_rs

INPUT_SCHEMA = {
    "items": {
        "properties": {
            "create_time": {"type": "string"},
            "meta": {"anyOf": [{"type": "object"}, {"type": "null"}]},
        },
        "required": ["create_time"],
        "type": "object",
        "additionalProperties": False,
    },
    "minItems": 1,
    "type": "array",
}

schema = jsonschema_rs.JSONSchema(INPUT_SCHEMA)
for i in range(10):
    with open("bad_json.txt", "rb") as file:
        b = file.read()
        j = json.loads(str(b, encoding="utf-8"))
    try:
        schema.validate(j)
    except SystemError:
        pass
    else:
        print("123")

Traceback (most recent call last):
  File "/home/aleksandr/luna-events/test_json.py", line 28, in <module>
    print("123")
    ^^^^^
OverflowError: int too big to convert

Some things I learned while poking around:

The issue is triggered by the "long int" in bad_json.txt. The following code (without bad_json.txt) still triggers the reported behavior:

import sys
import json
import jsonschema_rs

INPUT_SCHEMA = {"type": "number"}
LONG_INT = -85232295681570168738799618114762280803

schema = jsonschema_rs.JSONSchema(INPUT_SCHEMA)

for i in range(10):
    print(i)
    try:
        schema.validate(LONG_INT)
    except SystemError as ex:
        pass
    else:
        print("123")

The exception caught by except SystemError as ex: is a SystemError whose direct cause is the OverflowError:

OverflowError: int too big to convert

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/kgutwin/github/jsonschema-rs/bindings/python/issue_435.py", line 8, in <module>
    schema.validate(LONG_INT)
SystemError: <method 'validate' of 'jsonschema_rs.JSONSchema' objects> returned a result with an exception set
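
To check what a caught SystemError actually wraps, the chained exception can be read from its __cause__ attribute. The sketch below only simulates the chaining with a plain raise ... from, so it runs without jsonschema_rs installed. Both fake_validate and describe_cause are hypothetical names used for illustration; fake_validate stands in for schema.validate and is not part of the library:

```python
def fake_validate(instance):
    """Hypothetical stand-in for schema.validate.

    Raises a SystemError chained from an OverflowError, mimicking the
    shape of the traceback above.
    """
    try:
        raise OverflowError("int too big to convert")
    except OverflowError as overflow:
        raise SystemError("validate returned a result with an exception set") from overflow


def describe_cause(exc):
    """Return the type name of the exception that directly caused exc, or None."""
    cause = exc.__cause__
    return type(cause).__name__ if cause is not None else None


try:
    fake_validate(-85232295681570168738799618114762280803)
except SystemError as ex:
    cause_name = describe_cause(ex)
    print(cause_name)  # prints: OverflowError
```

In the real reproduction, the same describe_cause(ex) call inside the except SystemError block should show whether the OverflowError is attached as the cause.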

This is probably raised by a call to PyLong_AsLongLong() that fails to convert the extra-large int into an i64 somewhere in the Python-to-Rust binding.
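
A quick way to sanity-check that guess is to confirm that LONG_INT does not fit in a signed 64-bit integer. This is a minimal pure-Python sketch. fits_in_i64 is a hypothetical helper, and the assumption that the binding converts to i64 comes from the guess above, not from reading the binding's source:

```python
I64_MIN = -(2**63)
I64_MAX = 2**63 - 1

LONG_INT = -85232295681570168738799618114762280803


def fits_in_i64(value: int) -> bool:
    """Return True if value is representable as a signed 64-bit integer."""
    return I64_MIN <= value <= I64_MAX


print(fits_in_i64(LONG_INT))  # prints: False
print(LONG_INT.bit_length())  # prints: 127 (far more than the 63 bits an i64 magnitude allows)

# Packing the value into 8 signed bytes is a comparable conversion
# done from pure Python, and it also fails with an OverflowError.
try:
    LONG_INT.to_bytes(8, "little", signed=True)
    overflowed = False
except OverflowError:
    overflowed = True

print(overflowed)  # prints: True
```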

I'm guessing there are two problems, one more apparent than the other. First, some plumbing may be needed on the Rust side to raise the OverflowError as soon as it happens (perhaps by calling PyErr_Occurred() after the call to PyLong_AsLongLong()?) so that it doesn't surface as the more generic SystemError. Second, and much stranger, the exception stops being caught after the 7th time through the loop. I can't identify a reason why this happens so consistently after 7 iterations. The OverflowError that eventually gets raised is also consistently attributed to the wrong line of code. I suspect a Python bug or undefined behavior is at play there.
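
Until the binding reports the overflow cleanly, one possible workaround is to scan the parsed document for oversized integers before calling validate. The sketch below assumes the i64 limit guessed at above. find_oversized_ints is a hypothetical helper, and the sample document is made up for illustration (the real contents of bad_json.txt are not shown in this thread):

```python
I64_MIN = -(2**63)
I64_MAX = 2**63 - 1


def find_oversized_ints(instance, path="$"):
    """Yield (path, value) for every int outside the signed 64-bit range."""
    if isinstance(instance, bool):
        return  # bool is a subclass of int and is never oversized
    if isinstance(instance, int):
        if not I64_MIN <= instance <= I64_MAX:
            yield path, instance
    elif isinstance(instance, dict):
        for key, value in instance.items():
            yield from find_oversized_ints(value, f"{path}.{key}")
    elif isinstance(instance, (list, tuple)):
        for index, value in enumerate(instance):
            yield from find_oversized_ints(value, f"{path}[{index}]")


# Made-up document shaped like the schema in the report.
document = [
    {
        "create_time": "2024-01-01T00:00:00",
        "meta": {"id": -85232295681570168738799618114762280803},
    }
]

oversized = list(find_oversized_ints(document))
print(oversized)
# prints: [('$[0].meta.id', -85232295681570168738799618114762280803)]
```

If the list is non-empty, the caller can reject the document up front instead of passing it to schema.validate.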

Thank you for reporting it, and sorry for the delay. An important note: as far as I can tell, it only happens on Python 3.12.