graphql-python/graphql-core

Segfault during schema parsing

balinthaller opened this issue · 3 comments

Hey there,

The package is great and has been super reliable for us so far. However, we're running into an intermittent issue where the Python process exits with code 139 (segfault).

The relevant part of the stack trace, obtained with Python's built-in faulthandler module:

main Fatal Python error: Segmentation fault
│ main
│ main Current thread 0x00007f9672f54740 (most recent call first):
│ main   Garbage-collecting
│ main   File "/usr/local/lib/python3.11/site-packages/graphql/language/ast.py", line 361 in __setattr__
│ main   File "/usr/local/lib/python3.11/site-packages/graphql/language/ast.py", line 332 in __init__
│ main   File "/usr/local/lib/python3.11/site-packages/graphql/language/parser.py", line 730 in parse_field_definition
│ main   File "/usr/local/lib/python3.11/site-packages/graphql/language/parser.py", line 1134 in optional_many
│ main   File "/usr/local/lib/python3.11/site-packages/graphql/language/parser.py", line 717 in parse_fields_definition
│ main   File "/usr/local/lib/python3.11/site-packages/graphql/language/parser.py", line 697 in parse_object_type_definition
│ main   File "/usr/local/lib/python3.11/site-packages/graphql/language/parser.py", line 288 in parse_definition
│ main   File "/usr/local/lib/python3.11/site-packages/graphql/language/parser.py", line 1153 in many
│ main   File "/usr/local/lib/python3.11/site-packages/graphql/language/parser.py", line 241 in parse_document
│ main   File "/usr/local/lib/python3.11/site-packages/graphql/language/parser.py", line 113 in parse
│ main   File "/usr/local/lib/python3.11/site-packages/gql/client.py", line 105 in __init__

  • This happens when trying to parse a GraphQL schema as a string, loaded from a network request.
  • The schema is around 2800 lines, which I presume is considered fairly large.
  • The entry point is through the gql library, but the issue seems to be inside graphql-core.
  • Digging through the stack trace, I couldn't find what could be going wrong.
  • The package version we're using is 3.2.3.

Could this be a mislabeled OOM error? I don't think so, but it's worth considering. Is there anything I could do to profile this better? It's a bit hard since it's running inside a Kubernetes pod and we can't reproduce it locally, but I can try.
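
For anyone who wants a comparable trace from their own service, here is a minimal sketch of enabling faulthandler at startup. The helper name `enable_crash_tracebacks` and the log path are hypothetical choices for illustration, not something taken from the setup described above.

```python
import faulthandler
import os
import tempfile


def enable_crash_tracebacks(log_path):
    """Enable faulthandler so fatal signals dump Python tracebacks to log_path."""
    # faulthandler writes to the file descriptor directly, so the file object
    # must stay open for as long as the handler is enabled. It is returned to
    # the caller instead of being closed here.
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file


# Hypothetical location; inside a pod this would more likely be a mounted
# volume, or simply the default (stderr) so the trace lands in the pod logs.
crash_log_path = os.path.join(tempfile.gettempdir(), "crash_traceback.log")
crash_log = enable_crash_tracebacks(crash_log_path)
```

The same effect is available without code changes by setting the `PYTHONFAULTHANDLER=1` environment variable or starting the interpreter with `python -X faulthandler`, which may be easier to roll out in a container spec.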

Cito commented

GQL and GraphQL-Core are pure Python packages, which should not cause segmentation faults like this. It might either be an OOM issue (but even then Python should not segfault), or one of your dependencies is using a C extension. The garbage collector can run at any time, so the crash is probably not related to the stack trace that you are seeing here.
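
One way to check the C extension hypothesis is to list which imported modules are backed by compiled files. The following is only a sketch: the function name `loaded_extension_modules` is made up for this example, and the result will also include extension modules from the standard library, so it narrows the search rather than pointing at a culprit.

```python
import importlib.machinery
import sys


def loaded_extension_modules():
    """Return sorted (name, path) pairs for imported modules backed by compiled files."""
    # EXTENSION_SUFFIXES holds the platform's suffixes for compiled modules,
    # for example ".so" variants on Linux or ".pyd" on Windows.
    suffixes = tuple(importlib.machinery.EXTENSION_SUFFIXES)
    found = []
    # Copy the items first so that imports triggered elsewhere cannot change
    # the dictionary while it is being iterated.
    for name, module in list(sys.modules.items()):
        path = getattr(module, "__file__", None)
        if isinstance(path, str) and path.endswith(suffixes):
            found.append((name, path))
    return sorted(found)


for name, path in loaded_extension_modules():
    print(name, path)
```

Running this late in application startup, after all dependencies have been imported, gives the most complete picture. Modules compiled directly into the interpreter have no `__file__` and will not show up.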

Cito commented

Or maybe you're hitting this bug in Python 3.11.4?
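
A quick way to confirm which interpreter a pod is actually running is to log the version at startup. This sketch only compares against the version number mentioned above; it does not detect the bug itself, and the helper name `is_suspect_version` is hypothetical.

```python
import sys

# Assumption: the only fact used here is the version number 3.11.4.
SUSPECT_VERSION = (3, 11, 4)


def is_suspect_version(version_info=sys.version_info):
    """Return True if the given version tuple is exactly the suspect release."""
    # Only major, minor and micro are compared; release level is ignored.
    return tuple(version_info[:3]) == SUSPECT_VERSION


print(sys.version)
print("running suspect version:", is_suspect_version())
```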

balinthaller commented

Thanks for the fast response @Cito! These are good pointers. We could easily be hitting the bug you linked, as the crashes seem to have started with our update to that exact Python version. Appreciate the help, I'll close this issue then!