openlawlibrary/pygls

`RecursionError: maximum recursion depth exceeded` on certain Python versions

Closed this issue Β· 30 comments

πŸ‘‹ I've received a few reports of this over on the Ruff LSP repo (astral-sh/ruff-vscode#116). I started debugging and reduced it down to an MRE in pygls.

Given server.py:

from pygls import server

LSP_SERVER = server.LanguageServer(name="Example", version="0.1.0")

def start() -> None:
    LSP_SERVER.start_io()

if __name__ == "__main__":
    start()

And the following Dockerfile:

# syntax=docker/dockerfile:1

FROM python:3.7.4-slim-buster

RUN pip3 install pygls
COPY server.py server.py

RUN python -m server

Running docker build . gives me an infinite RecursionError:

 > [4/4] RUN python -m server:
#10 0.367 Traceback (most recent call last):
#10 0.367   File "/usr/local/lib/python3.7/runpy.py", line 193, in _run_module_as_main
#10 0.367     "__main__", mod_spec)
#10 0.367   File "/usr/local/lib/python3.7/runpy.py", line 85, in _run_code
#10 0.367     exec(code, run_globals)
#10 0.367   File "/server.py", line 7, in <module>
#10 0.367     LSP_SERVER = server.LanguageServer(name="Example", version="0.1.0")
#10 0.367   File "/usr/local/lib/python3.7/site-packages/pygls/server.py", line 357, in __init__
#10 0.367     super().__init__(protocol_cls, converter_factory, loop, max_workers)
#10 0.367   File "/usr/local/lib/python3.7/site-packages/pygls/server.py", line 199, in __init__
#10 0.367     self.lsp = protocol_cls(self, converter_factory())
#10 0.367   File "/usr/local/lib/python3.7/site-packages/pygls/protocol.py", line 163, in default_converter
#10 0.367     converter = converters.get_converter()
#10 0.367   File "/usr/local/lib/python3.7/site-packages/lsprotocol/converters.py", line 17, in get_converter
#10 0.367     return _hooks.register_hooks(converter)
#10 0.367   File "/usr/local/lib/python3.7/site-packages/lsprotocol/_hooks.py", line 35, in register_hooks
#10 0.367     _resolve_forward_references()
#10 0.367   File "/usr/local/lib/python3.7/site-packages/lsprotocol/_hooks.py", line 30, in _resolve_forward_references
#10 0.367     attrs.resolve_types(value, lsp_types.ALL_TYPES_MAP, {})
#10 0.367   File "/usr/local/lib/python3.7/site-packages/attr/_funcs.py", line 408, in resolve_types
#10 0.367     hints = typing.get_type_hints(cls, globalns=globalns, localns=localns)
#10 0.367   File "/usr/local/lib/python3.7/typing.py", line 976, in get_type_hints
#10 0.367     value = _eval_type(value, base_globals, localns)
#10 0.367   File "/usr/local/lib/python3.7/typing.py", line 265, in _eval_type
#10 0.367     ev_args = tuple(_eval_type(a, globalns, localns) for a in t.__args__)
#10 0.367   File "/usr/local/lib/python3.7/typing.py", line 265, in <genexpr>
#10 0.367     ev_args = tuple(_eval_type(a, globalns, localns) for a in t.__args__)
#10 0.367   File "/usr/local/lib/python3.7/typing.py", line 266, in _eval_type
#10 0.367     if ev_args == t.__args__:
#10 0.367   File "/usr/local/lib/python3.7/typing.py", line 661, in __eq__
#10 0.367     return frozenset(self.__args__) == frozenset(other.__args__)
#10 0.367   File "/usr/local/lib/python3.7/typing.py", line 667, in __hash__
#10 0.367     return hash((self.__origin__, self.__args__))
#10 0.367   File "/usr/local/lib/python3.7/typing.py", line 666, in __hash__
...
#10 0.367     return hash((Union, frozenset(self.__args__)))
#10 0.367   File "/usr/local/lib/python3.7/typing.py", line 667, in __hash__
#10 0.367     return hash((self.__origin__, self.__args__))
#10 0.367   File "/usr/local/lib/python3.7/typing.py", line 480, in __hash__
#10 0.367     return hash((self.__forward_arg__, self.__forward_value__))
#10 0.367 RecursionError: maximum recursion depth exceeded while calling a Python object

Thanks as always for all your work on pygls!

It looks like python:3.7.7-slim-buster is the minimum version on which this goes away.

tombh commented

Thanks for the clear and helpful report ❀️

I'm not familiar enough with Python to know what this might be about. The only thing that it reminds me of is those problems one can have with type hints and circular imports. They get resolved with those tricks like; if TYPE_CHECKING: or from __future__ import annotations.

Maybe @alcarney will have some insight.

Otherwise, what if we just pin Pygls to python >= 3.7.7?

Unfortunately I am stuck at python 3.7.1.
If pygls would pin python to >= 3.7.7 it would mean no ruff support for me in vscode.
See also astral-sh/ruff-vscode/issues/116.

Maybe @alcarney will have some insight.

Not much I’m afraid… I know that lsprotocol calls _resolve_forward_references() so that it can configure attrs and cattrs to correctly serialise/deserialise lsp messages. This converts any type annotations that are strings into concrete types so I imagine it would undo any potential benefits from from __future__ import annotations or similar tricks.

Perhaps @karthiknadig would know of a workaround?

If you need to get to the concrete types of a type hint then there's no workaround short of directly changing the code to somehow avoid triggering this specific case as you have to manifest them somehow.

I will also say that Python 3.7 hits EOL at the end of June this year (https://devguide.python.org/versions/#versions), so depending on your compatibility policies for pygls, any fix would only be for compatibility for 5 months for bugfix versions that CPython itself doesn't support since newer versions of Python 3.7 are out and have had the fix (specifically, since March 10, 2020 when Python 3.7.7 was released).

It is indeed not unreasonable to drop support for older python 3.7 versions.

tombh commented

Thanks for the replies everybody.

So what I'm understanding is that it's Pygls' call to lsprotocl.converters.get_converter() that's the problem right? And we do that in order to automatically serialise/deserialise JSON API objects. Is this an unintended use of lsprotcol's converters? Or could this mean lsprotocl.converters.get_converter() itself doesn't work for Python <3.7.7?

@tombh I will need to investigate and see if fully resolving this is really needed or if it can be done on the fly. If it turns out that we have to resolve all types before we can build the converters then yes this means it won't work for < 3.7.7.

I'll reply back with my findings.

I have a potential fix for this, might need some help testing it out. It is a minor change to how LSPAny is defined.

Currently:

LSPArray = List['LSPAny']
LSPAny = Union["LSPObject", LSPArray, str, int, int, float, bool, None]  # generated from spec
LSPObject = object

Fix:

LSPArray = List['LSPAny']
LSPAny = Union[Any, None]
LSPObject = object

we had to previously pin LSPObject to object, so the definition here LSPAny = Union["LSPObject", LSPArray, str, int, int, float, bool, None] was effectively LSPAny = Union[object, LSPArray, str, int, int, float, bool, None] at that point it could just be Union[Any, None]. This makes the resolver happy.

tombh commented

That sounds promising. What's a good way to test that? Just copy and paste the change to around line 688 in lsprotocol/types.py?

Replace the entire file from lsprotocol main: https://github.com/microsoft/lsprotocol/blob/main/lsprotocol/types.py that should be enough.

The above-mentioned change seems to work, I ran the pygls test suite, and tested a few other servers on it. I plan on releasing the fix next week.

Published lsprotocol to pypi with the fix.

Does pygls need to be updated too?

I see:

python -m unittest
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/Users/crmarsh/workspace/ruff-lsp/ruff_lsp/__main__.py", line 19, in <module>
    main()
  File "/Users/crmarsh/workspace/ruff-lsp/ruff_lsp/__main__.py", line 5, in main
    from ruff_lsp import __version__, server
  File "/Users/crmarsh/workspace/ruff-lsp/ruff_lsp/server.py", line 52, in <module>
    from pygls import protocol, server, uris, workspace
  File "/usr/local/lib/python3.11/site-packages/pygls/protocol.py", line 56, in <module>
    from lsprotocol.types import (
ImportError: cannot import name 'WorkspaceConfigurationParams' from 'lsprotocol.types' (/usr/local/lib/python3.11/site-packages/lsprotocol/types.py)

(Pinning pygls==1.0.0a3 resolved for me.)

@charliermarsh LSP spec has changed the definition of WorkspaceConfigurationRequest to not have a combined type which was aliased in lsprotocol as WorkspaceConfigurationParams for params. Looks like the way it was defined earlier might have been a bug. It is now just ConfigurationParams.

I'll make the fix for that in lsprotocol to preserve the aliasing we use for combined types for reference types as well.

This was the old definition (params was composed of two things) the combined type was aliased as WorkspaceConfigurationParams:

        {
            "method": "workspace/configuration",
            "result": {
                "kind": "array",
                "element": {
                    "kind": "reference",
                    "name": "LSPAny"
                }
            },
            "messageDirection": "serverToClient",
            "params": {
                "kind": "and",
                "items": [
                    {
                        "kind": "reference",
                        "name": "ConfigurationParams"
                    },
                    {
                        "kind": "reference",
                        "name": "PartialResultParams"
                    }
                ]
            },
            "documentation": "The 'workspace/configuration' request is sent from the server to the client to fetch a certain\nconfiguration setting.\n\nThis pull model replaces the old push model were the client signaled configuration change via an\nevent. If the server still needs to react to configuration changes (since the server caches the\nresult of `workspace/configuration` requests) the server should register for an empty configuration\nchange event and empty the cache if such an event is received."
        }

This is the new definition:

        {
            "method": "workspace/configuration",
            "result": {
                "kind": "array",
                "element": {
                    "kind": "reference",
                    "name": "LSPAny"
                }
            },
            "messageDirection": "serverToClient",
            "params": {
                "kind": "reference",
                "name": "ConfigurationParams"
            },
            "documentation": "The 'workspace/configuration' request is sent from the server to the client to fetch a certain\nconfiguration setting.\n\nThis pull model replaces the old push model were the client signaled configuration change via an\nevent. If the server still needs to react to configuration changes (since the server caches the\nresult of `workspace/configuration` requests) the server should register for an empty configuration\nchange event and empty the cache if such an event is received."
        }

I will preserve the aliasing in lsprotocol in cases like this.

tombh commented

So it looks like sublimelsp/LSP-ruff#8 and astral-sh/ruff-vscode#116 are fixed now with the latest lsprotocol?

@karthiknadig Thanks for all the fixes ✨

@charliermarsh I was just wondering if you knew that Pygls' latest version is 1.0.1. Do you think pinning to that would have also solved your issue?

@tombh I need to push out another release I will be doing it today.

@tombh - Yeah I did notice that, thank you! But upgrading didn't solve in this case. (I'll upgrade to 1.0.1 once the new lsprotocol release is out.)

tombh commented

So is it that you pinned down to 1.0.0a3 from a newer version, rather than up to it from an older version? I'm just wondering because Pygls doesn't actually pin to any version of lsprotocol, so pinning to a different version of Pygls shouldn't change anything. I'm guessing that 1.0.0a3 fixed it for you simply by virtue of updating the unpinned lsprotocol. In which case updating to any version of Pygls would have worked. Do you see what I mean?

I pinned down to 1.0.0a3.

Using pygls>=1.0.0a3, my dependencies locked to pygls==1.0.1 and lsprotocol==2023.0.0a0, which gave me the above error.

Pinning to pygls==1.0.0a3, my dependencies locked to pygls==1.0.0a3 and lsprotocol==2023.0.0a0, which resolved the error. So perhaps there was a change that went out in 1.0.1 that referenced WorkspaceConfigurationParams?

Works! Thanks tons!

tombh commented

@charliermarsh you were right πŸ€“ I checked, there were a bunch of commits relevant to WorkspaceConfigurationParams on January 3rd that made it into v1.0.1. Ok, so even though we at Pygls did that, it wasn't until the recent fixes from lsprotocol that the WorkspaceConfigurationParams issue you found surfaced.

Anyway, so long story short, now with the latest fix from Karthik, everything's good I think?

Yup! All good from my perspective.

@tombh I added pygls tests to lsprotocol CI as smoke tests, so we should be able to catch these in the future before release:
microsoft/lsprotocol#177

tombh commented

Great idea @karthiknadig, thank you.

So I think we can close this issue then. The fix is essentially to reinstall Pygls. Or, for an end user, perhaps even just lsprotocl, eg: pip install lsprotocol==2023.0.0a1.

I think there's a broader issue here in terms of version pinning. Ideally I think Pygls should have more control of the lsprotocol version it uses. I'm leaning towards pinning to a specific version, even though that means we'll have to make new releases for every lsprotocol change. @karthiknadig I wonder what you think about semantic versioning for lsprotocol?

I wonder what you think about semantic versioning for lsprotocol?

We did and my answer is https://snarky.ca/why-i-dont-like-semver/ πŸ˜‰. But I don't think you're asking for SemVer but instead a version number that somehow reflects the current version of the LSP spec that a version of lsprotocol supports? At which point we get left with a single digit (the micro version) to represent our releases and thus no concept of a bugfix or feature release.

Now, technically Python supports an infinite number of version subcomponents, so we could make it go beyond 3 subcomponents: <LSP major>.<LSP minor>.<lsprotocol major>.<lsprotocol minor>.<lsprotocol micro>. Is this what you're after?

But I would still ask you consider whether you truly expect future versions of the library and the LSP spec to not be backwards-compatible before we consider changing the versioning so you can cap the max version (and I would point you to https://iscinumpy.dev/post/bound-version-constraints/ as to why I would advise against that).

tombh commented

Oh sure, that's no problem at all. Interesting to read your take on it. I'm totally open to pinning to whatever version of versioning there is, ha. I want to open an issue in Pygls here about pinning to lsprotocol, and I can imagine one of the suggestions would be to pin loosely. But I'm going to argue that pinning specifically, rather than not at all, is probably the better path to take.

(Closing as this is resolved for me!)