openlawlibrary/pygls

Be less strict with wrong LSP types?

Opened this issue · 8 comments

Description of a VS Code bug I experienced
While writing a Language Server using pygls, I experienced a VS Code issue where the LaTeX Workshop extension (which doesn’t use LSP, but is a ‘usual’ typescript VS Code extension) publishes a diagnostic entry at line NaN. To my understanding, this is in spec with the corresponding VS Code types used in typescript, but still a bit weird. [corresponding LaTeX Workshop issue] However, when sending Code Action Requests, VS Code sends "line": null which, to my understanding, is not in spec with the LSP. [corresponding VS Code issue] To make it even worse, these diagnostic entries are sent with all Code Action Requests, although they are only displayed in the first line in VS Code. As a result pygls shows an error (see below for details) whenever one moves in the document, because VS Code will send Code Action Requests.

Suggestion for pygls
The question I want to bring up here is, whether it would be feasible to handle wrong types less strict?

And so far I would suggest the following: Assume pygls gets a valid JSON message, but with wrong types. Whenever these wrong types live inside an optional entry (i.e. an item in a list or an entry marked as optional in LSP), one could just remove the optional parts, issue a warning, but proceed. In my case, the faulty diagnostics should just be removed from the Code Action Request. If there were additional valid diagnostics from my own language server, they would survive.

The error I experienced
For how to reproduce the faulty diagnostics published by the LaTeX Workshop extension, see the corresponding LaTeX Workshop and VS Code issues. Here I want to just give the JSON message sent to pygls and the corresponding log. Note that there were no good diagnostics from my own language server in the test document, the diagnostics come from the LaTeX Workshop extension.

JSON message

{
    "jsonrpc": "2.0",
    "id": 10,
    "method": "textDocument/codeAction",
    "params":{
        "textDocument":{
            "uri": "file:///.../test.tex"
        },
        "range":{
            "start":{
                "line": 7,
                "character": 14
            },
            "end":{
                "line": 7,
                "character": 14
            }
        },
        "context":{
            "diagnostics":[
                {
                    "range":{
                        "start":{
                            "line": null,
                            "character": 0
                        },
                        "end":{
                            "line": null,
                            "character": 65535
                        }
                    },
                    "message": "Using \\overbracket and \\underbracket from\n(unicode-math)\t`mathtools' package.\n(unicode-math)\t\n(unicode-math)\tUse \\Uoverbracket and \\Uunderbracket for\n(unicode-math)\toriginal `unicode-math' definition.\n",
                    "severity": 2,
                    "source": "LaTeX"
                },
                {
                    "range":{
                        "start":{
                            "line": null,
                            "character": 0
                        },
                        "end":{
                            "line": null,
                            "character": 65535
                        }
                    },
                    "message": "I'm going to overwrite the following commands\n(unicode-math)\tfrom the `mathtools' package: \n(unicode-math)\t\n(unicode-math)\t\\dblcolon, \\coloneqq, \\Coloneqq, \\eqqcolon.\n(unicode-math)\t\n(unicode-math)\t\n(unicode-math)\tNote that since I won't overwrite the other\n(unicode-math)\tcolon-like commands, using them will lead to\n(unicode-math)\tinconsistencies.\n",
                    "severity": 2,
                    "source": "LaTeX"
                }
            ]
        }
    }
}

pygls log

ERROR:pygls.protocol:Unable to deserialize message
  + Exception Group Traceback (most recent call last):
  |   File "/…/myenv/lib/python3.10/site-packages/pygls/protocol.py", line 404, in _deserialize_message
  |     return self._converter.structure(data, request_type)
  |   File "/…/myenv/lib/python3.10/site-packages/cattrs/converters.py", line 309, in structure
  |     return self._structure_func.dispatch(cl)(obj, cl)
  |   File "<cattrs generated structure lsprotocol.types.TextDocumentCodeActionRequest>", line 26, in structure_TextDocumentCodeActionRequest
  |     if errors: raise __c_cve('While structuring ' + 'TextDocumentCodeActionRequest', errors, __cl)
  | cattrs.errors.ClassValidationError: While structuring TextDocumentCodeActionRequest (1 sub-exception)
  +-+---------------- 1 ----------------
    | Exception Group Traceback (most recent call last):
    |   File "<cattrs generated structure lsprotocol.types.TextDocumentCodeActionRequest>", line 10, in structure_TextDocumentCodeActionRequest
    |     res['params'] = __c_structure_params(o['params'], __c_type_params)
    |   File "<cattrs generated structure lsprotocol.types.CodeActionParams>", line 31, in structure_CodeActionParams
    |     if errors: raise __c_cve('While structuring ' + 'CodeActionParams', errors, __cl)
    | cattrs.errors.ClassValidationError: While structuring CodeActionParams (1 sub-exception)
    | Structuring class TextDocumentCodeActionRequest @ attribute params
    +-+---------------- 1 ----------------
      | Exception Group Traceback (most recent call last):
      |   File "<cattrs generated structure lsprotocol.types.CodeActionParams>", line 15, in structure_CodeActionParams
      |     res['context'] = __c_structure_context(o['context'], __c_type_context)
      |   File "<cattrs generated structure lsprotocol.types.CodeActionContext>", line 21, in structure_CodeActionContext
      |     if errors: raise __c_cve('While structuring ' + 'CodeActionContext', errors, __cl)
      | cattrs.errors.ClassValidationError: While structuring CodeActionContext (1 sub-exception)
      | Structuring class CodeActionParams @ attribute context
      +-+---------------- 1 ----------------
        | Exception Group Traceback (most recent call last):
        |   File "<cattrs generated structure lsprotocol.types.CodeActionContext>", line 5, in structure_CodeActionContext
        |     res['diagnostics'] = __c_structure_diagnostics(o['diagnostics'], __c_type_diagnostics)
        |   File "/…/myenv/lib/python3.10/site-packages/cattrs/converters.py", line 510, in _structure_list
        |     raise IterableValidationError(
        | cattrs.errors.IterableValidationError: While structuring typing.List[lsprotocol.types.Diagnostic] (2 sub-exceptions)
        | Structuring class CodeActionContext @ attribute diagnostics
        +-+---------------- 1 ----------------
          | Exception Group Traceback (most recent call last):
          |   File "/…/myenv/lib/python3.10/site-packages/cattrs/converters.py", line 502, in _structure_list
          |     res.append(handler(e, elem_type))
          |   File "<cattrs generated structure lsprotocol.types.Diagnostic>", line 56, in structure_Diagnostic
          |     if errors: raise __c_cve('While structuring ' + 'Diagnostic', errors, __cl)
          | cattrs.errors.ClassValidationError: While structuring Diagnostic (1 sub-exception)
          | Structuring typing.List[lsprotocol.types.Diagnostic] @ index 0
          +-+---------------- 1 ----------------
            | Exception Group Traceback (most recent call last):
            |   File "<cattrs generated structure lsprotocol.types.Diagnostic>", line 5, in structure_Diagnostic
            |     res['range'] = __c_structure_range(o['range'], __c_type_range)
            |   File "<cattrs generated structure lsprotocol.types.Range>", line 14, in structure_Range
            |     if errors: raise __c_cve('While structuring ' + 'Range', errors, __cl)
            | cattrs.errors.ClassValidationError: While structuring Range (2 sub-exceptions)
            | Structuring class Diagnostic @ attribute range
            +-+---------------- 1 ----------------
              | Exception Group Traceback (most recent call last):
              |   File "<cattrs generated structure lsprotocol.types.Range>", line 5, in structure_Range
              |     res['start'] = __c_structure_start(o['start'], __c_type_start)
              |   File "<cattrs generated structure lsprotocol.types.Position>", line 14, in structure_Position
              |     if errors: raise __c_cve('While structuring ' + 'Position', errors, __cl)
              | cattrs.errors.ClassValidationError: While structuring Position (1 sub-exception)
              | Structuring class Range @ attribute start
              +-+---------------- 1 ----------------
                | Traceback (most recent call last):
                |   File "<cattrs generated structure lsprotocol.types.Position>", line 5, in structure_Position
                |     res['line'] = __c_structure_line(o['line'])
                | TypeError: int() argument must be a string, a bytes-like object or a real number, not 'NoneType'
                | Structuring class Position @ attribute line
                +------------------------------------
              +---------------- 2 ----------------
              | Exception Group Traceback (most recent call last):
              |   File "<cattrs generated structure lsprotocol.types.Range>", line 10, in structure_Range
              |     res['end'] = __c_structure_end(o['end'], __c_type_end)
              |   File "<cattrs generated structure lsprotocol.types.Position>", line 14, in structure_Position
              |     if errors: raise __c_cve('While structuring ' + 'Position', errors, __cl)
              | cattrs.errors.ClassValidationError: While structuring Position (1 sub-exception)
              | Structuring class Range @ attribute end
              +-+---------------- 1 ----------------
                | Traceback (most recent call last):
                |   File "<cattrs generated structure lsprotocol.types.Position>", line 5, in structure_Position
                |     res['line'] = __c_structure_line(o['line'])
                | TypeError: int() argument must be a string, a bytes-like object or a real number, not 'NoneType'
                | Structuring class Position @ attribute line
                +------------------------------------
          +---------------- 2 ----------------
          | Exception Group Traceback (most recent call last):
          |   File "/…/myenv/lib/python3.10/site-packages/cattrs/converters.py", line 502, in _structure_list
          |     res.append(handler(e, elem_type))
          |   File "<cattrs generated structure lsprotocol.types.Diagnostic>", line 56, in structure_Diagnostic
          |     if errors: raise __c_cve('While structuring ' + 'Diagnostic', errors, __cl)
          | cattrs.errors.ClassValidationError: While structuring Diagnostic (1 sub-exception)
          | Structuring typing.List[lsprotocol.types.Diagnostic] @ index 1
          +-+---------------- 1 ----------------
            | Exception Group Traceback (most recent call last):
            |   File "<cattrs generated structure lsprotocol.types.Diagnostic>", line 5, in structure_Diagnostic
            |     res['range'] = __c_structure_range(o['range'], __c_type_range)
            |   File "<cattrs generated structure lsprotocol.types.Range>", line 14, in structure_Range
            |     if errors: raise __c_cve('While structuring ' + 'Range', errors, __cl)
            | cattrs.errors.ClassValidationError: While structuring Range (2 sub-exceptions)
            | Structuring class Diagnostic @ attribute range
            +-+---------------- 1 ----------------
              | Exception Group Traceback (most recent call last):
              |   File "<cattrs generated structure lsprotocol.types.Range>", line 5, in structure_Range
              |     res['start'] = __c_structure_start(o['start'], __c_type_start)
              |   File "<cattrs generated structure lsprotocol.types.Position>", line 14, in structure_Position
              |     if errors: raise __c_cve('While structuring ' + 'Position', errors, __cl)
              | cattrs.errors.ClassValidationError: While structuring Position (1 sub-exception)
              | Structuring class Range @ attribute start
              +-+---------------- 1 ----------------
                | Traceback (most recent call last):
                |   File "<cattrs generated structure lsprotocol.types.Position>", line 5, in structure_Position
                |     res['line'] = __c_structure_line(o['line'])
                | TypeError: int() argument must be a string, a bytes-like object or a real number, not 'NoneType'
                | Structuring class Position @ attribute line
                +------------------------------------
              +---------------- 2 ----------------
              | Exception Group Traceback (most recent call last):
              |   File "<cattrs generated structure lsprotocol.types.Range>", line 10, in structure_Range
              |     res['end'] = __c_structure_end(o['end'], __c_type_end)
              |   File "<cattrs generated structure lsprotocol.types.Position>", line 14, in structure_Position
              |     if errors: raise __c_cve('While structuring ' + 'Position', errors, __cl)
              | cattrs.errors.ClassValidationError: While structuring Position (1 sub-exception)
              | Structuring class Range @ attribute end
              +-+---------------- 1 ----------------
                | Traceback (most recent call last):
                |   File "<cattrs generated structure lsprotocol.types.Position>", line 5, in structure_Position
                |     res['line'] = __c_structure_line(o['line'])
                | TypeError: int() argument must be a string, a bytes-like object or a real number, not 'NoneType'
                | Structuring class Position @ attribute line
                +------------------------------------

ERROR:pygls.protocol:Error receiving data
  + Exception Group Traceback (most recent call last):
  |   File "/…/myenv/lib/python3.10/site-packages/pygls/protocol.py", line 404, in _deserialize_message
  |     return self._converter.structure(data, request_type)
  |   File "/…/myenv/lib/python3.10/site-packages/cattrs/converters.py", line 309, in structure
  |     return self._structure_func.dispatch(cl)(obj, cl)
  |   File "<cattrs generated structure lsprotocol.types.TextDocumentCodeActionRequest>", line 26, in structure_TextDocumentCodeActionRequest
  |     if errors: raise __c_cve('While structuring ' + 'TextDocumentCodeActionRequest', errors, __cl)
  | cattrs.errors.ClassValidationError: While structuring TextDocumentCodeActionRequest (1 sub-exception)
  +-+---------------- 1 ----------------
    | Exception Group Traceback (most recent call last):
    |   File "<cattrs generated structure lsprotocol.types.TextDocumentCodeActionRequest>", line 10, in structure_TextDocumentCodeActionRequest
    |     res['params'] = __c_structure_params(o['params'], __c_type_params)
    |   File "<cattrs generated structure lsprotocol.types.CodeActionParams>", line 31, in structure_CodeActionParams
    |     if errors: raise __c_cve('While structuring ' + 'CodeActionParams', errors, __cl)
    | cattrs.errors.ClassValidationError: While structuring CodeActionParams (1 sub-exception)
    | Structuring class TextDocumentCodeActionRequest @ attribute params
    +-+---------------- 1 ----------------
      | Exception Group Traceback (most recent call last):
      |   File "<cattrs generated structure lsprotocol.types.CodeActionParams>", line 15, in structure_CodeActionParams
      |     res['context'] = __c_structure_context(o['context'], __c_type_context)
      |   File "<cattrs generated structure lsprotocol.types.CodeActionContext>", line 21, in structure_CodeActionContext
      |     if errors: raise __c_cve('While structuring ' + 'CodeActionContext', errors, __cl)
      | cattrs.errors.ClassValidationError: While structuring CodeActionContext (1 sub-exception)
      | Structuring class CodeActionParams @ attribute context
      +-+---------------- 1 ----------------
        | Exception Group Traceback (most recent call last):
        |   File "<cattrs generated structure lsprotocol.types.CodeActionContext>", line 5, in structure_CodeActionContext
        |     res['diagnostics'] = __c_structure_diagnostics(o['diagnostics'], __c_type_diagnostics)
        |   File "/…/myenv/lib/python3.10/site-packages/cattrs/converters.py", line 510, in _structure_list
        |     raise IterableValidationError(
        | cattrs.errors.IterableValidationError: While structuring typing.List[lsprotocol.types.Diagnostic] (2 sub-exceptions)
        | Structuring class CodeActionContext @ attribute diagnostics
        +-+---------------- 1 ----------------
          | Exception Group Traceback (most recent call last):
          |   File "/…/myenv/lib/python3.10/site-packages/cattrs/converters.py", line 502, in _structure_list
          |     res.append(handler(e, elem_type))
          |   File "<cattrs generated structure lsprotocol.types.Diagnostic>", line 56, in structure_Diagnostic
          |     if errors: raise __c_cve('While structuring ' + 'Diagnostic', errors, __cl)
          | cattrs.errors.ClassValidationError: While structuring Diagnostic (1 sub-exception)
          | Structuring typing.List[lsprotocol.types.Diagnostic] @ index 0
          +-+---------------- 1 ----------------
            | Exception Group Traceback (most recent call last):
            |   File "<cattrs generated structure lsprotocol.types.Diagnostic>", line 5, in structure_Diagnostic
            |     res['range'] = __c_structure_range(o['range'], __c_type_range)
            |   File "<cattrs generated structure lsprotocol.types.Range>", line 14, in structure_Range
            |     if errors: raise __c_cve('While structuring ' + 'Range', errors, __cl)
            | cattrs.errors.ClassValidationError: While structuring Range (2 sub-exceptions)
            | Structuring class Diagnostic @ attribute range
            +-+---------------- 1 ----------------
              | Exception Group Traceback (most recent call last):
              |   File "<cattrs generated structure lsprotocol.types.Range>", line 5, in structure_Range
              |     res['start'] = __c_structure_start(o['start'], __c_type_start)
              |   File "<cattrs generated structure lsprotocol.types.Position>", line 14, in structure_Position
              |     if errors: raise __c_cve('While structuring ' + 'Position', errors, __cl)
              | cattrs.errors.ClassValidationError: While structuring Position (1 sub-exception)
              | Structuring class Range @ attribute start
              +-+---------------- 1 ----------------
                | Traceback (most recent call last):
                |   File "<cattrs generated structure lsprotocol.types.Position>", line 5, in structure_Position
                |     res['line'] = __c_structure_line(o['line'])
                | TypeError: int() argument must be a string, a bytes-like object or a real number, not 'NoneType'
                | Structuring class Position @ attribute line
                +------------------------------------
              +---------------- 2 ----------------
              | Exception Group Traceback (most recent call last):
              |   File "<cattrs generated structure lsprotocol.types.Range>", line 10, in structure_Range
              |     res['end'] = __c_structure_end(o['end'], __c_type_end)
              |   File "<cattrs generated structure lsprotocol.types.Position>", line 14, in structure_Position
              |     if errors: raise __c_cve('While structuring ' + 'Position', errors, __cl)
              | cattrs.errors.ClassValidationError: While structuring Position (1 sub-exception)
              | Structuring class Range @ attribute end
              +-+---------------- 1 ----------------
                | Traceback (most recent call last):
                |   File "<cattrs generated structure lsprotocol.types.Position>", line 5, in structure_Position
                |     res['line'] = __c_structure_line(o['line'])
                | TypeError: int() argument must be a string, a bytes-like object or a real number, not 'NoneType'
                | Structuring class Position @ attribute line
                +------------------------------------
          +---------------- 2 ----------------
          | Exception Group Traceback (most recent call last):
          |   File "/…/myenv/lib/python3.10/site-packages/cattrs/converters.py", line 502, in _structure_list
          |     res.append(handler(e, elem_type))
          |   File "<cattrs generated structure lsprotocol.types.Diagnostic>", line 56, in structure_Diagnostic
          |     if errors: raise __c_cve('While structuring ' + 'Diagnostic', errors, __cl)
          | cattrs.errors.ClassValidationError: While structuring Diagnostic (1 sub-exception)
          | Structuring typing.List[lsprotocol.types.Diagnostic] @ index 1
          +-+---------------- 1 ----------------
            | Exception Group Traceback (most recent call last):
            |   File "<cattrs generated structure lsprotocol.types.Diagnostic>", line 5, in structure_Diagnostic
            |     res['range'] = __c_structure_range(o['range'], __c_type_range)
            |   File "<cattrs generated structure lsprotocol.types.Range>", line 14, in structure_Range
            |     if errors: raise __c_cve('While structuring ' + 'Range', errors, __cl)
            | cattrs.errors.ClassValidationError: While structuring Range (2 sub-exceptions)
            | Structuring class Diagnostic @ attribute range
            +-+---------------- 1 ----------------
              | Exception Group Traceback (most recent call last):
              |   File "<cattrs generated structure lsprotocol.types.Range>", line 5, in structure_Range
              |     res['start'] = __c_structure_start(o['start'], __c_type_start)
              |   File "<cattrs generated structure lsprotocol.types.Position>", line 14, in structure_Position
              |     if errors: raise __c_cve('While structuring ' + 'Position', errors, __cl)
              | cattrs.errors.ClassValidationError: While structuring Position (1 sub-exception)
              | Structuring class Range @ attribute start
              +-+---------------- 1 ----------------
                | Traceback (most recent call last):
                |   File "<cattrs generated structure lsprotocol.types.Position>", line 5, in structure_Position
                |     res['line'] = __c_structure_line(o['line'])
                | TypeError: int() argument must be a string, a bytes-like object or a real number, not 'NoneType'
                | Structuring class Position @ attribute line
                +------------------------------------
              +---------------- 2 ----------------
              | Exception Group Traceback (most recent call last):
              |   File "<cattrs generated structure lsprotocol.types.Range>", line 10, in structure_Range
              |     res['end'] = __c_structure_end(o['end'], __c_type_end)
              |   File "<cattrs generated structure lsprotocol.types.Position>", line 14, in structure_Position
              |     if errors: raise __c_cve('While structuring ' + 'Position', errors, __cl)
              | cattrs.errors.ClassValidationError: While structuring Position (1 sub-exception)
              | Structuring class Range @ attribute end
              +-+---------------- 1 ----------------
                | Traceback (most recent call last):
                |   File "<cattrs generated structure lsprotocol.types.Position>", line 5, in structure_Position
                |     res['line'] = __c_structure_line(o['line'])
                | TypeError: int() argument must be a string, a bytes-like object or a real number, not 'NoneType'
                | Structuring class Position @ attribute line
                +------------------------------------

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/…/myenv/lib/python3.10/site-packages/pygls/protocol.py", line 510, in data_received
    self._data_received(data)
  File "/…/myenv/lib/python3.10/site-packages/pygls/protocol.py", line 542, in _data_received
    json.loads(body.decode(self.CHARSET),
  File "/…/myenv/lib/python3.10/json/__init__.py", line 359, in loads
    return cls(**kw).decode(s)
  File "/…/myenv/lib/python3.10/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/…/myenv/lib/python3.10/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
  File "/…/myenv/lib/python3.10/site-packages/pygls/protocol.py", line 418, in _deserialize_message
    raise JsonRpcInvalidParams() from exc

tombh commented

Thanks for the detailed write up. I'm not quite clear on the architecture. It seems like your YaLafi-LS is something like a proxy for messages from LaTeX Workshop? Or put another way: what generates the JSON message you pasted, and why is it sent to Pygls?

YaLafi-LS is an LSP wrapper around YaLafi which is a spell checker for LaTeX documents. And LaTeX Workshop is the editor plugin providing LaTeX code completion et cetera in VS Code, it directly uses the VS Code API (no language server).

YaLafi LS publishes diagnostics with spelling mistakes and supports code action requests to fix these spelling mistakes (all that works quite well). However, if VS Code sends a code action request, it sends all available diagnostics in the affected range (where the cursor currently is) including those of other extensions. In this particular case, it also sends faulty ones from LaTeX Workshop. This of course should not happen and hopefully gets fixed within VS Code.

My intention here is to ask whether it is feasible for pygls to also support LSP messages with wrong types as long as they are within an optional parameter of the message and can safely be discarded. The idea behind that is assuming that similar bugs might occur elsewhere.

A valid answer would be: “No, we require the client to send correct types.”

tombh commented

I think this is an area of LSP I'm not so familiar with. So the YaLafi-LS VSCode JSON extension (and LSP client) is sending Code Action Requests (on behalf of VSCode) to your Pygls implementation of YaLafi-LS? That makes sense.

So yes, it's an interesting area for discussion. Although it is not Pygls' responsibility, it seems like a reasonable feature for it to support, namely allowing optional sections of JSON payloads to be malformed.

My first thought is that I don't know enough about all the possible consequences of that. I could imagine that it might have unforeseen and unexpected consequences in other contexts. We're not just talking about converting an error to a warning, but deleting entire sections of JSON payloads.

It's definitely a reasonable idea to reflect on. But my intuition is that Pygls core shouldn't take responsibility for it. I certainly think Pygls should offer a way to intercept incoming payloads. I believe that has been discussed before although I can find those issues right now.

I did find these issues talking about null values in optional fields: #145, #124

And what about intercepting those payloads in YaLafi-LS's VSCode JS extension?

I think this is an area of LSP I'm not so familiar with. So the YaLafi-LS VSCode JSON extension (and LSP client) is sending Code Action Requests (on behalf of VSCode) to your Pygls implementation of YaLafi-LS? That makes sense.

Yes, I agree up to ‘JSON’.

So yes, it's an interesting area for discussion. Although it is not Pygls' responsibility, it seems like a reasonable feature for it to support, namely allowing optional sections of JSON payloads to be malformed.

My first thought is that I don't know enough about all the possible consequences of that. I could imagine that it might have unforeseen and unexpected consequences in other contexts.

Indeed, I also don’t know whether that could have bad consequences. At least the server should be able to process and answer the request, because it would also do so without the optional sections. I would not try to process sections, when the whole JSON payload is malformed, only if the types are out of spec.

We're not just talking about converting an error to a warning, but deleting entire sections of JSON payloads.

Correct, that comment was more saying, that if something like that would be implemented, one should note somewhere that pygls received malformed JSON payloads, maybe as a warning in the log.

I did find these issues talking about null values in optional fields: #145, #124

These are exactly the opposite, right, where pygls sent malformed JSON and VS Code did not process the message (but also didn’t throw an error)?

It's definitely a reasonable idea to reflect on. But my intuition is that Pygls core shouldn't take responsibility for it.
And what about intercepting those payloads in YaLafi-LS's VSCode JS extension?

For my specific issue, I will just wait, until VS Code handles them correctly and does not send faulty messages. The YaLafi-LS VS Code extension is more or less the same as the json-vscode-extension in this repo and does not do much.

But also in general, if accepting some kind of malformed JSON messages is feasible, it should be done by pygls in my opinion. Isn’t the point of pygls, that one can write language servers in python without dealing with JSON messages directly?

The question I want to bring up here is, whether it would be feasible to handle wrong types less strict?

TLDR: Yes it is possible, but it is currently up to you to define what "less strict" means.


A quick bit of context

As of v1, pygls now relies on lsprotocol for all of it's LSP type definitions. lsprotocol in turn builds on attrs and cattrs to also provide serialisation and deserialisation of LSP types to/from JSON. This is done through a converter object.

>>> from lsprotocol.converters import get_converter
>>> from lsprotocol.types import Position

>>> converter = converters.get_converter()
>>> p = converter.structure({"line": 1, "character": 2}, Position)
>>> p.line
1
>>> p.character
2

Each pygls powered language server receives a default_converter which, is (almost) identical to the converter you'd get direct from lsprotocol. This means (as you've already discovered) it follows the LSP spec to the letter.

Relaxing the rules

However, it is possible to override how the converter parses certain types by registering your own structure hooks that specify how malformed types should be handled.

Here is an example hook where any malformed diagnostics in the context field of a code action request are silently ignored. Invalid diagnostics attached to other types will still be rejected.

# example.py
import json 
import pathlib

from lsprotocol.types import CodeActionContext
from lsprotocol.types import Diagnostic
from lsprotocol.types import TextDocumentCodeActionRequest
from pygls.protocol import default_converter

def my_converter_factory():
    converter = default_converter()

    def relaxed_code_action_context_hook(obj, type_):
        diagnostics = []
        for diag in obj['diagnostics']:
            try:
                diagnostics.append(converter.structure(diag, Diagnostic))
            except Exception:
                pass
                
        return CodeActionContext(diagnostics=diagnostics)
        
    converter.register_structure_hook(CodeActionContext, relaxed_code_action_context_hook)
    
    return converter
    
converter = my_converter_factory()

example_request = json.loads(pathlib.Path('example.json').read_text())
parsed_request = converter.structure(example_request, TextDocumentCodeActionRequest)

print(json.dumps(converter.unstructure(parsed_request), indent=2))
example.json (Your example request, modified to have one valid, one invalid diagnostic)
{
    "jsonrpc": "2.0",
    "id": 10,
    "method": "textDocument/codeAction",
    "params":{
        "textDocument":{ "uri": "file:///.../test.tex" },
        "range":{ 
            "start":{ "line": 7, "character": 14 },
            "end":{ "line": 7, "character": 14 }
        },
        "context":{
            "diagnostics":[
                {
                    "range":{
                        "start":{
                            "line": null,
                            "character": 0
                        },
                        "end":{
                            "line": null,
                            "character": 65535
                        }
                    },
                    "message": "Using \\overbracket and \\underbracket from\n(unicode-math)\t`mathtools' package.\n(unicode-math)\t\n(unicode-math)\tUse \\Uoverbracket and \\Uunderbracket for\n(unicode-math)\toriginal `unicode-math' definition.\n",
                    "severity": 2,
                    "source": "LaTeX"
                },
                {
                    "range":{
                        "start":{
                            "line": 1,
                            "character": 0
                        },
                        "end":{
                            "line": 1,
                            "character": 65535
                        }
                    },
                    "message": "I'm going to overwrite the following commands\n(unicode-math)\tfrom the `mathtools' package: \n(unicode-math)\t\n(unicode-math)\t\\dblcolon, \\coloneqq, \\Coloneqq, \\eqqcolon.\n(unicode-math)\t\n(unicode-math)\t\n(unicode-math)\tNote that since I won't overwrite the other\n(unicode-math)\tcolon-like commands, using them will lead to\n(unicode-math)\tinconsistencies.\n",
                    "severity": 2,
                    "source": "LaTeX"
                }
            ]
        }
    }

}
demo

Note how this request now only contains the single, valid, diagnostic.

> python example.py
{
  "id": 10,
  "params":{
    "textDocument":{ "uri": "file:///.../test.tex" },
    "range":{ 
        "start":{ "line": 7, "character": 14 },
        "end":{ "line": 7, "character": 14 }
    },
    "context": {
      "diagnostics": [
        {
          "range": {
            "start": {
              "line": 1,
              "character": 0
            },
            "end": {
              "line": 1,
              "character": 65535
            }
          },
          "message": "I'm going to overwrite the following commands\n(unicode-math)\tfrom the `mathtools' package: \n(unicode-math)\t\n(unicode-math)\t\\dblcolon, \\coloneqq, \\Coloneqq, \\eqqcolon.\n(unicode-math)\t\n(unicode-math)\t\n(unicode-math)\tNote that since I won't overwrite the other\n(unicode-math)\tcolon-like commands, using them will lead to\n(unicode-math)\tinconsistencies.\n",
          "severity": 2,
          "source": "LaTeX"
        }
      ]
    }
  },
  "method": "textDocument/codeAction",
  "jsonrpc": "2.0"
}

This behavior may or may not work for you. Say for example you wanted to handle invalid diagnostics the same regardless of the context in which they appeared. Another option would be to provide a hook for the Diagnostic type directly that converts "line": null to "line": 0 perhaps

pygls integration

To use your custom converter with YaLafi-LS you can then pass my_converter_factory to your server's constructor

server = LanguageServer(
    name="my-language-server", version="v1.0", converter_factory=my_converter_factory
)

Thank you for this very detailed answer @alcarney. It works perfectly.

It might be possible to generalize that using a structure_hook for lists, and it might even be possible to use a structure_hook_factory for optional types (haven’t looked into this carefully). But it seems to me, that this is generally not wanted. Otherwise, I would expect something like that in the cattrs package (thanks for pointing me in that direction).

Would you like to include something like that as an option in pygls, or would you want a little documentation about what you wrote above. I am happy to invest some time here.

I am happy to invest some time here.

Thanks very much! 😄

Personally, I'd start with a bit of documentation to show that it's possilble. If a lot of servers start implementing their own non-strict mode it might highlight some common patterns that pygls could then offer out of the box.

One potential hook we could add to a... 🤔 relaxed_converter()(?) in pygls would be one that clamps invalid line/character indices in Position objects to their max/min bounds.

Context: vscode-restructuredtext/vscode-restructuredtext#429 & vscode-restructuredtext/vscode-restructuredtext#433