openlawlibrary/pygls

Poor experience with recent "Position" changes

Closed this issue · 4 comments

I have just been bitten hard by recent changes and I'd like to offer a critique and suggest some improvements:

The problem that I hit is that up until now I have been using utf16_num_units to convert between python string indexes and client position units (I am doing operational-transform like processing on document changes, so need this low level capability). With recent updates, I get the following error:

DeprecationWarning: 'utf16_num_units' has been deprecated, use 'Position.client_num_units' instead

I think I understand the intent of the message, but here are some of the problems that I encountered:

  • Position up until now has meant lsprotocol.types.Position. After a thorough search, I find lsprotocol.types.Position has no client_num_units method, nor has such a method been monkey patched into the class. So the deprecation message is confusing.
  • Then I started looking at recent commits and discover that a second (!) Position class has been added: pygls.workspace.position.Position. Introducing a type name collision on such a fundamental domain entity seems crazy to me.
  • To make matters even more confusing, pygls.workspace.position.Position does not actually model positions it models encoding-specific position translation, so a better name might be PositionTranslator or PositionCodec
  • Now, the deprecation warning says "use Position.client_num_units instead" but pygls.workspace.position.Position.client_num_units is not a class method -- you can't use that function directly, you need to find an instance of it (where?). I guessed at workspace.position but it is not there. Later I found text_document.position

So, my suggestions are:

  • Rename pygls.workspace.position.Position to pygls.workspace.position_codec.PositionCodec or maybe PositionTranslator
  • store an instance of PositionCodec in the Workspace and TextDocument instances, so that workspace.position_codec and text_document.position_codec can be easily used. I would understand if you decided against adding the workspace field but really, there needs to be only a single codec for the entire workspace, it doesn't need to be instantiated on a per-text-document basis.
  • Fix the DeprecationWarnings in a way that makes them accurate and helpful, for example:
DeprecationWarning: 'utf16_num_units' has been deprecated, use 'text_document.position_codec.client_num_units' instead

I can prepare a PR if you can give me an indication that it will be accepted.

Thank you for living on the edge and catching issues like this before they are released!

so a better name might be PositionTranslator or PositionCodec

Would PositionEncoding work as a name, since that it the part of the spec it was introduced to help with? Or is that too close to lsprotocol.types.PositionEncodingKind?

I would understand if you decided against adding the workspace field but really, there needs to be only a single codec for the entire workspace, it doesn't need to be instantiated on a per-text-document basis.

I think putting an instance on the workspace makes sense

I can prepare a PR if you can give me an indication that it will be accepted.

If you want to open a PR that would be fantastic! :)

Would PositionEncoding work as a name, since that it the part of the spec it was introduced to help with? Or is that too close to lsprotocol.types.PositionEncodingKind?

The issue there is that lsprotocol.types.ServerCapabilities.position_encoding is a Optional[Union[PositionEncodingKind, str]]

so the position_encoding field name is already used in lsprotocol for storing PositionEncodingKind. If we define a PositionEncoding class, we'd also want to name fields of that type position_encoding which is potentially a little confusing. For example:

encoding_kind: lsprotocol.types.PositionEncodingKind # in our code

and

position_encoding: PositionEncoding # our type

but meanwhile in lsprotocol

lsprotocol.types.ServerCapabilities.position_encoding: Optional[Union[PositionEncodingKind, str]]

The alternative that I'm suggesting is:

encoding_kind: lsprotocol.types.PositionEncodingKind # in our code
position_codec: PositionCodec # our type
lsprotocol.types.ServerCapabilities.position_encoding: Optional[Union[PositionEncodingKind, str]] # in lsprotocol

I'm more than happy for you to make the final call.

If we define a PositionEncoding class, we'd also want to name fields of that type position_encoding which is potentially a little confusing.

Makes sense, I'll defer to your judgement :)