alacritty/vte

Implement colon separated CSI parameters

VojtechStep opened this issue ยท 12 comments

I was looking into implementing styled underlines into alacritty, like what you might see in kitty or gnome vte based terminals - that is normal underline, double underline, curly underline (undercurl), and have it have a different color than the foreground.

Looking at the kitty implementation (spec), the terminal emulator has to handle not only semicolon separated parameters (\e[0;4m), but also colon separated parameters (\e[4:3m).

I'm not an expert on the standard defining how those sequences should be handled, so I was going by kitty's source code. What's being done in their codebase is not applicable to this repo, because they have the entire sequence buffered up until the command byte ('m' in this case) and then parse it, dispatching every semicolon separated parameter as it's own sequence, with special cases for color parameters. In that way, a command like \e[0;4;58:2::186:93:0m would be split into three function calls: handle 0m, handle 4m, handle 58:2::186:93:0m.

This could be implemented in the vte crate (I think) by adding a dimension to the params array, and multiple calls to the Performer implementor. Not sure what the performance cost would be though.

jwilm commented

Because control sequences sometimes need to look at multiple parameters together, we can't simply split into multiple function calls.

It looks like xterm implements subparams by basically sticking them into the params list but then also keeping a separate array which indicates which are subparams and which are top levels params. Since this naturally makes it possible to exceed the 16 param limit for normal use, xterm caps the params list at 30 instead.

A similar approach here would have comparable performance to the existing implementation. It would allow the Perform impl to know which are params and which are subparams (albeit not in the most accessible way), and do whatever it pleases. That is to say, it keeps things nicely generalized while exposing pertinent info.

AFAIK the only sequences with subparameters in the standard are colors, which could be handled as a special case by the parser. From what I've seen, this is what other parsers do.

What you suggest is something along the lines of adding a parameter to the csi callback of type &[std::slice::SliceIndex], which could be used to index into the params array to get the ranges of the individual subsequences, right?

The first approach might allow us to keep backward compatibility, but the latter is certainly more generic.

jwilm commented

The parameter format is specified generically such that any control character could be associated with 1 or more params. Here's the relevant section from ECMA-48

image

My suggestion is that we would allow the params list to be larger and also have something like a &[bool] of the same length which indicates which params are actually subparams. I wouldn't actually call it a suggestion though -- this is just what xterm does. We could probably have a nicer data structure in Rust.

A simple example of what I was thinking: Playground link.
The parameters are still passed as a linear slice, but we can index into it and get the subparameter groups by iterating over the range slice.

jwilm commented

Thanks for clarifying with an example. The slice ranges work for me as long as we can avoid dynamic allocations.

jerch commented

Stumbled over your issue here through GH suggestions lol.

Maybe I can share a few insights from our side (we implemented this a few months ago in xterm.js).

Imho the crucial part here is not to limit this to certain final functions or certain param values, as the spec clearly states, that any param can have those substring param extensions (as cited by @jwilm above). I think some emulators get this wrong. Following the spec, this a perfect legal example:

CSI 1:2:999999999 ; 3:4 ; 5: ; :6 <final byte>

Whether those extensive subparams make sense at all is up to the final sequence function to decide. Thus a GP sequence parser should pass subparams along as they occur without losing the associated param they are meant to extend.

Took me a while to find a suitable structure for that, in the end we went with a single parser wide data structure with certain limits (per sequence: up to 32 params, up to 32 subparams in total, a single value clamped to -1 .. 2ยณยน - 1), that gets borrowed by the sequence functions to avoid re-allocations.

Since our parser is based on the VT500 parser described on VT100.net, this subparam handling also applies to DCS params (we did not introduce another action to distinguish between CSI_PARAM and DCS_PARAM). Well, the DCS parsing of the VT500 parser is not ECMA compatible in the first place, thus I think the subparam extension does not hurt anyone in the DCS field. Also all well-defined DCS sequences keep working.

Maybe this helps somewhat to get things sorted out for your parser.

Thanks for the input!

I 100% agree that the parser should not in any way try to interpret the sequence, only provide the library user with information about how the parameters were read.

I guess the main issue I have with the implementation for now is what should the interface look like (I'm also sure this will become less of a problem once I actually formulate the requirements I expect from the interface in this comment, stop sitting on the issue for eight months and actually put together a PR ๐Ÿ˜„)

What I would like to achieve is to support these use cases:

  1. Backwards compatibility - If a user doesn't want to support colon separated parameters or they already have some other heuristics that work with the raw byte slice, it should still be available for use without performance penalties or refactoring. (Note that this would probably still be a breaking change, because the signature of csi_dispatch will likely change)
  2. Ease of use for the new approach - I'll probably start experimenting with the slice ranges and write some tests to get a feeling of how ergonomic it would be to use (see the playground link I provided earlier, except I later realized there is no value in the handler being generic). As for the allocation, I don't believe there is currently a way to make a fixed-length array of non-Copy types elegantly (Range is not Copy, because it implements Iterator).
  3. Ease of use for interpreting inputs that use both colon separated params and semicolon separated params - There shouldn't be much of a cognitive overhead if the user has to accept for example colors in the semicolon separated format. What I mean by this is that if we were to change the API to expose a slice of slices, it might not be very ergonomic, because there wouldn't be a 1:1 correspondence between a slice and a command and the user would have to do some sort of lookahead over slices with one element each, or at least I can't imagine .

This leads me to believe that we have to pass the slice as we do now, and then some additional metadata, which in the end might not be the slice Range, but a custom struct that is Copy, Index and maybe something else.

I will experiment for a bit and open a PR once I feel I have a hint of a solution.

jerch commented

Some more remarks from my side:

Technically the params in the example above would translate into something like this (as "int arrays" in C):

{ { 1, 2, 2147483647 }, { 3, 4 }, { 5, -1 }, { -1, 6 } }

(we use -1 as placeholder for omitted values). While the old impl was just a simple array with index access to the values, this one needs to store the slice offsets somehow. After puzzling around and doing some benchmarks I ended up with these containers instead:

int params[32] = { 1, 3, 5, -1, ... };
int subparams[32] = { { 2, 2147483647 }, { 4 }, { -1 }, { 6 }, ... };

I basically pulled the first value of those subparam groups into the old params array we had before, and put excessive subparams in a new container (+ the slice offsets into a third one). I did not do this slightly more complicated model for backwards compatibility, still it would serve any sequence w'o subparams the old way. Main reason to do this was perf - reshaping everything to use slices was quite a dent in the perf, thus I optimized it for the usual case with no subparams.

I have not looked at your current impl, maybe you can restore parts of that idea as well. Note that the type exhibits the same params interface to functions as before, subparams are only "on top" (if the function cares at all for those).

That seems like a good idea, I will try to take inspiration from it.
However I see that the subparams are a 2d array (of course they are, that's the main point) and in that case I don't think we can avoid constructing temporary arrays when calling the user provided callbacks, as opposed to only passing slices of already constructed memory.

I'll see what I can come up with.

jerch commented

However I see that the subparams are a 2d array (of course they are, that's the main point)...

Yepp, well C plays "nicely" here and would just store those values in linear memory as { 2, 2147483647, 4, -1, 6, ... }, which is hard to grasp (I added the superfluous braces for readability).

The slice indices are stored separately like this (yes in C everything has to be done by hand ๐Ÿ˜ธ):

unsigned short subparam_idx[32] = { 0<<8 | 2, 2<<8 | 3, 3<<8 | 4, 4<<8 | 5, ... };
// where a slice is formed by (subparams_indices at param_idx):
// [ subparam_idx[param_idx] >> 8 .. subparam_idx[param_idx] & 0xFF [   (right exclusive)

which is a quite compact memory layout (with a hard coded limit of 256 subparams due to the short width). This bitpacking makes it slightly worse than without, but gets more than compensated due to fitting into one cache line. The whole subparams accounting can be skipped for usual sequences w'o any subparams, and since they tend to be less than 8 params per sequence, again fit into one cache line, which makes it pretty fast in the end.

Ofc this is made with C in mind, not sure if you can do it similar in rust, which comes with much nicer default types out of the box. In C at least always doing the slicing descent is much worse in runtime than treating the first value special by pulling it "in front".

I don't think we can avoid constructing temporary arrays when calling the user provided callbacks

Please note that VTE should work without std and this should have zero negative performance impact on existing Alacritty benchmarks.

I know, that comment was a supposed to indicate that the implementation will probably go a different way, but apparently I wasn't clear enough ๐Ÿ˜„