Inconsistent rendering of double-width characters and ligatures
Lokaltog opened this issue ยท 25 comments
Some fonts like PragmataPro supports double-width glyphs and ligatures that occupy 5+ columns in the terminal. Strings like [ERROR]
have ligatures that converts the string into a wide glyph. While certain ligatures like !=
, <=
and [ERR]
render just fine, other ligatures and glyphs don't render properly.
As you can see double-width glyphs (some of the symbols in the statusline) are rendered at 50% of their size. The [ERR]
ligature renders fine, but [ERROR]
is rendered at a reduced size and clipped.
I noticed #179 which states that kitty requires a monospaced font. PragmataPro comes in two versions where the version with ligatures is non-monospaced, and the version without ligatures is monospaced. Based on this I'm guessing that proper monospaced fonts can't have glyphs spanning multiple columns, but kitty still renders most ligatures correctly for non-monospaced fonts.
- Is this is the expected behavior?
- Are there any possible workarounds?
- Where is the glyph scale determined (the code that determines that e.g.
[ERROR]
is rendered so small)?
I'd be happy to take a look at this issue and see if I can come up with a solution if you could point me in the right direction (and if this is a feature that you'd like to include in the first place).
A unicode character is rendered as double width if and only if wcwidth() returns its width as 2. Otherwise there is no way for the program running inside the terminal to know how many cells a given character will take. There is also a limit to how many characters a ligature can have. IIRC in kitty it is a total of five. Anything over that will not render correctly. This is for performance reasons.
The code to render ligatures is in fonts.c and freetype.c look for look for shape_run() as the starting point. You can also test rendering easily without needing to run all of kitty by using
kitty -c "from kitty.fonts.render import *; showcase()"
Change the fonts/strings in showcase() to whatever you want.
Finally, kitty is not supposed to use non-monospaced fonts at all, i.e. the font selection logic will ignore non-monospaced fonts. This is because kitty achieves its performance by storing pre-rendered glyphs of characters on the GPU. Non-monospaced fonts dont render well in fixed size rectangles (i.e. cells).
I see, thanks for the reply. I think that a better way to handle this could be to completely ignore ligatures for strings of more than five characters instead of rendering them at the wrong scale like the screenshots above. But I don't know how font rendering is done behind the scenes, and if this is even possible.
I'll check out the code and see if I can find a workaround for this (other than manually removing glyphs that don't render correctly from the font).
Regarding the rendering of double-width characters, could it be a possibility to add a config option to "extend" the wcwidth tables for non-standard fonts? PragmataPro includes icons and glyphs that are double width but not in the correct unicode ranges according to wcwidth. Something like:
font_render_double_width 0xE000-0xE00A, 0xE200-0xE2AF
What is the largest ligature in your font? i.e. how many characters does it have? It could be possible to kjust extend the maximum number of ligatures in kitty from 5 to say 9, without too much of a performance hit.
The problem is if you override wcwidth for a character then the program running in the terminal will think the character is 1 cell wide, while the cursor in the terminal emulator will move two cells for the character. This can lead to all sorts of problems. What you can do is mark some characters as ligatures, so then they and the next character are rendered together as though they were an actual ligature. As long as you are careful to only use these characters followed by a space, there should be no issue.
So you can either ask the font designer to create calt based ligatures for these characters -- then <character + space> becomes a ligature containing just the glyph, just like -> becomes a ligature contianing just the arrow glyph. Or you can add code to kitty to impose these ligatures on all fonts via a config option. Something like
The longest ligature in PragmataPro appears to be 9 characters ([WARNING]
). If it's possible to increase the limit to 9 characters without a performance hit then that would be a nice fix for this issue. It should be noted that this is an edge case and I don't know about other fonts that feature such ligatures, so if it has a noticeable impact on performance then I don't think it's worth increasing the character limit.
I like your idea to create ligatures for the double-width glyphs, that would probably be the cleanest and most compatible solution to this issue (other terminals like Konsole also have unexpected behavior when rendering these double-width glyphs). I'll do some testing and contact the font author about this issue to see if it can be fixed in the font itself instead of adding a hack in kitty.
Note that currently ligatures involving the space character will not work as kitty assumes spaces are not aprt of any ligature sequence as a micro-optimization. This can easily be fixed by removing the
case ' ':
line from font_for_cell. So if you can convince the font designer to implement this ligature, then remember to remove that line before testing in kitty.
I have committed the increase to 9 char ligatures, however, I have no way to test it.
Awesome! Thanks for the heads up regarding the ligature sequences. I just tested the latest commit and the 9 character ligature works perfectly now.
I'm trying to use https://github.com/kiliman/operator-mono-lig with kitty but am running into a similar sounding issue to this one. You'll need to have the operator mono font (not free) to use that repo to generate the ligature enabled version if you want to try to reproduce this issue.
I'm testing with the sequence ->
but finding that kitty renders this as a single width character rather than the double width ligature that we see with, for example, Fira Code.
I'm no expert on this topic, but did render this font in my browser and saw the correct width, and I opened up the font in fontforge and the ligature appears to be double width, which makes me believe that there is probably some additional special case that could be added to kitty to make this happen there too.
I've added some debug prints and adjusted the showcase method that you mentioned above and here is a screenshot:
The top half of this shows the ligature enabled version of the font, and the bottom half is the non-ligature version. You can also see in that screen shot how the ligature arrows render in the terminal instance; look for text= ->
. You'll notice that the vertical positioning looks a bit low also, presumably because it is effectively being scaled without also scaling its vertical position?
I found that if I unconditionally increment the max_num_cells
value computed by next_group
, this glyph renders correctly in the showcase
, but I don't understand the code well enough to know how to adjust the implementation of that function to do this in the correct circumstances. I don't mind poking at this some more, but would appreciate some guidance/context from you on where and how to make this change.
Sorry, I'm confused. max_num_cells is an input parameter, do you mean num_group_cells? Your issue is most likely caused by is_special_glyph failing, i.e. returning false for the ligature glyph. Add some prints there and compare with fira code.
Strike that, looking at your debug prints, the problem is more likely the cluster numbers. Check those.
I got hold of this font, and basically, it cannot be used in kitty, it needs to be fixed. Correctly generated monospace fonts implement ligatures as dummy glyph, so a sequence of n characters becomes n-1 dummy glyphs and one real glyph. Without that, there is no reliable way to know how many cells a given sequence should take, one can try using heuristics based on glyph size, but that is by nature inaccurate. For this font, there are no dummy glyphs which is why it is rendered in one cell. You can test this by comparing the output of
hb-shape --cluster-level=1 --show-extents OperatorMonoLig-Book.otf '==='
[equal_equal_equal.liga=0+1650<78,564,1494,-513>]
with FiraCode
hb-shape --cluster-level=1 --show-extents FiraCode-Regular.otf '==='
[LIG=0+600<0,0,0,0>|LIG=1+600<0,0,0,0>|equal_equal_equal.liga=2+600<-1094,538,1588,-452>]
You should report the problem to the font creator.
Actually, thinking about it, there may be a way to workaround this. Stay tuned.
Hmm, that fix is incomplete, the algorithm needed is a little more subtle than I thought.
Yeah, I think there also needs to be an adjustment to the next_group
function. My test case for this is with ->b
. In vim when c++ syntax highlighting is enabled, this renders as -b
; the positioning isn't correct and I think that we render the b
over the top of the >
; the rest of the line has incorrect positioning also and this leads to confusion while editing it.
I believe that the issue is that the "Soak up a number of codepoints" portion of the code is being too greedy for this case. I added some debug prints:
--- a/kitty/fonts.c
+++ b/kitty/fonts.c
@@ -632,15 +632,23 @@ next_group(Font *font, unsigned int *num_group_cells, unsigned int *num_group_gl
while(nglyphs < glyph_limit && ncells < cell_limit) {
glyph_index glyph_id = info[nglyphs].codepoint;
cluster = info[nglyphs].cluster;
+ printf("next_group: codepoint=%x nglyphs=%u cluster=%u (prev:%u)\n", info[nglyphs].codepoint, ngl
yphs, cluster, previous_cluster);
is_special = is_special_glyph(glyph_id, font, &cell_data);
if (prev_was_special && !is_special) break;
glyphs_in_group[nglyphs++] = glyph_id;
// Soak up a number of codepoints indicated by the difference in cluster numbers.
if (cluster > previous_cluster || nglyphs == 1) {
n = nglyphs == 1 ? 1 : cluster - previous_cluster;
+ printf("n=%u. ncells=%u max_num_cells=%u\n", n, ncells, max_num_cells);
unsigned int before = ncells;
- while(n-- && ncells < max_num_cells) ncells += check_cell_consumed(&cell_data, last_cell);
- if (ncells > before && !is_special) break;
+ while(n-- && ncells < max_num_cells) {
+ printf("iter: cell_data.current_codepoint=%x\n", cell_data.current_codepoint);
+ ncells += check_cell_consumed(&cell_data, last_cell);
+ }
+ printf("ncells=%u before=%u is_special=%d\n", ncells, before, is_special);
+ if (ncells > before && !is_special) {
+ break;
+ }
}
previous_cluster = cluster;
prev_was_special = is_special;
@@ -689,7 +697,15 @@ shape_run(Cell *first_cell, index_type num_cells, Font *font) {
glyph_index first_glyph;
while(run_pos < num_glyphs && cell_pos < num_cells) {
first_glyph = next_group(font, &num_group_cells, &num_group_glyphs, first_cell + cell_pos, info +
run_pos, num_glyphs - run_pos, num_cells - cell_pos, &extra_glyphs);
- /* printf("Group: num_group_cells: %u num_group_glyphs: %u\n", num_group_cells, num_group_glyphs)
; */
+ // Ensure we use up all cells, this is needed, for instance, for
+ // ligature fonts that do not use dummy glyphs (an example of such a
+ // font is Operator Mono Lig)
+ printf("Group: run_pos=%u num_group_cells:%u num_group_glyphs=%u num_glyphs=%u num_cells=%u cell_pos=%u\n",
+ run_pos, num_group_cells, num_group_glyphs, num_glyphs, num_cells, cell_pos);
+ if (run_pos + num_group_glyphs >= num_glyphs) {
+ num_group_cells = MAX(num_group_cells, num_cells - cell_pos);
+ }
+ printf("Group: num_group_cells: %u num_group_glyphs: %u\n", num_group_cells, num_group_glyphs);
render_group(num_group_cells, num_group_glyphs, first_cell + cell_pos, info + run_pos, positions + run_pos, font, first_glyph, &extra_glyphs);
run_pos += num_group_glyphs; cell_pos += num_group_cells;
}
Note that the rendered text in the showcase looks ok; you need to run a shell kitty and type in ->b
to see the misbehavior in action; you'll see the ligatured arrow render as you type ->
, but when you press b
it renders over the arrow and leaves a gap where the b
should have been placed:
I believe that (for this font, not sure about all other cases), we need to infer that we didn't get an explicit glyph with cluster=1 on the second iteration of next_group
when we see cluster=2 and that we should increment ncells
and break out before we increment nglyphs
. In other words, I think that we should see two groups for this sequence with this font; the first should be 1 glyph with 2 cells (->
), and the second 1 glyph and 1 cell (b
).
I'm not sure exactly how to encode that logic in the loop; it seems like there are some special cases and I'm no font rendering expert :)
Basically the loop has to be re-written to not use next_group and instead split into groups remembering previous special/cluster states. I'll get to it when I have a moment.
in case it is useful, here's what I came up with (didn't see your most recent comment until now); it seems to have the right behavior in my limited testing so far. Most of this is comments and debug; the gist of it is that nglyphs
is incremented after looking at the cluster differences, so that the count is correct... for this font :-p
--- a/kitty/fonts.c
+++ b/kitty/fonts.c
@@ -628,20 +628,45 @@ next_group(Font *font, unsigned int *num_group_cells, unsigned int *num_group_gl
Cell *last_cell = cells + max_num_cells;
unsigned int cell_limit = MIN(max_num_cells, LIMIT + 1), glyph_limit = MIN(LIMIT, max_num_glyphs);
bool is_special, prev_was_special = false;
-
+ printf("next_group: enter with cells=%p\n", (void*)cells);
while(nglyphs < glyph_limit && ncells < cell_limit) {
glyph_index glyph_id = info[nglyphs].codepoint;
cluster = info[nglyphs].cluster;
is_special = is_special_glyph(glyph_id, font, &cell_data);
- if (prev_was_special && !is_special) break;
- glyphs_in_group[nglyphs++] = glyph_id;
+ printf("next_group: codepoint=%x nglyphs=%u cluster=%u (prev:%u) "
+ "is_special=%d\n",
+ info[nglyphs].codepoint, nglyphs, cluster, previous_cluster,
+ is_special);
// Soak up a number of codepoints indicated by the difference in cluster numbers.
- if (cluster > previous_cluster || nglyphs == 1) {
- n = nglyphs == 1 ? 1 : cluster - previous_cluster;
+ glyphs_in_group[nglyphs] = glyph_id;
+ if (cluster > previous_cluster || nglyphs == 0) {
+ n = nglyphs == 0 ? 1 : cluster - previous_cluster;
+ printf("n=%u. ncells=%u max_num_cells=%u\n", n, ncells, max_num_cells);
+
+ if (n > 1) {
+ // We didn't see (n-1) cluster values, so those were presumably
+ // associated with the prior glyph.
+ n--;
+ is_special = 0;
+ printf("charging %d against prior cell\n", n);
+ }
+
unsigned int before = ncells;
- while(n-- && ncells < max_num_cells) ncells += check_cell_consumed(&cell_data, last_cell);
- if (ncells > before && !is_special) break;
+ while(n-- && ncells < max_num_cells) {
+ printf("iter: cell_data.current_codepoint=%x\n", cell_data.current_codepoint);
+ ncells += check_cell_consumed(&cell_data, last_cell);
+ }
+ printf("ncells=%u before=%u is_special=%d\n", ncells, before, is_special);
+ if (ncells > before && !is_special) {
+ printf("breaking before %d > %d and !is_special\n", ncells, before);
+ break;
+ }
}
+ if (prev_was_special && !is_special) {
+ printf("break out because prev_was_special and !is_special\n");
+ break;
+ }
+ nglyphs++;
previous_cluster = cluster;
prev_was_special = is_special;
}
@@ -687,9 +712,18 @@ shape_run(Cell *first_cell, index_type num_cells, Font *font) {
unsigned int run_pos = 0, cell_pos = 0, num_group_glyphs, num_group_cells;
ExtraGlyphs extra_glyphs;
glyph_index first_glyph;
+ printf("shape_run num_glyphs=%u, num_cells=%u\n", num_glyphs, num_cells);
while(run_pos < num_glyphs && cell_pos < num_cells) {
first_glyph = next_group(font, &num_group_cells, &num_group_glyphs, first_cell + cell_pos, info + run_pos, num_glyphs - run_pos, num_cells - cell_pos, &extra_glyphs);
- /* printf("Group: num_group_cells: %u num_group_glyphs: %u\n", num_group_cells, num_group_glyphs); */
+ // Ensure we use up all cells, this is needed, for instance, for
+ // ligature fonts that do not use dummy glyphs (an example of such a
+ // font is Operator Mono Lig)
+ printf("\nGroup: run_pos=%u num_group_cells:%u num_group_glyphs=%u num_glyphs=%u num_cells=%u cell_pos=%u\n",
+ run_pos, num_group_cells, num_group_glyphs, num_glyphs, num_cells, cell_pos);
+ if (run_pos + num_group_glyphs >= num_glyphs) {
+ num_group_cells = MAX(num_group_cells, num_cells - cell_pos);
+ }
+ printf("Group: num_group_cells: %u num_group_glyphs: %u\n", num_group_cells, num_group_glyphs);
render_group(num_group_cells, num_group_glyphs, first_cell + cell_pos, info + run_pos, positions + run_pos, font, first_glyph, &extra_glyphs);
run_pos += num_group_glyphs; cell_pos += num_group_cells;
}
I committed the fix to the oplig branch. It passes all existing tests, but probably needs much more testing.
I'll run with it and let you know if I see anything weird; thanks!
Interesting discussion. I'm the one who created the ligatures for Operator Mono. I'm not a font expert and I'm simply generating ligatures using the Glyphs app (glyphsapp.com) automatic ligature support.
My understanding is that in OpenType, there is a ligature lookup table and when the renderer sees a particular sequence, it simply substitutes the glyphs for the ligature glyph. In this case the sequence hyphen + greater will be replaced with the hyphen_greater.liga
glyph. NOTE: All ligatures were designed to have a width equal to the sequence glyph widths (since this was a monospace font). So the hyphen_greater.liga
glyph was 2 characters wide.
I'm not really sure what this project does, or what the issue with the Operator Mono Ligature font is. Does it work properly with Fira Code? I know that Fira Code uses some advanced tricks with OpenType like caret spacing and dealing with long sequences of characters, like +++++ ======. See kiliman/operator-mono-lig#35 and kiliman/operator-mono-lig#37.
Anyway, I will try to help out if possible.
Thanks!
@kiliman I'm also not an expert, but TBH, it sounds like your ligatures are implemented the right way. I don't think you need to change the way you do them as kitty now has support for that approach. I've been running it while working this afternoon and haven't noticed anything weird so far :-)
@kiliman In most monospace fonts, a ligature of n characters is implemented as n-1 dummy glyphs of zero size with fixed advances and one final combined glyph with a negative left side bearing. This allows for things like cursor movement over parts of the ligature. Now in kitty I have implemented code that deals with the single glyph approach to using ligatures that you use, so as far as kitty is concerned there is no need for you to do anything.
You actually have a font called "DejaVuSansMono Nerd Font Mono"? You need to pick a font that has an italic variant that has the same width characteristics as the regular variant, otherwise italic characters will not fit.