Ability to fall back through multiple fonts if required
smcv opened this issue · 6 comments
This is really a feature request/limitation, but I'll write the issue template as though it was a bug, because that makes it easier to see what's going on. It might be a duplicate of #338, but it isn't 100% clear to me what the scope of #338 is, so I'm erring on the side of opening a separate issue (which can easily be closed as a duplicate if necessary) instead of replying to an existing issue.
This was originally ValveSoftware/Source-1-Games#6058, but that issue report is clouded by implementation details of a workaround that the user had previously used, and speculation about it potentially being a bug/limitation of Steam Linux Runtime containers.
Steps to reproduce the limitation
- Have Google's Noto font family installed, for example `fonts-noto-core` on Debian/Ubuntu.
  - This includes a font named `Noto Sans` in variant `Regular`, which is the default font used when most modern Linux distros ask fontconfig for a `sans-serif` font. `Noto Sans` covers simple alphabet-based writing systems used in Europe (Latin, Greek, Cyrillic) but it does not cover e.g. Arabic, Hebrew, Sinhala or Thai.
  - It also includes fonts with names like `Noto Sans Hebrew` and `Noto Sans Sinhala`, again with a `Regular` variant. These fonts cover the extra glyphs required to write in non-European languages.
- Have some user-supplied text, in an unknown natural language.
  - In the original issue report that I'm adapting, the text is player nicknames and chat messages in Team Fortress 2, which could reasonably be in any language as spoken in any country.
  - In the original issue report that I'm adapting, the glyph that the user is using as their example is U+0D9E "SINHALA LETTER KANTAJA NAASIKYAYA" (a glyph used to write the Sinhala language as used in Sri Lanka, according to Wikipedia). This can be typed into a GNOME environment with the sequence `Ctrl+Shift+U, 0, d, 9, e, Space` (hopefully the same in other desktop environments like KDE Plasma, I haven't tried). You could probably get similar results with Arabic, Hebrew or Thai, but Arabic and Hebrew are right-to-left languages which have their own unique display issues, so using Sinhala as our example might be best.
- Format that text for display. The naive way to do this with SDL2 is something like what we do in the Steam Runtime's `steam-runtime-dialog-ui`:
  - ask Fontconfig for a list of fonts matching a desired family or pseudo-family alias: I used `sans-serif`, but I could have used `Noto Sans` (`ttf_load_font()` in https://gitlab.steamos.cloud/steamrt/steam-runtime-tools/-/blob/main/steam-runtime-tools/sdl-utils.c?ref_type=heads)
  - load the first matching font into SDL_ttf (same function)
  - use that to render text (e.g. `dialog_set_message()` in https://gitlab.steamos.cloud/steamrt/steam-runtime-tools/-/blob/main/bin/dialog-ui.c?ref_type=heads which calls `TTF_RenderUTF8_Blended_Wrapped()`)
I do not have access to TF2 source code, but I suspect it might be doing something rather similar.
You can try this on any Linux system with Steam installed by running:
~/.steam/root/ubuntu12_32/steam-runtime/amd64/usr/bin/steam-runtime-dialog-ui --info --text="$(cat ~/tmp/sinhala.txt)"
where `~/tmp/sinhala.txt` contains the desired text.
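If typing the glyph through an input method is inconvenient, the file contents can also be produced by encoding U+0D9E as UTF-8 directly. The following is a minimal sketch of that encoding; `utf8_encode` is a name made up for this illustration and is not part of SDL, SDL_ttf or the Steam Runtime.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative helper (not an SDL function): encode one Unicode code point
 * as UTF-8 into out. Returns the number of bytes written, or 0 if cp is a
 * UTF-16 surrogate or lies above U+10FFFF. */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);

    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;                       /* surrogates are not encodable */
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                           /* beyond the Unicode range */
}
```

For U+0D9E this yields the three bytes `E0 B6 9E`, so a file containing exactly those bytes is a one-glyph test case.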
Expected result
Ideally, if I try to render a glyph that is not available in `Noto Sans`, there should be a way to fall back to fetching that glyph from related fonts like `Noto Sans Sinhala`, or (as a last resort) from any installed font with no restrictions - displaying text in a style that doesn't match how we display Latin is better than displaying placeholder boxes.
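The fallback order described here can be sketched as a first-match search over an ordered list of fonts. Everything in this sketch is an assumption made for illustration: the `font_entry` type, the `pick_font` function, and especially the hard-coded ranges, which are Unicode block boundaries rather than real coverage data. A real implementation would take coverage from the fonts themselves, for example through Fontconfig or a query such as `TTF_GlyphIsProvided()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical coverage record: a single contiguous code point range per
 * font. Real fonts cover many ranges; one is enough to show the search. */
typedef struct {
    const char *family;
    uint32_t first;
    uint32_t last;
} font_entry;

/* Ordered from most to least preferred. The ranges are illustrative
 * assumptions based on Unicode block boundaries, not measured coverage. */
const font_entry fallback_chain[] = {
    { "Noto Sans",         0x0000, 0x052F },  /* Basic Latin .. Cyrillic Supplement */
    { "Noto Sans Hebrew",  0x0590, 0x05FF },  /* Hebrew block */
    { "Noto Sans Sinhala", 0x0D80, 0x0DFF },  /* Sinhala block */
};

/* Return the index of the first font in chain whose range contains cp,
 * or -1 if no font covers it (the case that ends up as "tofu"). */
int pick_font(const font_entry *chain, size_t n, uint32_t cp)
{
    assert(chain != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (cp >= chain[i].first && cp <= chain[i].last)
            return (int)i;
    }
    return -1;
}
```

With this table, `pick_font(fallback_chain, 3, 0x0D9E)` selects the Sinhala entry, while a Thai code point such as U+0E01 returns -1 and would need the "any installed font" last resort.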
Actual result
If the desired glyph is not available in `Noto Sans`, there is no straightforward way to fall back to `Noto Sans Sinhala`, and instead SDL_ttf renders a placeholder rectangle ("tofu").
Non-SDL implementation of expected result
To prove that it's possible:
- `zenity --info --text="$(cat ~/tmp/sinhala.txt)"`, where `~/tmp/sinhala.txt` contains the desired text
- Zenity uses GTK, which implements its text rendering with Fontconfig, Pango, Harfbuzz and Freetype. If GTK finds that the glyph it needs is not available in the first font that it tries, then it will try other fonts. (I think this might actually be a Pango feature and not a GTK feature.)
- I get the Sinhala glyph displayed.
If this is out-of-scope for SDL_ttf, then a SDL + Pango wrapper might be an alternative. Unfortunately, "the" SDL + Pango wrapper is stuck in SDL 1.2, with at least two competing forks updating it to SDL 2, and no SDL 3 version that I'm aware of: more information in https://discourse.libsdl.org/t/moving-sdl-pango-to-the-libsdl-org-github-organisation/30886/6.
SDL_ttf only knows about the font that you've provided, and only allows one font in a single text output pass. This definitely seems like something we should remedy in SDL_ttf 3.0
OK, so are you saying this is out-of-scope (working as intended) in SDL_ttf 2.0; but in-scope (actionable feature request) for SDL_ttf 3.0?
Yes, that's correct.
Having multiple fonts can solve "tofu", but it's not the only issue.
If you have a string like "some Latin chars ... some Arabic chars ...", you probably have to use multiple fonts, but also switch between left-to-right and right-to-left rendering.
Besides LTR/RTL, you may also want to change style (italic, bold, etc.), size, line breaking, hyphenation, or justification.
There was some earlier issue/patch work, but only using one font (and so not fixing the tofu problem): #66 and #135
I think this can be solved on top of the current SDL_ttf API, e.g.:
check whether all chars are renderable with a font (see `TTF_GlyphIsProvided()`), otherwise split the string and render it in multiple parts. This probably won't match the endless specs people are writing, but it has only minor caveats.
It also makes it possible to handle style/size/formatting variations.
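The split-and-render idea can be sketched as a pass that decodes the UTF-8 string and cuts it wherever the answer to "does this font have the glyph?" changes. The names `glyph_check`, `text_run`, `split_runs` and `covers_latin_greek_cyrillic` are hypothetical; the callback stands in for a real coverage query such as `TTF_GlyphIsProvided()`. The decoder assumes well-formed UTF-8 and does no validation, and the sketch ignores shaping, combining sequences and bidi, which are separate problems as noted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical coverage callback: nonzero if the primary font has cp. */
typedef int (*glyph_check)(uint32_t cp, void *userdata);

/* A run of consecutive bytes that share the same coverage answer. */
typedef struct {
    size_t start;   /* byte offset into the string */
    size_t len;     /* length in bytes */
    int covered;    /* 1: primary font can draw it, 0: needs a fallback */
} text_run;

/* Decode one code point at s and store its byte length in *n.
 * Assumption: the input is valid UTF-8, so no errors are detected. */
static uint32_t utf8_decode(const unsigned char *s, size_t *n)
{
    if (s[0] < 0x80) { *n = 1; return s[0]; }
    if ((s[0] & 0xE0) == 0xC0) {
        *n = 2;
        return ((uint32_t)(s[0] & 0x1F) << 6) | (s[1] & 0x3F);
    }
    if ((s[0] & 0xF0) == 0xE0) {
        *n = 3;
        return ((uint32_t)(s[0] & 0x0F) << 12) |
               ((uint32_t)(s[1] & 0x3F) << 6) | (s[2] & 0x3F);
    }
    *n = 4;
    return ((uint32_t)(s[0] & 0x07) << 18) | ((uint32_t)(s[1] & 0x3F) << 12) |
           ((uint32_t)(s[2] & 0x3F) << 6) | (s[3] & 0x3F);
}

/* Split text into at most max_runs runs and return how many were written.
 * If the array fills up, the remaining text is left unreported. */
size_t split_runs(const char *text, glyph_check has_glyph, void *userdata,
                  text_run *runs, size_t max_runs)
{
    const unsigned char *s = (const unsigned char *)text;
    size_t count = 0, pos = 0;

    assert(text != NULL && has_glyph != NULL && runs != NULL);

    while (s[pos] != '\0') {
        size_t n;
        uint32_t cp = utf8_decode(s + pos, &n);
        int covered = has_glyph(cp, userdata) ? 1 : 0;

        if (count > 0 && runs[count - 1].covered == covered) {
            runs[count - 1].len += n;   /* same answer: extend current run */
        } else {
            if (count == max_runs)
                break;                  /* out of space: stop early */
            runs[count].start = pos;
            runs[count].len = n;
            runs[count].covered = covered;
            count++;
        }
        pos += n;
    }
    return count;
}

/* Example stand-in predicate: pretend the primary font covers only
 * U+0000..U+052F. A real one would query the loaded font instead. */
int covers_latin_greek_cyrillic(uint32_t cp, void *userdata)
{
    (void)userdata;
    return cp <= 0x052F;
}
```

Each run could then be copied into its own NUL-terminated buffer and rendered with whichever font covers it, one surface per run, with the caller positioning the results side by side.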
interesting links:
Unicode Bidirectional Algorithm: https://www.unicode.org/reports/tr9/
What HarfBuzz doesn't do: https://harfbuzz.github.io/what-harfbuzz-doesnt-do.html
@smcv, I'm updating SDL_ttf for SDL 3.0 now, and this is one of the two big things left. I think solving the multiple font problem can be done independently of the other things @1bsyl raised above.
Here's some info on how you can handle multiple fonts with harfbuzz:
https://tex.stackexchange.com/questions/520034/fallback-for-harfbuzz-fonts