libsdl-org/SDL_ttf

Complex text layout support

SDLBugzilla opened this issue · 5 comments

This bug report was migrated from our old Bugzilla tracker.

These attachments are available in the static archive:

Reported in version: unspecified
Reported for operating system, platform: All, All

Comments on the original bug report:

On 2015-12-22 06:36:59 +0000, anood almuharbi wrote:

Created attachment 2342
patch

This patch adds support for languages that require "complex text layout" (https://en.wikipedia.org/wiki/Complex_text_layout).

We are using "libraqm" (https://github.com/HOST-Oman/libraqm), a small source code-only library that wraps FriBidi (for bidirectional text support) and HarfBuzz (for text shaping), and does proper BiDi and script itemization.

The CTL support is enabled by default but can be disabled at compiling time, and we provide a fallback function that uses your original code without CTL support.

On 2015-12-22 10:41:02 +0000, Sylvain wrote:

Hi,

You might be interested by this patch:
https://bugzilla.libsdl.org/show_bug.cgi?id=3046

This is SDL_ttf on HarfBuzz + FreeType.
But there is no dependency to libraqm neither FreeBidi.

It's up to the user to do BiDi, ie to reverse the string if needed, inverse parenthesis, so on.
For instance, for RTL language, I only reverse the string, then tell harfbuzz to shape as LTR, and let SDL_tff renders naturally as LTR. It works correctly.

Also, I notice, it was important to be able to tell the "script" used to HarfBuzz, otherwise the text shaping was incomplete.
Cheers,

On 2015-12-22 11:21:52 +0000, Khaled Hosny wrote:

(In reply to Sylvain from comment # 1)

Hi,

You might be interested by this patch:
https://bugzilla.libsdl.org/show_bug.cgi?id=3046

This is SDL_ttf on HarfBuzz + FreeType.
But there is no dependency to libraqm neither FreeBidi.

It's up to the user to do BiDi, ie to reverse the string if needed, inverse
parenthesis, so on.

I think it is better to do BiDi in one place i.e in SDL_ttf, experience have shown that most developers don’t have enough experience to do it correctly.

For instance, for RTL language, I only reverse the string, then tell
harfbuzz to shape as LTR, and let SDL_tff renders naturally as LTR. It works
correctly.

This won’t work for Arabic or any RTL language that require shaping, so I’m surprised it worked for you.

Also, I notice, it was important to be able to tell the "script" used to
HarfBuzz, otherwise the text shaping was incomplete.

True, and libraqm does this. It handles script itemisation and resolve the script of common characters, and even handling paired characters.

On 2015-12-22 11:25:19 +0000, Khaled Hosny wrote:

Further more, linraqm is a source only library so it adds no dependency of its own since it should be included with SDL_ttf. We wrote it as a collection of common code we are using to added complex text support to several projects.

On 2015-12-22 12:33:34 +0000, Sylvain wrote:

For RTL, it works.
but this is the sum of 3 awkward things :
reverse the string buffer.
ask hb to shape for LTR
SDL_ttf naturally draw as LTR.

I don't really see where you set the HB_SCRIPT_. E.g. HB_SCRIPT_LATIN, HB_SCRIPT_ARABIC, HB_SCRIPT_HAN, HB_SCRIPT_HEBREW, etc.
without this, I am absolutely sure I got incomplete text shaping. Maybe not for latin, arabic, but for some there was missing glyph substitutions.

Anyway, I would also prefer to have everything in SDL_tff ...

On 2015-12-22 17:46:20 +0000, Khaled Hosny wrote:

(In reply to Sylvain from comment # 4)

For RTL, it works.
but this is the sum of 3 awkward things :
reverse the string buffer.
ask hb to shape for LTR
SDL_ttf naturally draw as LTR.

I’m pretty sure this would have several problems, it might seem to work at the surface but it is not the way how RTL text should be passed to HarfBuzz and will fail in many situations.

I don't really see where you set the HB_SCRIPT_. E.g. HB_SCRIPT_LATIN,
HB_SCRIPT_ARABIC, HB_SCRIPT_HAN, HB_SCRIPT_HEBREW, etc.
without this, I am absolutely sure I got incomplete text shaping. Maybe not
for latin, arabic, but for some there was missing glyph substitutions.

The code in itemize_by_script does the script detection, and we then use it in harfbuzz_shape.

On 2015-12-23 14:52:47 +0000, Sylvain wrote:

I agree with you there should be issues (like context pattern substitution that wouldn't match) but I think I doubled check my renderings when I wrote that, and I just found it was a good workaround for some other issue.

In doubt, I changed to the correct way. Not a single difference in rendering! pixels are equals.

In fact, this is even easier:

For a plain RTL string, just set the direction as RTL to HarfBuzz, and SDL_ttf+HarfBuzz is just able to render it browsing from 0 to len-1 the "hb_glyph_info_t".

On top of SDL_ttf, I have also changed a little bit my thin layer that allows to render strings containing both RTL + LTR.

I think, there is another issue in text rendering that would be important to take care of.
Many times, the string to render cannot be rendered with only one font.
It can also happens with RTL. For instance, you have a RTL string that contains some latin LTR sub-string.
Assuming the RTL font is not able to render the LTR string.

If SDL_ttf handles the shaping of bi-direction text support, it would be also important to render from a set of fonts.

On 2015-12-23 15:38:15 +0000, Khaled Hosny wrote:

(In reply to Sylvain from comment # 6)

I agree with you there should be issues (like context pattern substitution
that wouldn't match) but I think I doubled check my renderings when I wrote
that, and I just found it was a good workaround for some other issue.

In doubt, I changed to the correct way. Not a single difference in
rendering! pixels are equals.

Try an RTL string containing parenthesis (or other characters with Bidi_Mirroring property), they will not be mirrored if text is shaped LTR. Mirroring them in the input string is not enough as it will not handle characters that has no encoded mirror in Unicode, and there is also features like ltra/rtla and ltrm/rtlm that depend on passing the correct direction to HarfBuzz.

I think, there is another issue in text rendering that would be important to
take care of.
Many times, the string to render cannot be rendered with only one font.
It can also happens with RTL. For instance, you have a RTL string that
contains some latin LTR sub-string.
Assuming the RTL font is not able to render the LTR string.

If SDL_ttf handles the shaping of bi-direction text support, it would be
also important to render from a set of fonts.

Current SDL_ttf API works with single font at a time, I agree that supporting font fallback is important, but it will require a new API and an entirely different set of dependencies (if one wants to integrate with system font management APIs).

On 2015-12-28 05:12:07 +0000, anood almuharbi wrote:

Created attachment 2348
new patch

I have attached new patch , that fix small issue in the fall back function

On 2015-12-28 19:47:10 +0000, Philipp Wiesemann wrote:

Instead of malloc()/free() the wrappers SDL_malloc()/SDL_free() should be used (to improve portability).

On 2015-12-30 05:22:55 +0000, anood almuharbi wrote:

Created attachment 2352
new patch

I have added a new patch , to replace free() and malloc() with SDL_free() and SDL_malloc().

On 2015-12-30 05:24:00 +0000, anood almuharbi wrote:

(In reply to Philipp Wiesemann from comment # 9)

Instead of malloc()/free() the wrappers SDL_malloc()/SDL_free() should be
used (to improve portability).

Thank you sir :)

On 2016-01-05 10:38:35 +0000, anood almuharbi wrote:

Created attachment 2353
new patch

I have uploaded a new patch , I forgot to add raqm.c and raqm.h in the previous patch.

On 2016-01-05 10:52:00 +0000, anood almuharbi wrote:

Created attachment 2354
new patch

added a new patch to support changes in libraqm.

On 2016-01-24 08:48:35 +0000, anood almuharbi wrote:

Created attachment 2368
new patch

We are using now a standalone library 'Raqm', that wraps FriBidi (for bidirectional
text support) and HarfBuzz (for text shaping), and does proper BiDi and scrip itemization.

Raqm support is enabled by default but can be disabled at compiling time.

https://github.com/HOST-Oman/libraqm

On 2016-01-26 21:43:39 +0000, Philipp Wiesemann wrote:

The function text_layout() from the patch calls exit(). This makes it more difficult for applications to handle errors (e.g. show a message or save files before abort).

On 2016-01-27 07:09:11 +0000, anood almuharbi wrote:

Created attachment 2369
new patch

Replace exit with return.

On 2016-01-27 07:10:21 +0000, anood almuharbi wrote:

(In reply to Philipp Wiesemann from comment # 15)

The function text_layout() from the patch calls exit(). This makes it more
difficult for applications to handle errors (e.g. show a message or save
files before abort).

You are right :), thanx.

On 2016-02-26 18:08:21 +0000, Khaled Hosny wrote:

Is there any interest from SDL_ttf maintainer(s) in this work, or what be the better course of action is to work out some other way (a different library?) to support complex text scripts in SDL?

On 2018-09-10 08:43:13 +0000, wrote:

The patch still has a bunch of issue. Namely it mixes up char/index when accessing the cache thus leading incorrect characters being displayed and it doesn't free raqm_t properly.

I attempted to fix those issue:

SuperTux/SDL_ttf@be4e166

SuperTux/SDL_ttf@25588c5

But given that there is still a bunch of Find_Glyph() in the code further fixing might be needed.

On 2018-10-19 13:15:38 +0000, Sylvain wrote:

I have also integrated a text shaping library, and I have extracted the part where I replaced the SDL_ttf Glyph cache system by an Index cache system.

I proposed the patch as I thought it was reducing the number of collision (hence an improvement) but that was a mistake. In the end, it's similar.

Nevertheless, the Index cache system is required to run a text shaping engine.
So you can have a look, maybe it will help you to fix your issue (bug 4307).

@1bsyl, is this resolved with the current code in the repo?

@slouken not the OP, but the current code works for me for rendering Arabic text in RTL mode when compiled with HarfBuzz support. I just make the following calls before calling other SDL_ttf function:

TTF_SetDirection(HB_DIRECTION_RTL); /* from HarfBuzz hb_direction_t */
TTF_SetScript(HB_SCRIPT_ARABIC); /* from HarfBuzz hb_script_t */

And to go back to normal LTR rendering I do the reverse:

TTF_SetDirection(HB_DIRECTION_LTR);
TTF_SetScript(HB_SCRIPT_COMMON);

Unlike the bug comments, I did not need to do reversing of anykind.

Great, thanks!
I'll close this for now, please let me know if more work needs to be done in the future.

1bsyl commented

Thanks!

@slouken
Btw, the casting isn't correct in SDL_ttf.h
It uses 'int' to compile even if SDL_ttf isn't build with harfbuzz
Not sure if we should change it to hb_direction_t / hb_script_t

334 extern DECLSPEC int SDLCALL TTF_SetDirection(int direction); /* hb_direction_t */
335 extern DECLSPEC int SDLCALL TTF_SetScript(int script); /* hb_script_t */

No, because those aren't defined. If you wanted to do something tricky, you could #ifdef based on whether the harfbuzz header is included, but I'd rather leave it as-is and let application code cast as needed.