libsdl-org/SDL_ttf

[Feature request] More text-shaping functionality.

LucidSigma opened this issue · 7 comments

Font language.

This library currently supports setting a font's script (via TTF_SetFontScriptName) and text direction (via TTF_SetFontDirection) to assist text shaping with HarfBuzz. However, HarfBuzz has another shaping setting: language, which is set with hb_buffer_set_language.

A text's language is separate from its script. For example: Arabic, Farsi, Urdu, and Sindhi all use an Arabic script, but have different alphabets and shaping rules.

I suggest the following API (to be consistent with the aforementioned functions):

/*
    font: the font to specify the language for.
    language_bcp47: a null-terminated string containing the desired language's BCP47 code.
    returns: 0 on success, -1 on error.
*/
int TTF_SetFontLanguage(TTF_Font *font, const char *language_bcp47);
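
To make the proposed contract concrete, here is a minimal sketch of such a setter, assuming the font simply stores the tag until shaping time. `DemoLangFont` and `Demo_SetFontLanguage` are hypothetical stand-ins, not SDL_ttf symbols, and the 35-character limit is an arbitrary choice for this sketch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the shaping state a TTF_Font might carry. */
typedef struct DemoLangFont {
    char language[36]; /* BCP 47 tag such as "ar", "fa", "ur" or "sd", plus NUL */
} DemoLangFont;

/* Stores the tag on the font. Returns 0 on success, -1 on error. */
int Demo_SetFontLanguage(DemoLangFont *font, const char *language_bcp47)
{
    size_t len;

    if (font == NULL || language_bcp47 == NULL) {
        return -1;
    }
    len = strlen(language_bcp47);
    if (len == 0 || len >= sizeof(font->language)) {
        return -1; /* empty or too long for this sketch's buffer */
    }
    memcpy(font->language, language_bcp47, len + 1);

    /* With HarfBuzz available, the shaping path could then do roughly:
     *   hb_buffer_set_language(buf, hb_language_from_string(font->language, -1));
     */
    return 0;
}
```

A failed call leaves the previously stored tag untouched, so a caller can try a tag and fall back without resetting the font.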

Font beginning/end flags.

Some more advanced shaping features can be set using hb_buffer_set_flags. The list of available flags can be found here. As mentioned, most of these are advanced and out of scope for SDL_ttf; however, I do feel that HB_BUFFER_FLAG_BOT and HB_BUFFER_FLAG_EOT could be implemented in SDL_ttf. These flags indicate that the shaping buffer's text is at the beginning/end of a paragraph (as some languages have specific text-shaping rules in these contexts).

Similarly to the previous suggestion, here is what the API could look like:

/*
    font: the font to specify the beginning flag for.
    is_beginning: 0 to specify not at beginning of paragraph, non-zero otherwise.
*/
void TTF_SetFontBeginningOfParagraph(TTF_Font *font, int is_beginning);

/*
    font: the font to specify the end flag for.
    is_end: 0 to specify not at end of paragraph, non-zero otherwise.
*/
void TTF_SetFontEndOfParagraph(TTF_Font *font, int is_end);
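
As a rough illustration of how these two setters could be backed, the sketch below keeps one bitmask per font and toggles a single bit in each call. The names are hypothetical stand-ins. The bit values mirror HB_BUFFER_FLAG_BOT (0x1) and HB_BUFFER_FLAG_EOT (0x2), so the mask could in principle be passed on to hb_buffer_set_flags.

```c
#include <stddef.h>

/* Hypothetical constants; values mirror HB_BUFFER_FLAG_BOT / HB_BUFFER_FLAG_EOT. */
#define DEMO_FLAG_BOT 0x1u
#define DEMO_FLAG_EOT 0x2u

/* Hypothetical stand-in for the relevant part of a TTF_Font. */
typedef struct DemoParaFont {
    unsigned int shaping_flags;
} DemoParaFont;

/* Sets or clears one bit without disturbing the others. */
static void demo_toggle(DemoParaFont *font, unsigned int bit, int enabled)
{
    if (font == NULL) {
        return;
    }
    if (enabled) {
        font->shaping_flags |= bit;
    } else {
        font->shaping_flags &= ~bit;
    }
}

void Demo_SetFontBeginningOfParagraph(DemoParaFont *font, int is_beginning)
{
    demo_toggle(font, DEMO_FLAG_BOT, is_beginning);
}

void Demo_SetFontEndOfParagraph(DemoParaFont *font, int is_end)
{
    demo_toggle(font, DEMO_FLAG_EOT, is_end);
}
```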

Alternatively, we could also create a TTF_ShapingFlags enumeration that mimics the hb_buffer_flags_t enumeration and have a function such as TTF_SetFontShapingFlags(TTF_Font *font, TTF_ShapingFlags flags) to set the flags (although this enumeration will have to be updated with each HarfBuzz update to ensure congruence).

I am willing to do a pull request adding this functionality if this is something you wish to add to the library and once the API is solidified.

1bsyl commented

If this does better shaping, then yes, we should provide this.
I can create a PR

1bsyl commented

Using the correct language (in addition to the script) slightly changes the glyphs/positions for some languages (and language is said to be orthogonal to script).

"Languages are crucial for selecting which OpenType feature to apply to the buffer which can result in applying language-specific behaviour. Languages are orthogonal to the scripts, and though they are related, they are different concepts and should not be confused with each other."

However, I saw no change when testing the beginning/end-of-paragraph flags.

Awesome! Thanks for doing that.

I did a bit more research into the beginning/end flags. It turns out that the only real use for them (that I could find in the HarfBuzz source) was if a combining character was used at the start of the buffer. It also depended on the HB_BUFFER_FLAG_DO_NOT_INSERT_DOTTED_CIRCLE flag as well. I was under the impression that some languages had certain shaping rules in such contexts, but this doesn't seem to be the case.

After seeing this, I feel that supporting such flags would be a bit too niche and bespoke for SDL_ttf (especially if it adds new symbols to the API). If someone really needs to set the buffer flags, then they're most likely better off just using HarfBuzz directly.

1bsyl commented

OK, that makes sense. If you cut the buffer in the middle of a cluster (perhaps because of a fixed buffer size), the character cannot be rendered.
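
To illustrate the cut-in-the-middle-of-a-cluster case, the sketch below moves a proposed byte offset in a UTF-8 string backwards until it no longer splits a multi-byte sequence or separates a combining mark from its base character. It is a simplification under stated assumptions: `Demo_SafeCut` is a hypothetical helper, and it only recognises the Combining Diacritical Marks block (U+0300 to U+036F), whereas real cluster boundaries depend on the script and on the shaper.

```c
#include <stddef.h>

/* True for UTF-8 continuation bytes (10xxxxxx). */
static int demo_is_continuation(unsigned char b)
{
    return (b & 0xC0) == 0x80;
}

/* True if s begins with a code point in U+0300..U+036F
 * (encoded as CC 80..CC BF and CD 80..CD AF). */
static int demo_starts_with_combining_mark(const unsigned char *s, size_t len)
{
    if (len < 2) {
        return 0;
    }
    if (s[0] == 0xCC) {
        return s[1] >= 0x80 && s[1] <= 0xBF;
    }
    if (s[0] == 0xCD) {
        return s[1] >= 0x80 && s[1] <= 0xAF;
    }
    return 0;
}

/* Returns the largest offset <= cut at which the text can be split
 * without orphaning a combining mark (within this sketch's limits). */
size_t Demo_SafeCut(const char *text, size_t len, size_t cut)
{
    const unsigned char *s = (const unsigned char *)text;

    if (cut >= len) {
        return len;
    }
    /* Step back out of the middle of a multi-byte sequence. */
    while (cut > 0 && demo_is_continuation(s[cut])) {
        cut--;
    }
    /* Step back over combining marks so they stay with their base. */
    while (cut > 0 && demo_starts_with_combining_mark(s + cut, len - cut)) {
        cut--;
        while (cut > 0 && demo_is_continuation(s[cut])) {
            cut--;
        }
    }
    return cut;
}
```

For the four bytes of "ab" followed by U+0301, a requested cut at offset 2 or 3 is moved back to offset 1, so the accent stays in the same chunk as the "b" it modifies.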

Please give the SetLanguage API a try; it sounds like it should be added.

If you think these flags should be supported, then I'll implement them.

I think we should support all of the buffer flags that HarfBuzz provides, since some are coupled with each other (for example: HB_BUFFER_FLAG_BOT and HB_BUFFER_FLAG_DO_NOT_INSERT_DOTTED_CIRCLE).

I suggest the following API for them:

/**
 * Shaping flags
 *
 * \sa TTF_SetFontShapingFlags
 */
typedef enum
{
  TTF_SHAPING_FLAG_DEFAULT                        = 0x00000000,    /* Use default HarfBuzz buffer settings */
  TTF_SHAPING_FLAG_BEGINNING_OF_TEXT              = 0x00000001,    /* Input text contains beginning of a paragraph */
  TTF_SHAPING_FLAG_END_OF_TEXT                    = 0x00000002,    /* Input text contains end of a paragraph */
  TTF_SHAPING_FLAG_PRESERVE_DEFAULT_IGNORABLES    = 0x00000004,    /* Render characters with Default_Ignorable Unicode property instead of hiding them */
  TTF_SHAPING_FLAG_REMOVE_DEFAULT_IGNORABLES      = 0x00000008,    /* Remove characters with Default_Ignorable Unicode property instead of hiding them */
  TTF_SHAPING_FLAG_DO_NOT_INSERT_DOTTED_CIRCLE    = 0x00000010,    /* Do not render dotted circle for incorrect character sequences */
  TTF_SHAPING_FLAG_VERIFY                         = 0x00000020,    /* Perform various verification processes during shaping */
  TTF_SHAPING_FLAG_PRODUCE_UNSAFE_TO_CONCAT       = 0x00000040,    /* Indicates that if input text is changed on one side of the beginning of the cluster, then the shaping results for the other side might change (will incur a cost) */
  TTF_SHAPING_FLAG_PRODUCE_SAFE_TO_INSERT_TATWEEL = 0x00000080,    /* Allows tatweel/kashida to be produced during shaping */
  TTF_SHAPING_FLAG_ALL_DEFINED                    = 0x000000FF,    /* All defined shaping flags */
} TTF_ShapingFlags;

/**
 * Set buffer flags to be used for text shaping by a font.
 *
 * If SDL_ttf was not built with HarfBuzz support, this function returns -1.
 *
 * \param font the font to set shaping flags for.
 * \param flags 0, or one or more TTF_ShapingFlags OR'd together
 * \returns 0 on success, or -1 on error.
 *
 * \since This function is available since SDL_ttf 3.0.0.
 */
extern DECLSPEC int SDLCALL TTF_SetFontShapingFlags(TTF_Font *font, TTF_ShapingFlags flags);

The TTF_ShapingFlags enumeration basically contains all of the values from hb_buffer_flags_t.
The TTF_SetFontShapingFlags function sets the buffer flags for the respective font.
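
A minimal sketch of how such a setter might treat its argument follows, assuming it rejects any bit outside the defined mask so that an unknown value is never forwarded to HarfBuzz. All names are hypothetical stand-ins for the proposed symbols, only a few of the flags are reproduced, and the rejection policy is a design assumption rather than anything the proposal specifies.

```c
#include <stddef.h>

/* Hypothetical subset of the proposed flags; values match the enumeration above. */
typedef unsigned int DemoShapingFlags;
#define DEMO_SHAPING_FLAG_DEFAULT                     0x00u
#define DEMO_SHAPING_FLAG_BEGINNING_OF_TEXT           0x01u
#define DEMO_SHAPING_FLAG_END_OF_TEXT                 0x02u
#define DEMO_SHAPING_FLAG_DO_NOT_INSERT_DOTTED_CIRCLE 0x10u
#define DEMO_SHAPING_FLAG_ALL_DEFINED                 0xFFu

/* Hypothetical stand-in for the relevant part of a TTF_Font. */
typedef struct DemoShapedFont {
    DemoShapingFlags shaping_flags;
} DemoShapedFont;

/* Replaces the font's flag set. Returns 0 on success, -1 on error. */
int Demo_SetFontShapingFlags(DemoShapedFont *font, DemoShapingFlags flags)
{
    if (font == NULL) {
        return -1;
    }
    if (flags & ~DEMO_SHAPING_FLAG_ALL_DEFINED) {
        return -1; /* at least one bit is not a defined flag */
    }
    font->shaping_flags = flags;
    return 0;
}
```

Because the call replaces the whole set rather than merging into it, coupled flags such as the beginning-of-text and dotted-circle bits have to be passed together in one call.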

I'll await your feedback for this and then I'll implement it and create a pull request.

1bsyl commented

I don't think we really need them.
The TTF API accepts text strings of any length, so you can always render the string you want with the API.
Also, you cannot really cut the string and render it in two parts, because you would end up with two SDL_Surfaces.

But maybe I am wrong, as I am not really familiar with languages that would produce wrong output.
(If you have a counter-example ...)

I definitely agree. Most of these flags are for advanced/multi-step shaping — and I feel if anyone really needs them, then they're probably better off making their own font renderer instead of using SDL_ttf.

Unless there's anything else, I'd say that we can close this issue once your PR is merged.