HOST-Oman/scribus

Provide typographically correct ligature setting for German

Closed this issue · 2 comments

For most languages, you simply use ligatures everywhere in your text. German typography is different. German has many compound words, which are composed of several morphemes. In traditional German typography, you use ligatures within morphemes, but not across morpheme boundaries. Example: “Auffahrt” is composed of two morphemes: Auf-fahrt. You should not use an ff ligature here. “Schiff” consists of a single morpheme: Schiff. Within this morpheme, you should therefore use the ff ligature.

The standard way to get correct ligature setting is to enable OpenType ligatures and insert a zero width non-joiner (ZWNJ, Unicode U+200C) wherever a ligature has to be suppressed.
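As a minimal sketch of what this means at the character level (the function name `suppress_ligature` is made up for this illustration and is not part of the attached script), inserting a ZWNJ at a known morpheme boundary looks like this in Python:

```python
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER


def suppress_ligature(word: str, boundary: int) -> str:
    """Insert a ZWNJ before the character at index `boundary`.

    The index is counted in Python str positions (code points).
    """
    return word[:boundary] + ZWNJ + word[boundary:]


# "Auffahrt" = Auf + fahrt, so the boundary lies before index 3.
auffahrt = suppress_ligature("Auffahrt", 3)

# "Schiff" is a single morpheme and is left untouched,
# so the font is free to apply its ff ligature.
schiff = "Schiff"
```

The ZWNJ is invisible, but it separates the two f characters, so an OpenType font no longer sees the sequence that triggers the ff ligature.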

Doing this work manually is tedious. It would be more convenient to do it automatically, as is done for hyphenation.

As discussed in #144, it might be best to do this in a script.

The attachment contains such a script.

Technical approach: As the problem of German ligature setting is similar to the problem of hyphenation, the script uses a Python port of the TeX hyphenation algorithm (original author Ned Batchelder, public domain). It contains a pseudo-hyphenation pattern set made specifically for ligature setting. These patterns were generated from the hyphenation word list of the “Trennmuster” project (new German hyphenation patterns for TeX). The script has been written with the script API problem “UTF-16 code units ≠ Unicode scalar values” in mind and should not cause any problems in this respect (even if the script API later switches to UTF-32 code units). There are also unit tests for this script.

The script applies German ligature setting to all selected text frames and their underlying story, regardless of whether individual words have the attribute “German” in their character style.

The script has been tested under Scribus CTL on Linux and seems to work (both the script and the unit tests). I have also tested it under Scribus 1.5.2 (normal, without CTL) on Windows: the script works, but I could not get the unit tests to run there.
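To illustrate the general idea of pattern-based boundary detection, here is a simplified sketch in the style of the TeX (Liang) hyphenation algorithm. It is not the code from the attachment. The two patterns in `TOY_PATTERNS` are invented for this example and are not taken from the Trennmuster data, and all function names are hypothetical. In a pattern, a digit between letters is a weight for that gap; the highest weight wins, and an odd result marks a boundary.

```python
import re

ZWNJ = "\u200c"

# Toy patterns, invented for this illustration only:
# "f2f"   -> by default, do not break between two f (even weight)
# "auf3f" -> but do break after "auf" when an f follows (odd, higher weight)
TOY_PATTERNS = ("f2f", "auf3f")


def parse_pattern(pattern):
    """Split e.g. 'auf3f' into letters 'auff' and weights [0, 0, 0, 3, 0]."""
    letters = re.sub(r"\d", "", pattern)
    weights = [0] * (len(letters) + 1)
    pos = 0
    for ch in pattern:
        if ch.isdigit():
            weights[pos] = int(ch)
        else:
            pos += 1
    return letters, weights


def boundaries(word, patterns=TOY_PATTERNS):
    """Return the indices in `word` before which a boundary is detected."""
    padded = "." + word.lower() + "."  # dots mark word start and end
    scores = [0] * (len(padded) + 1)   # scores[i]: gap before padded[i]
    for pattern in patterns:
        letters, weights = parse_pattern(pattern)
        start = padded.find(letters)
        while start != -1:
            for offset, weight in enumerate(weights):
                scores[start + offset] = max(scores[start + offset], weight)
            start = padded.find(letters, start + 1)
    # The gap before word[k] is the gap before padded[k + 1].
    return [k for k in range(1, len(word)) if scores[k + 1] % 2 == 1]


def set_ligatures(word):
    """Insert a ZWNJ at every detected boundary of a single word."""
    cuts = set(boundaries(word))
    return "".join((ZWNJ + ch) if i in cuts else ch for i, ch in enumerate(word))
```

With these toy patterns, `boundaries("Auffahrt")` yields `[3]` and `boundaries("Schiff")` yields `[]`. A real pattern set is far larger, and an implementation would presumably restrict ZWNJ insertion to boundaries where a ligature can actually occur.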

The script inserts and/or removes ZWNJ characters. With the new grapheme-cluster-based cursor movement in Scribus CTL, the text is still reasonably comfortable to edit even with ZWNJ characters in it.
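One way to make such an insert-and-remove step safe to run repeatedly is to strip existing ZWNJ characters from a word first and then re-insert them at the detected boundaries. The following is only a sketch of that idea under assumptions: `KNOWN_BOUNDARIES` is a hypothetical stand-in for the real pattern lookup, and whether the attached script works exactly this way is not stated here.

```python
ZWNJ = "\u200c"

# Hypothetical stand-in for the pattern-based boundary lookup.
KNOWN_BOUNDARIES = {"auffahrt": (3,)}


def normalise_word(word: str) -> str:
    """Remove all ZWNJ from `word`, then re-insert them at known boundaries."""
    clean = word.replace(ZWNJ, "")
    cuts = KNOWN_BOUNDARIES.get(clean.lower(), ())
    pieces = []
    last = 0
    for cut in cuts:
        pieces.append(clean[last:cut])
        last = cut
    pieces.append(clean[last:])
    return ZWNJ.join(pieces)


once = normalise_word("Auffahrt")
twice = normalise_word(once)          # running it again changes nothing
fixed = normalise_word("Schif" + ZWNJ + "f")  # misplaced ZWNJ is removed
```

Note that stripping every ZWNJ would also remove ones placed deliberately for other reasons, so this approach only suits text where ZWNJ is used for ligature control alone.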

Possible issues:
‣ English GUI texts might need a review (my English is terrible).
‣ Automatic hyphenation will not work correctly when ZWNJ characters are in the text. However, I would consider this a problem of the hyphenation code. Workaround: run hyphenation first and ligature setting afterwards.

I propose to include the script in Scribus.

Ligatursatzprojekt.zip

Great work @sommerluk. I don't know what the process is for including scripts in mainstream Scribus, but if it fixes a problem or makes things easier for people, I don't mind including it.

@sommerluk please send a pull request to add your script.