Proposal: “universal” text shaping result JSON notation based on hb-shape

Question

Opened this issue 8 years ago · 7 comments

@devongovett @ldo @behdad @axkibe @Jolg42 @miguelsousa @readroberts @davelab6 @JelleBosmaMT @brawer @mhosken @be5invis @Pomax @lptr @bramstein @robmck-ms

Note: For the sake of a better place to file this, I’m choosing this repo.

I’d like to propose a “universal” JSON notation for text shaping results. Basically, when I feed the text “AV” and some font info into a text shaping engine, I’d like to get out something like:

[
  {
    "g": 1,
    "cl": 0,
    "dx": 0,
    "dy": 0,
    "ax": 160,
    "ay": 0,
    "xb": -1,
    "yb": 163,
    "w": 181,
    "h": -163
  },
  {
    "g": 192,
    "cl": 1,
    "dx": 0,
    "dy": 0,
    "ax": 170,
    "ay": 0,
    "xb": -1,
    "yb": 160,
    "w": 173,
    "h": -161
  }
]

There are a number of text shaping engines available:

- HarfBuzz
  - harfbuzz.js: Emscripten-ported JS version of HarfBuzz (the linked build is outdated)
  - harfpy: Python bindings to HarfBuzz
- fontkit
- opentype.js
- Compositor
- AFDKO: it includes some tools that perform similar tasks
- a number of non-opensource engines

They implement full Unicode+OpenType text shaping to varying degrees.

Of the opensource engines, HarfBuzz is currently probably the most complete: it implements variations and all complex scripts, but it does not exist in an up-to-date, ready-to-use and easy-to-read JS version. fontkit also implements variations and complex scripts, though I don’t know how complete it is.

All those engines can be used to perform the “Unicode in, shaped glyph info out” task. But there is no common notation for expressing the “shaped glyph info out” part, one that could be used universally to pass the information around and, in particular, to help developers implement the second portion of the task, which is usually some kind of rendering.

Rendering can be done via vastly different methods: by placing rasterized bitmaps (extracted e.g. via FreeType), textures (some OpenGL stuff), SVG outlines or something else onto some kind of canvas, of which there are, again, many. Especially with the advent of color font formats, the source font can include PostScript monochrome outlines, TrueType monochrome outlines, SVG outlines and bitmaps, PNG bitmaps and even monochrome bitmaps. And some of those can change depending on variation, hinting or ppem bitmap size.

Fortunately, HarfBuzz comes with a utility called hb-shape which performs shaping by accepting a font, a size (optionally equal to upem), a Unicode text and a set of variation and feature settings, and outputs the result as JSON in the format:

[{"g": <glyph name or index>, "ax": <horizontal advance>, "ay": <vertical advance>, "dx": <horizontal displacement>, "dy": <vertical displacement>, "cl": <glyph cluster index within input>}, ...]
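To illustrate how a consumer might read this format, here is a minimal Python sketch. It assumes the key names shown above plus the optional extents keys (`xb`, `yb`, `w`, `h`) from the “AV” sample; `ShapedGlyph` and `parse_shaping_result` are hypothetical names for illustration, not part of HarfBuzz.

```python
import json
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class ShapedGlyph:
    """One glyph record of a shaping result (hypothetical container)."""
    g: Union[int, str]        # glyph index or glyph name
    cl: int                   # cluster: index into the input text
    dx: Optional[int] = None  # horizontal displacement
    dy: Optional[int] = None  # vertical displacement
    ax: Optional[int] = None  # horizontal advance
    ay: Optional[int] = None  # vertical advance
    xb: Optional[int] = None  # extents: x bearing
    yb: Optional[int] = None  # extents: y bearing
    w: Optional[int] = None   # extents: width
    h: Optional[int] = None   # extents: height


def parse_shaping_result(text: str) -> List[ShapedGlyph]:
    """Parse a JSON array of glyph records into ShapedGlyph objects."""
    records = json.loads(text)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of glyph records")
    glyphs = []
    for record in records:
        if "g" not in record or "cl" not in record:
            raise ValueError("each record needs at least 'g' and 'cl'")
        # Unknown keys raise TypeError, which keeps the sketch strict.
        glyphs.append(ShapedGlyph(**record))
    return glyphs
```

Position and extents fields stay `None` when the shaper was asked not to output them, so a consumer can tell “absent” apart from “zero”.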

The Examples section below shows hb-shape usage. Note: hb-shape actually outputs compact JSON, without indentation, but I’m presenting the examples as prettified JSON for readability.

For “universal machine-readable usage”, I like Example 5 most. For human usage, both glyph ID and glyph name, or only glyph name, would be useful.

To me it seems that the JSON syntax output by hb-shape with the --output-format=json --show-extents --no-glyph-names command-line options is all that a developer would need to then perform the glyph rendering.
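As a rough sketch of that rendering step, the Python snippet below turns the records into drawing origins: each glyph is placed at the current pen position plus its displacement (`dx`, `dy`), and then the pen moves by the advance (`ax`, `ay`). The function name `glyph_origins` is hypothetical, and the actual drawing call is left out because it depends on the rendering backend.

```python
from typing import Dict, List, Tuple


def glyph_origins(glyphs: List[Dict], start: Tuple[int, int] = (0, 0)) -> List[Tuple[int, int]]:
    """Return the drawing origin of each glyph, in shaping order."""
    pen_x, pen_y = start
    origins = []
    for glyph in glyphs:
        # Draw at pen position plus the per-glyph displacement ...
        origins.append((pen_x + glyph["dx"], pen_y + glyph["dy"]))
        # ... then advance the pen for the next glyph.
        pen_x += glyph["ax"]
        pen_y += glyph["ay"]
    return origins


# The "AV" sample from the top of this proposal (extents omitted).
av = [
    {"g": 1, "cl": 0, "dx": 0, "dy": 0, "ax": 160, "ay": 0},
    {"g": 192, "cl": 1, "dx": 0, "dy": 0, "ax": 170, "ay": 0},
]
print(glyph_origins(av))
```

The result is in the same units as the shaper output, so a renderer still has to scale and, depending on the canvas, flip the y axis.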

So I’d like to propose that the authors of the other shaping engines, in particular fontkit and opentype.js, adopt a similar output. This should be an easy task. It would also allow developers to use different engines, and you guys to compare results.
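To give an idea of what comparing results could look like once two engines emit the same notation, here is a small Python sketch. `diff_shaping_results` is a hypothetical helper, and it assumes both results are already parsed into lists of dictionaries with the keys used in this proposal.

```python
from typing import Dict, List, Sequence


def diff_shaping_results(
    a: List[Dict],
    b: List[Dict],
    keys: Sequence[str] = ("g", "cl", "dx", "dy", "ax", "ay"),
) -> List[str]:
    """Return human-readable differences between two shaping results."""
    differences = []
    if len(a) != len(b):
        differences.append(f"glyph count differs: {len(a)} vs {len(b)}")
    # Compare glyph by glyph over the common prefix of both results.
    for index, (glyph_a, glyph_b) in enumerate(zip(a, b)):
        for key in keys:
            if glyph_a.get(key) != glyph_b.get(key):
                differences.append(
                    f"glyph {index}: {key} {glyph_a.get(key)!r} != {glyph_b.get(key)!r}"
                )
    return differences
```

An empty list means the two engines agree on every compared key.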

This format is well thought out and quite elegant.

What do you think?

Best,
Adam

Examples

I'm using these fonts in the examples:

- NotoSans-VF.ttf variable font
- CormorantGaramond-Regular.otf

hb-shape has several options to control what’s being output:

- glyph indices vs. glyph names
- glyph positions (delta and advance width for both x and y)
- glyph extents (base in x and y, width and height)
- cluster info, showing the index of the first Unicode codepoint in the input text that corresponds to the output glyph (which helps in mapping “one character to many glyphs” and “many characters to one glyph” situations)
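For scripting, these options can be assembled programmatically. The Python sketch below uses only the command-line flags that appear in the examples in this post; `build_hb_shape_command` and `shape` are hypothetical helpers, and `shape` assumes that hb-shape is on the PATH and prints a single JSON array for single-line input.

```python
import json
import subprocess
from typing import List, Optional


def build_hb_shape_command(
    font_file: str,
    text: str,
    *,
    font_size: Optional[int] = None,
    features: Optional[str] = None,
    variations: Optional[str] = None,
    glyph_names: bool = True,
    positions: bool = True,
    extents: bool = False,
) -> List[str]:
    """Build the hb-shape argument list for JSON output."""
    command = ["hb-shape", f"--font-file={font_file}"]
    if variations:
        command.append(f"--variations={variations}")
    if features:
        command.append(f"--features={features}")
    command.append(f"--text={text}")
    if font_size is not None:
        command.append(f"--font-size={font_size}")
    command.append("--output-format=json")
    if not glyph_names:
        command.append("--no-glyph-names")
    if not positions:
        command.append("--no-positions")
    if extents:
        command.append("--show-extents")
    return command


def shape(font_file: str, text: str, **options) -> list:
    """Run hb-shape and parse its output (assumes one JSON array on stdout)."""
    command = build_hb_shape_command(font_file, text, **options)
    completed = subprocess.run(command, check=True, capture_output=True, text=True)
    return json.loads(completed.stdout)
```

Passing the arguments as a list avoids shell quoting issues with values such as `wght=50,wdth=110`.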

Example 1: with glyph IDs and cluster IDs

hb-shape --font-file="NotoSans-VF.ttf" --variations="wght=50,wdth=110" \
  --features="+onum" --text="17" --font-size=256 --output-format=json \
  --no-glyph-names --no-positions

[
  {
    "g": 2565,
    "cl": 0
  },
  {
    "g": 2571,
    "cl": 1
  }
]

Example 2: with glyph IDs, glyph positions and cluster IDs

hb-shape --font-file="NotoSans-VF.ttf" --variations="wght=50,wdth=110" \
  --features="+onum" --text="17" --font-size=256 --output-format=json \
  --no-glyph-names

[
  {
    "g": 2565,
    "cl": 0,
    "dx": 0,
    "dy": 0,
    "ax": 137,
    "ay": 0
  },
  {
    "g": 2571,
    "cl": 1,
    "dx": 0,
    "dy": 0,
    "ax": 137,
    "ay": 0
  }
]

Example 3: with glyph names, glyph positions and cluster IDs

hb-shape --font-file="NotoSans-VF.ttf" --variations="wght=50,wdth=110" \
  --features="+onum" --text="17" --font-size=256 --output-format=json

[
  {
    "g": "one.tosf",
    "cl": 0,
    "dx": 0,
    "dy": 0,
    "ax": 137,
    "ay": 0
  },
  {
    "g": "seven.tosf",
    "cl": 1,
    "dx": 0,
    "dy": 0,
    "ax": 137,
    "ay": 0
  }
]

Example 4: with glyph names, glyph positions, glyph extents and cluster IDs

hb-shape --font-file="NotoSans-VF.ttf" --variations="wght=50,wdth=110" \
  --features="+onum" --text="17" --font-size=256 --output-format=json \
  --show-extents

[
  {
    "g": "one.tosf",
    "cl": 0,
    "dx": 0,
    "dy": 0,
    "ax": 137,
    "ay": 0,
    "xb": 2,
    "yb": 138,
    "w": 63,
    "h": -138
  },
  {
    "g": "seven.tosf",
    "cl": 1,
    "dx": 0,
    "dy": 0,
    "ax": 137,
    "ay": 0,
    "xb": 2,
    "yb": 136,
    "w": 118,
    "h": -181
  }
]
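The extents can be combined with the positions to get the ink bounding box of the whole run. The Python sketch below assumes that `xb` and `yb` are the offsets from the glyph origin to the left and top edges, and that `h` is measured downward from the top edge (hence negative in a y-up coordinate system), which is how the values above read. `run_ink_bounds` is a hypothetical name.

```python
from typing import Dict, List, Optional, Tuple


def run_ink_bounds(glyphs: List[Dict]) -> Optional[Tuple[int, int, int, int]]:
    """Return (x_min, y_min, x_max, y_max) of the run, or None if it is empty."""
    if not glyphs:
        return None
    pen_x = pen_y = 0
    x_values = []
    y_values = []
    for glyph in glyphs:
        left = pen_x + glyph["dx"] + glyph["xb"]
        top = pen_y + glyph["dy"] + glyph["yb"]
        x_values += [left, left + glyph["w"]]
        y_values += [top, top + glyph["h"]]
        pen_x += glyph["ax"]
        pen_y += glyph["ay"]
    return (min(x_values), min(y_values), max(x_values), max(y_values))


# The Example 4 output from above.
example_4 = [
    {"g": "one.tosf", "cl": 0, "dx": 0, "dy": 0, "ax": 137, "ay": 0,
     "xb": 2, "yb": 138, "w": 63, "h": -138},
    {"g": "seven.tosf", "cl": 1, "dx": 0, "dy": 0, "ax": 137, "ay": 0,
     "xb": 2, "yb": 136, "w": 118, "h": -181},
]
print(run_ink_bounds(example_4))  # (2, -45, 257, 138)
```

The negative `y_min` comes from the old-style seven descending below the baseline.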

Example 5: with glyph IDs, glyph positions, glyph extents and cluster IDs

This example shows non-1:1 clusters and uses what is probably the best notation to adopt universally.

hb-shape --font-file="CormorantGaramond-Regular.otf" --features="+liga" \
  --text="office" --font-size=256 --output-format=json \
  --show-extents --no-glyph-names

[
  {
    "g": 640,
    "cl": 0,
    "dx": 0,
    "dy": 0,
    "ax": 123,
    "ay": 0,
    "xb": 9,
    "yb": 102,
    "w": 104,
    "h": -105
  },
  {
    "g": 937,
    "cl": 1,
    "dx": 0,
    "dy": 0,
    "ax": 221,
    "ay": 0,
    "xb": 6,
    "yb": 186,
    "w": 207,
    "h": -186
  },
  {
    "g": 549,
    "cl": 4,
    "dx": 0,
    "dy": 0,
    "ax": 103,
    "ay": 0,
    "xb": 9,
    "yb": 101,
    "w": 89,
    "h": -104
  },
  {
    "g": 561,
    "cl": 5,
    "dx": 0,
    "dy": 0,
    "ax": 104,
    "ay": 0,
    "xb": 9,
    "yb": 101,
    "w": 85,
    "h": -104
  }
]
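The cluster values make it possible to map glyphs back to the input text. The Python sketch below groups the glyphs of this example by cluster. It assumes left-to-right text, cluster values that index Unicode codepoints (which matches Python string indexing), and that each cluster extends up to the start of the next one; `clusters_to_text` is a hypothetical name.

```python
from typing import Dict, List, Tuple


def clusters_to_text(text: str, glyphs: List[Dict]) -> List[Tuple[str, list]]:
    """Return (source substring, glyph list) pairs, one per cluster."""
    starts = sorted({glyph["cl"] for glyph in glyphs})
    # Each cluster ends where the next one starts; the last ends with the text.
    ends = starts[1:] + [len(text)]
    groups = []
    for start, end in zip(starts, ends):
        members = [glyph["g"] for glyph in glyphs if glyph["cl"] == start]
        groups.append((text[start:end], members))
    return groups


# Glyph IDs and clusters from the Example 5 output (other keys omitted).
office = [
    {"g": 640, "cl": 0},
    {"g": 937, "cl": 1},
    {"g": 549, "cl": 4},
    {"g": 561, "cl": 5},
]
print(clusters_to_text("office", office))
# [('o', [640]), ('ffi', [937]), ('c', [549]), ('e', [561])]
```

Here glyph 937 covers the three characters “ffi”, the “many characters to one glyph” case.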

robmck-ms commented 8 years ago

@PeterCon

Answer 1 · 2017-04-06T01:10:46.000Z

P.S. Of course, there might be a need to tweak this syntax, but it should remain simple. Let’s discuss, and then get as many people to implement it as possible.

Answer 2 · 2017-04-06T01:31:08.000Z

As long as there is a JSON Schema to go along with that format, I wouldn't mind being able to rely on shapers being able to output a universally readable data format!

Answer 3 · 2017-04-06T04:17:55.000Z

I like this idea. Would be awesome to use this format for parts of https://github.com/unicode-org/text-rendering-tests as well.

Answer 4 · 2017-04-14T23:27:38.000Z

I'm obviously in favor. Just want to point out that Ned didn't particularly like this format because CoreText does not produce separate advance width per glyph. Would be good to get his feedback.

Answer 5 · 2017-04-15T00:28:15.000Z

@behdad Due to limitations in CoreText, or by choice? Given that per-glyph advance width is totally a thing in OpenType (especially CFF/CFF2 flavoured OpenType) it seems odd to deliberately "smooth that over".

Answer 6 · 2017-04-15T00:34:22.000Z

> @behdad Due to limitations in CoreText, or by choice?

Their API definitely doesn't expose it. And they somehow don't subscribe to it.

> Given that per-glyph advance width is totally a thing in OpenType (especially CFF/CFF2 flavoured OpenType) it seems odd to deliberately "smooth that over".

They come from AAT background ;).

In OpenType, it's actually not specified how advance widths are modified in, e.g., cursive connections.