harfbuzz/harfbuzz

[subset] `head.indexToLocFormat=0` causes decompressing error


hb-subset Loongtype.ttf -o subset.ttf --text="0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ鸿福酱酒送礼套装香气醇厚佳品自带具质保证躯" --name-IDs='*' --name-languages='*' --layout-features='*'

When the word is removed from --text, head.indexToLocFormat becomes 0, otherwise it is 1.

When the subsetted font is compressed to WOFF2, the woff2-rs library fails while decompressing it.

See: woff2-rs PR and src/glyf_decoder/mod.rs:367

thread 'decode::tests::test_glyf' panicked at src/glyf_decoder/mod.rs:378:80:
called `Result::unwrap()` on an `Err` value: TryFromIntError(())
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Also, I would like to know how indexToLocFormat is calculated when subsetting?

The same thing happens with pyftsubset. cc @anthrotype

Looks like a bug in woff2-rs; there is nothing harfbuzz can do about it. `head.indexToLocFormat` specifies whether the `loca` table uses short (Offset16) or long (Offset32) offsets, and usually one wants to choose the most compact serialization.
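To illustrate what a decoder has to handle, here is a minimal Python sketch of reading `loca` in either format. It assumes the OpenType rule that short entries store the byte offset divided by 2, while long entries store the byte offset directly. The function name `read_loca` and its parameters are hypothetical and are not taken from woff2-rs, harfbuzz, or fonttools.

```python
import struct


def read_loca(data: bytes, index_to_loc_format: int, num_glyphs: int) -> list[int]:
    """Return byte offsets into glyf for a raw loca table (hypothetical helper)."""
    count = num_glyphs + 1  # loca has one extra entry marking the end of the last glyph
    if index_to_loc_format == 0:
        # Short format: big-endian uint16 values holding (offset / 2).
        raw = struct.unpack(f">{count}H", data[: count * 2])
        return [value * 2 for value in raw]
    if index_to_loc_format == 1:
        # Long format: big-endian uint32 values holding the offset itself.
        return list(struct.unpack(f">{count}I", data[: count * 4]))
    raise ValueError(f"unknown indexToLocFormat: {index_to_loc_format}")


short_table = struct.pack(">3H", 0, 10, 25)
long_table = struct.pack(">3I", 0, 20, 0x20000)
print(read_loca(short_table, 0, 2))
print(read_loca(long_table, 1, 2))
```

A font is valid with either value, so a decoder that only works for format 1 will break on small subsets like the one above.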

> I would like to know how indexToLocFormat is calculated when subsetting?

I'm not sure where exactly hb-subset computes that, but fonttools does it here:

https://github.com/fonttools/fonttools/blob/0572f7871823bdef3ceceaf41dedd0a6bd100995/Lib/fontTools/ttLib/tables/_l_o_c_a.py#L42-L49
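For readers who do not want to follow the link, the usual rule can be sketched roughly as below. This is my own paraphrase, not a copy of the fonttools or hb-subset code: the short format is chosen when every offset is even and the largest one is below 0x20000 (so that offset / 2 fits in 16 bits), otherwise the long format is used. The name `build_loca` is hypothetical.

```python
import struct


def build_loca(offsets: list[int]) -> tuple[int, bytes]:
    """Pick indexToLocFormat and serialize loca (hypothetical helper)."""
    max_offset = max(offsets, default=0)
    if max_offset < 0x20000 and all(offset % 2 == 0 for offset in offsets):
        # Short format: store offset / 2 as big-endian uint16.
        halved = [offset // 2 for offset in offsets]
        return 0, struct.pack(f">{len(halved)}H", *halved)
    # Long format: store the offset as big-endian uint32.
    return 1, struct.pack(f">{len(offsets)}I", *offsets)


fmt_small, _ = build_loca([0, 20, 50])
fmt_large, _ = build_loca([0, 20, 0x20000])
print(fmt_small, fmt_large)
```

This would explain the behaviour in the report: dropping glyphs from `--text` shrinks `glyf`, and once it is small enough the subsetter can switch from format 1 to the more compact format 0.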

Closing as this isn't a defect in harfbuzz.