harfbuzz/uharfbuzz

Use harfbuzz's mmap

Closed this issue · 6 comments

uharfbuzz currently suggests reading fonts like this:

import sys

import uharfbuzz as hb

with open(sys.argv[1], 'rb') as fontfile:
    fontdata = fontfile.read()  # reads the whole font file into memory

text = sys.argv[2]

face = hb.Face(fontdata)
font = hb.Font(face)
upem = face.upem

But that is sub-optimal. One can enable HAVE_MMAP (https://github.com/harfbuzz/harfbuzz/blob/master/src/hb-blob.cc#L478-L485) on non-Windows platforms (the Windows build config is already fine) and use hb_blob_create_from_file instead, so the whole font is not read at once. I wonder, however, whether uharfbuzz is used in places where lots of fonts are loaded (I guess so?), which would make the font loading optimization worthwhile. This needs tweaking the Python API also, of course.
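For illustration, the difference between the two loading strategies can be sketched with Python's standard mmap module. This is not HarfBuzz's implementation, only the same idea in Python; the function names read_whole and map_file are made up for this sketch.

```python
import mmap


def read_whole(path):
    """Copy the entire file into a bytes object up front."""
    with open(path, "rb") as f:
        return f.read()


def map_file(path):
    """Map the file read-only; the OS loads pages lazily on access."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file. The mapping keeps its own
        # handle, so it stays valid after the file object is closed.
        # Note: mapping an empty file raises ValueError.
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
```

Slicing the mapping, for example map_file(path)[:4], returns bytes for just that range, so a caller that only needs a few tables does not pay for copying the rest of the font.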

> This needs tweaking the python API also of course.

Why?

It only accepts bytes (https://github.com/harfbuzz/uharfbuzz/blob/985eda0/src/uharfbuzz/_harfbuzz.pyx#L295), but it should accept a path as well? Also, someone might pass a path as a Python 3 bytes object, which would be indistinguishable from font data, so I guess a separate API is needed.
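A minimal sketch of why a separate entry point avoids that ambiguity. The class BlobSource and its from_path constructor are hypothetical and not part of uharfbuzz; a real binding would presumably pass the path to hb_blob_create_from_file rather than read the file in Python.

```python
class BlobSource:
    """Hypothetical stand-in for a font blob (illustration only)."""

    def __init__(self, data: bytes):
        # The plain constructor always treats bytes as font data.
        self.data = data

    @classmethod
    def from_path(cls, path):
        # The named constructor always treats its argument as a path.
        # open() accepts str, bytes, and os.PathLike paths alike.
        with open(path, "rb") as f:
            # Placeholder: reads the file instead of memory-mapping it.
            return cls(f.read())
```

With two entry points, BlobSource(b"font.ttf") is unambiguously nine bytes of data, while BlobSource.from_path(b"font.ttf") is unambiguously a file lookup, so no guessing based on the argument type is needed.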

I just wanted to know whether it is worth doing anyway, so I can have a look.

> I just wanted to know whether it is worth doing anyway, so I can have a look.

To me it's definitely easy enough to be worth it. I wanted the same for fonttools back in 2016:

fonttools/fonttools#581

I still think it should be done and my last assessment there is still accurate.

There are a few other unfortunate API choices that can't be fixed in a compatible way (short of introducing new API), so now might be a good time to review the API, drop Python 2 support and bump the major version.

Yeah, we can take the opportunity of dropping Python 2 support to make backward-incompatible changes as needed.