moi15moi/FontCollector

[Feature] Delete glyphs that are NOT used in the .ass file

Opened this issue · 4 comments

fontTools allows creating a subset of a font like this:

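from fontTools import subset
from fontTools.ttLib.ttFont import TTFont

# Load the original font.
font = TTFont("font.TTF")

# Keep only the glyphs needed to render the given text.
subsetter = subset.Subsetter()
subsetter.populate(text="ABÉé")
subsetter.subset(font)

# Save the subset under a new file name.
font.save("font strip.TTF")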
McBaws commented

If this was implemented, it would be nice to have the option to either

  1. only keep glyphs that are used in the subs
  2. keep a standard set of glyphs as well as those used in the subs

The standard glyphs could be something like unicode=0000-00FF,0370-04FF,2000-206F,20A0-20CF.
I would also recommend changing the font name when stripped, so that the modified version of the font isn't confused with the original, full version. Appending -Str to the name should be enough.
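
A minimal sketch of how the two options could be combined, assuming the range string suggested above. This is only an illustration, not something FontCollector does: `parse_unicode_ranges`, `codepoints_to_keep` and `subset_font` are hypothetical helper names. The only fontTools calls used are `Subsetter.populate(unicodes=...)`, `Subsetter.subset` and `TTFont.save`.

```python
from typing import Set

# Range string taken from the suggestion above (hex code points).
STANDARD_RANGES = "0000-00FF,0370-04FF,2000-206F,20A0-20CF"


def parse_unicode_ranges(spec: str) -> Set[int]:
    """Turn a string like '0000-00FF,0370' into a set of code points."""
    codepoints = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        start, _, end = part.partition("-")
        first = int(start, 16)
        last = int(end, 16) if end else first
        codepoints.update(range(first, last + 1))
    return codepoints


def codepoints_to_keep(sub_text: str, keep_standard: bool) -> Set[int]:
    """Option 1: only the characters used. Option 2: also the standard set."""
    used = {ord(ch) for ch in sub_text}
    if keep_standard:
        used |= parse_unicode_ranges(STANDARD_RANGES)
    return used


def subset_font(src: str, dst: str, codepoints: Set[int]) -> None:
    """Subset the font at `src` to `codepoints` and save it to `dst`."""
    # Imported here so the helpers above work without fontTools installed.
    from fontTools import subset
    from fontTools.ttLib import TTFont

    font = TTFont(src)
    subsetter = subset.Subsetter()
    subsetter.populate(unicodes=codepoints)
    subsetter.subset(font)
    font.save(dst)
```

With `keep_standard=False` this behaves like option 1; with `keep_standard=True` the letter 'x' survives even if the subs never use it.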

motbob has a really good blog post about this on nyaa release 1593330, he used this script to strip the fonts.

  1. only keep glyphs that are used in the subs

This is what I wanted to implement.

  1. keep a standard set of glyphs as well as those used in the subs

I don't understand why someone would need that. If you intend to strip a font, why would you want to retain unused glyphs?

I would also recommend changing the font name when stripped, so that the modified version of the font isn't confused with the original, full version. Appending -Str to the name should be enough.

If I do that, I also need to change the font name in the style and in the \fn tag. I am not sure if I want to do that. It adds some complexity. Note to myself: ass_tag_analyzer doesn't support VSFilterMod tags, so they would be automatically trimmed or even renamed. Ex: \fsvp20 would be treated as \fs0, because that is what VSFilter and libass do.
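
To make the extra work concrete, here is a rough sketch of both halves of a rename. It rests on several assumptions: `rename_font` and `rename_in_ass` are hypothetical names, the font side only touches name IDs 1, 4 and 6 (family, full name, PostScript name), and the subtitle side is a naive regex pass rather than a real ASS parser, so it knows nothing about VSFilterMod tags or unusual spacing in Style lines.

```python
import re


def rename_font(font, suffix="-Str"):
    """Append `suffix` to the family, full and PostScript names.

    `font` is expected to be a fontTools TTFont that has a 'name' table.
    """
    name_table = font["name"]
    for record in list(name_table.names):
        if record.nameID in (1, 4, 6):
            name_table.setName(
                record.toUnicode() + suffix,
                record.nameID,
                record.platformID,
                record.platEncID,
                record.langID,
            )


def rename_in_ass(ass_text, old_name, new_name):
    """Replace `old_name` in Style lines and \\fn tags (naive sketch)."""
    old = re.escape(old_name)
    # Fontname is the second field of a Style line; '@' marks a vertical font.
    style_pattern = re.compile(
        rf"^(Style:[^,]*,@?)(?i:{old})(?=,)", re.MULTILINE
    )
    # A \fn value runs until the next tag ('\') or the end of the block ('}').
    fn_pattern = re.compile(rf"(\\fn@?)(?i:{old})(?=[\\}}])")
    ass_text = style_pattern.sub(lambda m: m.group(1) + new_name, ass_text)
    return fn_pattern.sub(lambda m: m.group(1) + new_name, ass_text)
```

The lookaheads keep a longer name such as "Arial Black" untouched when only "Arial" is being renamed.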

motbob has a really good blog post about this on nyaa release 1593330, he used this script to strip the fonts.

From what I can see, it misses a lot of details, like --legacy-cmap, --symbol-cmap, --name-legacy, --recommended-glyphs, --font-number. I may need more of them. I still need to check properly.
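
For reference, these pyftsubset flags map to attributes of `fontTools.subset.Options` (dashes become underscores). The sketch below shows one way they could be passed from Python. The values chosen here are assumptions for illustration, not a recommendation, and `flags_to_option_kwargs` and `subset_with_options` are hypothetical helper names.

```python
# Illustrative values only; which ones are really needed is still open.
PYFTSUBSET_FLAGS = {
    "--legacy-cmap": True,         # keep legacy (non-Unicode) cmap subtables
    "--symbol-cmap": True,         # keep the symbol cmap subtable
    "--name-legacy": True,         # keep legacy (non-Unicode) name records
    "--recommended-glyphs": True,  # keep .notdef, NULL, CR and space
    "--font-number": 0,            # index of the font inside a collection
}


def flags_to_option_kwargs(flags):
    """Convert '--legacy-cmap' style flags to Options keyword arguments."""
    return {
        flag.lstrip("-").replace("-", "_"): value
        for flag, value in flags.items()
    }


def subset_with_options(src, dst, text, flags=PYFTSUBSET_FLAGS):
    """Subset `src` to the characters in `text` using the given flags."""
    # Imported here so the helper above works without fontTools installed.
    from fontTools import subset

    options = subset.Options(**flags_to_option_kwargs(flags))
    # load_font honours options.font_number for .ttc/.otc collections.
    font = subset.load_font(src, options)
    subsetter = subset.Subsetter(options=options)
    subsetter.populate(text=text)
    subsetter.subset(font)
    subset.save_font(font, dst, options)
```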

McBaws commented

I don't understand why someone would need that

It makes the editing of subtitles and distribution of fonts much easier. If a subtitle file didn't use the letter 'x' a single time, then had its fonts collected with this script, its glyph would be missing from the font. If somebody later tried to edit the subtitle and add the letter 'x' somewhere, the glyph would be missing, and they would have to find the original font, which may be very hard.

A lot of fansubbers get their fonts from attachments on releases. If all of the standard glyphs were stripped from fonts that only use a couple glyphs, that would make things more difficult for everyone, as it would be harder to find useful fonts.

I am not sure if I want to do that. It adds some complexity.

If someone takes a stripped font meant for a certain release and tries to mux it into a release that uses the missing glyphs, that could lead to incredibly hard-to-fix mistakes. I understand it would be technically complex, but I still think it's very important.

From what I can see, it misses a lot of details

I just wanted to include an example of a script that did similar things :)

https://github.com/wyzdwdz/assfonts does this, although I don't think it supports keeping a standard set of glyphs.