prasunroy/stefann

Arabic & Hebrew

Opened this issue · 4 comments

@mauromazzei @prasunroy
What about languages like Arabic & Hebrew?

The method we proposed in the paper is language-invariant, so it should work on languages other than English if trained with sufficient data. In our work, we collected 1315 fonts from Google Fonts. At present, however, only 20 Arabic and 17 Hebrew fonts are available on Google Fonts, so you might have to collect more data from other sources.

What about the fact that Arabic and Hebrew are written right-to-left?
Should I follow the same training steps as for English, or are there things that need modification?

As the generation is performed at the character level, the writing direction should not be a problem. The steps will be the same but may need some minor modifications to the filename specifications. By default, each filename is the ASCII code of the corresponding character, e.g. 65.jpg refers to A.
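To make the default naming convention concrete, here is a minimal sketch of the character-to-filename mapping described above. The helper names `char_to_filename` and `filename_to_char` are hypothetical and are not taken from the repository; only the `65.jpg` refers to `A` convention comes from this thread.

```python
def char_to_filename(ch: str, ext: str = ".jpg") -> str:
    """Hypothetical helper: name an image after the character's code.

    For ASCII characters, ord() returns the ASCII code, so 'A' -> '65.jpg'.
    """
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    return f"{ord(ch)}{ext}"


def filename_to_char(filename: str) -> str:
    """Hypothetical inverse: recover the character from a name like '65.jpg'."""
    stem = filename.rsplit(".", 1)[0]  # drop the extension
    return chr(int(stem))


print(char_to_filename("A"))       # 65.jpg
print(filename_to_char("97.jpg"))  # a
```

Python's `ord` also returns a code point for non-ASCII characters, but whether the training scripts accept such filenames unchanged is an assumption that would need checking against the code.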

A messy but easy solution is to replace the English character images with the new character images, keeping the existing filenames. A more robust but more involved solution is to index the new characters using a lookup table (dictionary) and modify the filenames accordingly, e.g. 1000.jpg refers to the first character, and so on.
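The lookup-table approach could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the choice of the Hebrew block U+05D0 to U+05EA as the alphabet, and the `.jpg` extension are all hypothetical; the starting index 1000 follows the example above.

```python
def build_lookup(chars, start=1000):
    """Map each character to an integer index, beginning at `start`."""
    return {ch: start + i for i, ch in enumerate(chars)}


def indexed_filename(ch, lookup, ext=".jpg"):
    """Return the image filename for `ch` according to the lookup table."""
    return f"{lookup[ch]}{ext}"


# Assumed alphabet: the 27 Hebrew letters (final forms included),
# written as code points to avoid right-to-left rendering issues in source.
hebrew_letters = [chr(cp) for cp in range(0x05D0, 0x05EB)]

lookup = build_lookup(hebrew_letters)
reverse_lookup = {idx: ch for ch, idx in lookup.items()}  # index -> character

print(indexed_filename(hebrew_letters[0], lookup))  # 1000.jpg
print(len(lookup))                                  # 27
```

Keeping the table in one place means the same mapping can be reused when generating the training images and when reading them back, so the filenames stay consistent across both steps.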

ASCII not Unicode/UTF-8, hmm......