Is this language specific?
Is there anything special about English, or can this be trained and used for languages like Japanese, Arabic, Russian, or Hebrew?
There is nothing language-specific in the code, and you can try non-English words on the website, but the results tend to read like stereotypical English usage of foreign words: https://www.thisworddoesnotexist.com/w/%E6%88%91%E4%B8%8D%E7%9F%A5%E9%81%93/eyJ3IjogIlx1NjIxMVx1NGUwZFx1NzdlNVx1OTA1MyIsICJkIjogImFuIGluc2NyaXB0aW9uIHRoYXQgcmVhZHMgdGhlIG5hbWUgb2YgdGhlIEVtcGVyb3IgSnViZWkuIiwgInAiOiAibm91biIsICJlIjogIlx1NjIxMVx1NGUwZFx1NzdlNVx1OTA1MyBcdTRlYmFcdTkwMDMifQ==._rWh_i9KElL0ww5HQJHyYYUcPi5pU64LxcfLbXTDiCo=
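To see what that link actually contains, here is a small sketch that decodes it with the Python standard library. It assumes the path has the shape `/w/<percent-encoded word>/<base64 JSON>.<signature>`; that layout is inferred from this one URL, not from the site's code, and `decode_word_url` is a hypothetical helper name.

```python
import base64
import json
from urllib.parse import unquote

URL = "https://www.thisworddoesnotexist.com/w/%E6%88%91%E4%B8%8D%E7%9F%A5%E9%81%93/eyJ3IjogIlx1NjIxMVx1NGUwZFx1NzdlNVx1OTA1MyIsICJkIjogImFuIGluc2NyaXB0aW9uIHRoYXQgcmVhZHMgdGhlIG5hbWUgb2YgdGhlIEVtcGVyb3IgSnViZWkuIiwgInAiOiAibm91biIsICJlIjogIlx1NjIxMVx1NGUwZFx1NzdlNVx1OTA1MyBcdTRlYmFcdTkwMDMifQ==._rWh_i9KElL0ww5HQJHyYYUcPi5pU64LxcfLbXTDiCo="


def decode_word_url(url):
    """Return (word, payload) from a /w/<word>/<payload>.<signature> URL.

    Assumption: the text before the first '.' in the last path segment is
    base64-encoded JSON, and the rest is a signature we do not verify.
    """
    word_part, token = url.split("/w/", 1)[1].split("/", 1)
    payload_b64 = token.split(".", 1)[0]
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    return unquote(word_part), payload


if __name__ == "__main__":
    word, payload = decode_word_url(URL)
    print(word)
    print(payload["p"], "-", payload["d"])
```

The decoded word is the Chinese phrase 我不知道, while the generated definition and part of speech come out in English, which is the behavior described above.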
If you want to train your own model, the only problem is that the base model (GPT-2) was trained primarily on English text. If you replace it with a base model suited to your target language, you should be good!
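As a rough illustration of swapping the base model, here is a minimal sketch that assumes the Hugging Face `transformers` library is installed. `load_base_model` is a hypothetical helper, not a function from this repository, and how the result plugs into this project's training script is not shown here.

```python
def load_base_model(model_name="gpt2"):
    """Load a causal language model and its matching tokenizer.

    model_name is any checkpoint name or local path that `transformers`
    can resolve; "gpt2" is the English-centric default discussed above.
    """
    # Imported lazily so the module can be imported without the dependency.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    return tokenizer, model


if __name__ == "__main__":
    # Replace "gpt2" with a checkpoint pretrained on your target language.
    tokenizer, model = load_base_model("gpt2")
    print(type(tokenizer).__name__, type(model).__name__)
```

The tokenizer is loaded from the same checkpoint as the model on purpose: a model pretrained on another language generally ships with its own vocabulary, so the two need to be swapped together.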