Does it accept Arabic (or any non-ASCII text) in general?
ameer-kanaan commented
I'm having difficulty vectorizing Arabic text; I can't seem to get anything useful out of it.
The word2vec function only extracts odd characters (emojis and the like) from a text file of about 200k Arabic words. It also seems to convert these characters to code point values.
I would like to get normal-looking word2vec output for my Arabic text.
Any comments or workarounds?
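
In case it helps narrow things down, here is a minimal Python sketch of a sanity check that could be run on the input outside of word2vec. It assumes the file is meant to be UTF-8 and that the words fall in the basic Arabic Unicode block (U+0600 to U+06FF). The helper name `arabic_tokens` is hypothetical and is not part of this library. If the text decodes correctly, the Arabic words should come back intact; if it is decoded with the wrong encoding, no Arabic tokens survive, which might resemble the symptom described above.

```python
import re

# Assumption: words consist of characters from the basic Arabic block (U+0600 to U+06FF).
ARABIC_WORD = re.compile(r"[\u0600-\u06FF]+")

def arabic_tokens(raw, encoding="utf-8"):
    """Decode raw bytes with the given encoding and return runs of Arabic characters."""
    # Raises UnicodeDecodeError if the bytes are not valid in this encoding.
    text = raw.decode(encoding)
    return ARABIC_WORD.findall(text)

# Small in-memory sample standing in for the real file contents.
sample = "مرحبا بالعالم 😀 hello".encode("utf-8")

# Decoded as UTF-8, the two Arabic words are recovered; the emoji and Latin text are skipped.
tokens = arabic_tokens(sample)

# Decoded as Latin-1 (a deliberately wrong guess), every byte becomes a character
# below U+0100, so no Arabic tokens are found.
mis_decoded = arabic_tokens(sample, encoding="latin-1")

print(tokens)
print(mis_decoded)
```

For a real file, the bytes could be read with `open(path, "rb").read()` and passed to the same helper. This only checks the input; it does not say how the word2vec function itself reads the file.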