YerevaNN/translit-rnn

Add new language

Opened this issue · 9 comments

How do I add romanized Arabic and Arabic?

Tips are written in the README and the blog post.

We have created a probabilistic mapping, so that each Armenian letter is romanized according to the given probabilities. For example, ծ is replaced by ts in 60% of cases, c in 30% of cases, and & in 10% of cases. The full set of rules is here and can be browsed here.
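A minimal sketch of how such a probabilistic romanization could be sampled, assuming Python 3.6+ and a mapping shaped as described above. The names `rules` and `romanize` are illustrative and not taken from the repository's code; only the ծ probabilities come from the text.

```python
import random

# Illustrative rule table; only the ծ entry is taken from the description above.
rules = {"ծ": {"ts": 0.6, "c": 0.3, "&": 0.1}}

def romanize(text, rules, rng=random):
    """Replace each mapped character with a variant drawn by its probability."""
    out = []
    for ch in text:
        variants = rules.get(ch)
        if variants is None:
            out.append(ch)  # characters without a rule pass through unchanged
        else:
            options = list(variants)
            weights = [variants[o] for o in options]
            out.append(rng.choices(options, weights=weights, k=1)[0])
    return "".join(out)
```

Calling `romanize` repeatedly on the same text yields different romanizations, which is the point: the training data then covers the spelling variants people actually use.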

How do I create a probabilistic mapping for romanized Arabic? I have a comparison table like this: خ is kh/7'/5

If I understand everything correctly, you need to have this line in your transliteration.json:

"خ": {"kh": 0.5, "7'": 0.3, "5": 0.2},
             ^          ^         ^

with the probabilities written in the marked places. You can see it here. I have beautified the JSON there.
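A minimal sketch of how an entry like this could be generated and checked, assuming the file is a JSON object that maps each letter to variant/probability pairs as in the line above. The helper `make_entry` is hypothetical, the weights are guesses, and the requirement that probabilities sum to 1 is an assumption rather than something the repository is known to enforce. Note that JSON has no `\'` escape, so the apostrophe in `7'` is written as-is.

```python
import json
import math

def make_entry(variants, weights):
    """Pair romanized variants with probabilities and check they sum to 1."""
    if len(variants) != len(weights):
        raise ValueError("need one probability per variant")
    if not math.isclose(sum(weights), 1.0, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    return dict(zip(variants, weights))

# "kh/7'/5" is the comparison-table row from the question; the weights are guesses.
mapping = {"خ": make_entry("kh/7'/5".split("/"), [0.5, 0.3, 0.2])}

# ensure_ascii=False keeps the Arabic letter readable in the output.
text = json.dumps(mapping, ensure_ascii=False, indent=2)
print(text)
```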

Thanks.
I created the transliteration.json file and started training the network,
but it is taking a very long time: more than two hours so far, and it has not finished yet.

Log file:

Loading Files
Building Network ...
Compiling Functions ...
Computing Updates ...
WARNING (theano.configdefaults): install mkl with conda install mkl-service: No module named mkl
Training ...

@essamgoda are you using a GPU for training?

@Hrant-Khachatrian
When I use this:

python -u train.py --hdim 1024 --depth 2 --batch_size 200 --seq_len 30 --language hy-AM &> log.txt

my laptop freezes as soon as the log file shows that training has started.

So I used this instead:

python -u train.py --hdim 512 --depth 1 --batch_size 50 --seq_len 10 --language hy-AM &> log.txt

This works, but when I run the test with this:

python -u test.py --hdim 512 --depth 1 --model {MODEL} --language hy-AM

the output is:


Loading Files
Building network ...
Compiling Functions ...
Traceback (most recent call last):
  File "test.py", line 140, in <module>
    main()
  File "test.py", line 127, in main
    f = np.load(args.model)
  File "/home/essam/.local/lib/python2.7/site-packages/numpy/lib/npyio.py", line 370, in load
    fid = open(file, "rb")
IOError: [Errno 2] No such file or directory: '{MODEL}'

@essamgoda you need to specify the model you want to test. In place of {MODEL}, write the path to the saved model; it should be 'languages/hy-AM/models/...'.
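A minimal sketch for locating the most recently saved model file to pass to `--model`, assuming models are written to the directory named above. The helper `latest_model` is hypothetical and not part of the repository.

```python
import glob
import os

def latest_model(models_dir):
    """Return the most recently modified file in models_dir, or None if there is none."""
    pattern = os.path.join(glob.escape(models_dir), "*")
    paths = [p for p in glob.glob(pattern) if os.path.isfile(p)]
    return max(paths, key=os.path.getmtime) if paths else None

# Directory taken from the reply above; adjust it for your language code.
print(latest_model("languages/hy-AM/models"))
```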

@TigranGalstyan okay, thanks, it works. But when I test on a specific file:

python -u test.py --hdim 512 --depth 1 --model 'languages/hy-AM/models/model.hdim512.depth1.seq_len10.bs50.epoch10.0043668122.loss1.71054670304.npz.npy' --language hy-AM --translit_path 't.txt'

it shows:

Loading Files
Building network ...
Compiling Functions ...
Testing ...
0.0% done 

The last lines of the log after training are:

skipped 0
computing validation loss...
validation loss is 2.94090302211
saving to -> languages/hy-AM/models/model.hdim256.depth2.seq_len30.bs100.epoch10.0363901019.loss2.72517724795.npz

When I test with this command:

python -u test.py --hdim 256 --depth 2 --model '/media/essam/New Volume/translitration v1/LSTM/translit-rnn-master/languages/hy-AM/models/model.hdim256.depth2.seq_len30.bs100.epoch10.0363901019.loss2.72517724795.npz.npy' --language hy-AM

the result is:

Loading Files
Building network ...
Compiling Functions ...
Testing ...
Computing editdistance and writing to -> languages/hy-AM/results.model.hdim256.depth2.seq_len30.bs100.epoch10.0363901019.loss2.72517724795.npz.npy

When I use:

python -u test.py --hdim 256 --depth 2 --model '/media/essam/New Volume/translitration v1/LSTM/translit-rnn-master/languages/hy-AM/models/model.hdim256.depth2.seq_len30.bs100.epoch10.0363901019.loss2.72517724795.npz.npy' --language hy-AM --translit_path 't.txt'

the result is:

Loading Files
Building network ...
Compiling Functions ...
Testing ...
0.0% done

@TigranGalstyan @Hrant-Khachatrian