A fork of https://github.com/Kyubyong/g2p, a deep learning seq2seq framework based on TensorFlow.
Notebook: https://colab.research.google.com/drive/1eVFh3336NEDimQPGcrhOul-fmA22UojM
- python 2.x or 3.x
- numpy >= 1.13.1
- tensorflow >= 1.3.0
- nltk >= 3.2.4
- python -m nltk.downloader "averaged_perceptron_tagger" "cmudict"
- inflect >= 0.3.1
- Distance >= 0.1.3
- future >= 0.16.0
- PyThaiNLP
python setup.py install
The nltk package will be downloaded automatically on your first run.
python train.py
from g2p_th import g2p
text = "เราเดินเล่น"
print(g2p(text))
>>>['r', 'a', 'w^', '0', ' ', 'r', 'ee', 'z^', '0', ' ', 'r', 'ee', 'z^', '0']
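The result above is a flat list in which `' '` entries appear to separate groups of phonemes; that reading is an assumption based only on this one example. Under that assumption, a minimal sketch for regrouping the list could look like the following. The helper `split_on_spaces` is hypothetical and is not part of g2p_th.

```python
def split_on_spaces(phonemes):
    """Group a flat phoneme list into chunks, splitting at ' ' entries.

    Assumption: ' ' marks a boundary between phoneme groups.
    """
    groups, current = [], []
    for ph in phonemes:
        if ph == " ":
            # Boundary reached: close the current group if it has content.
            if current:
                groups.append(current)
            current = []
        else:
            current.append(ph)
    # Keep the trailing group, which has no ' ' after it.
    if current:
        groups.append(current)
    return groups


# The sample is copied from the example output above.
sample = ['r', 'a', 'w^', '0', ' ', 'r', 'ee', 'z^', '0', ' ', 'r', 'ee', 'z^', '0']
print(split_on_spaces(sample))
# [['r', 'a', 'w^', '0'], ['r', 'ee', 'z^', '0'], ['r', 'ee', 'z^', '0']]
```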
If you need to convert many texts, you can reuse the global TensorFlow session:
import g2p_th as g2p
texts = ["เราเดินเล่น"]  # the strings to convert
with g2p.Session():
    phs = [g2p.g2p(text) for text in texts]
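The session pattern above can be wrapped in a small helper so that callers do not repeat the `with` block. This is only a sketch: `convert_batch` is a hypothetical name, and it assumes that `g2p.Session` can be called with no arguments to produce a context manager, as the usage above suggests.

```python
from contextlib import nullcontext


def convert_batch(texts, convert, session_factory=nullcontext):
    """Convert every text inside one shared session context.

    convert: callable mapping a string to a list of phonemes
        (for example g2p.g2p, assumed here).
    session_factory: zero-argument callable returning a context manager
        (for example g2p.Session, assumed here); defaults to a no-op.
    """
    with session_factory():
        return [convert(text) for text in texts]
```

With the import alias used above, a call would look like `phs = convert_batch(texts, g2p.g2p, g2p.Session)`. Passing the converter and session as arguments keeps the helper usable without TensorFlow installed, for example in tests with a stub converter.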