Developing a POS tagger for Hindi Corpora
- Bi-LSTM Architecture
- Bi-LSTM + CRF Architecture
- Residual LSTM + ELMo (Testing)
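To illustrate what the CRF layer adds on top of a Bi-LSTM: a linear-chain CRF chooses the best tag sequence as a whole, usually with Viterbi decoding, rather than picking the best tag for each word independently. The sketch below is a pure-Python illustration and is not taken from this repository; the two-tag set and all scores are made-up assumptions.

```python
def viterbi_decode(emissions, transitions):
    """Return the highest-scoring tag index sequence.

    emissions[t][j]   : score of tag j at position t (e.g. Bi-LSTM output)
    transitions[i][j] : score of moving from tag i to tag j (CRF parameters)
    """
    n_tags = len(transitions)
    scores = list(emissions[0])
    backpointers = []
    for emit in emissions[1:]:
        new_scores, pointers = [], []
        for j in range(n_tags):
            # Best previous tag for landing on tag j at this position.
            best_prev = max(range(n_tags),
                            key=lambda i: scores[i] + transitions[i][j])
            pointers.append(best_prev)
            new_scores.append(scores[best_prev] + transitions[best_prev][j] + emit[j])
        scores = new_scores
        backpointers.append(pointers)
    # Follow the backpointers from the best final tag.
    path = [max(range(n_tags), key=lambda j: scores[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical scores for a three-word sentence and two tags.
tags = ["NOUN", "VERB"]
emissions = [[2.0, 0.5], [1.0, 1.5], [0.5, 2.5]]
transitions = [[0.0, 1.0], [1.0, -2.0]]  # VERB -> VERB is penalised

path = viterbi_decode(emissions, transitions)
print([tags[i] for i in path])
```

With these numbers, a per-word argmax would tag the second word as VERB, while the sequence-level decoding avoids the penalised VERB -> VERB transition.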
extract_tags.sh and extract_data are used to extract data from the Hindi corpora.
Currently, Hindi word embeddings trained with fastText are used.
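As a rough sketch of how such embeddings can be read, the snippet below parses the fastText text format (`.vec`): a header line with the vocabulary size and dimension, then one word per line followed by its vector. Whether this project uses the `.vec` or the binary format is not stated here, so the format and the `load_vec` helper are assumptions; the sample vectors are invented.

```python
import io


def load_vec(handle, limit=None):
    """Parse word vectors from a file object in fastText .vec text format."""
    header = handle.readline().split()
    dim = int(header[1])  # header is "<vocab_size> <dim>"
    vectors = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
        if limit is not None and len(vectors) >= limit:
            break
    return dim, vectors


# Tiny in-memory stand-in for a real embeddings file.
sample = io.StringIO("2 3\nघर 0.1 0.2 0.3\nपानी 0.4 0.5 0.6\n")
dim, vectors = load_vec(sample)
print(dim, sorted(vectors))
```

For a real file, open it with `open(path, encoding="utf-8")` and pass the handle to `load_vec`; the `limit` argument keeps memory use down when only the most frequent words are needed.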
train.py contains the implementation of the above architectures.
Requirements:
- Python 3.6
- Keras 2.2.0 - For the creation of BiLSTM-CRF architecture
- TensorFlow 1.8.0 - As backend for Keras (other backends are untested).