iNLTK aims to provide out-of-the-box support for the various NLP tasks that an application developer might need for Indic languages.
Check out the detailed docs, along with installation instructions, at https://inltk.readthedocs.io
Language | Code |
---|---|
Hindi | hi |
Punjabi | pa |
Sanskrit | sa |
Gujarati | gu |
Kannada | kn |
Malayalam | ml |
Nepali | ne |
Odia | or |
Marathi | mr |
Bengali | bn |
Tamil | ta |
Urdu | ur |
English | en |
Note: The English model has been taken directly from fast.ai.
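When wiring these codes into an application, it can help to validate a language before handing it to the library. The following is a minimal sketch that only encodes the table above; the names `SUPPORTED_LANGUAGES`, `language_code`, and `is_supported` are hypothetical helpers for illustration and are not part of iNLTK's API.

```python
# Language names and codes copied from the table above.
# This mapping and the helpers below are illustrative, not part of iNLTK.
SUPPORTED_LANGUAGES = {
    "Hindi": "hi",
    "Punjabi": "pa",
    "Sanskrit": "sa",
    "Gujarati": "gu",
    "Kannada": "kn",
    "Malayalam": "ml",
    "Nepali": "ne",
    "Odia": "or",
    "Marathi": "mr",
    "Bengali": "bn",
    "Tamil": "ta",
    "Urdu": "ur",
    "English": "en",
}


def language_code(name: str) -> str:
    """Return the code for a language name, ignoring case and outer whitespace."""
    key = name.strip().title()
    if key not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language: {name!r}")
    return SUPPORTED_LANGUAGES[key]


def is_supported(code: str) -> bool:
    """Return True if the code appears in the table above."""
    return code.strip().lower() in SUPPORTED_LANGUAGES.values()


if __name__ == "__main__":
    print(language_code("hindi"))
    print(is_supported("te"))
```

A code that passes `is_supported` can then be passed to whichever iNLTK function the application needs; see the docs linked above for the actual function names and signatures.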
If you would like to add support for a language of your choice to iNLTK, please start by checking for, or raising, an issue here
Please check out the steps I mentioned here for Telugu to begin with. They should be nearly the same for other languages.
If you would like to take iNLTK's models and refine them with your own dataset, or build your own custom models on top of them, please check out the repositories in the above table for the language of your choice. Those repositories contain links to the datasets, pretrained models, classifiers, and all of the code used to build them.
If you would like a particular piece of functionality in iNLTK, start by checking for, or raising, an issue here
Shout out if you want to help :)
Shout out if you want to lead :)
- Add NER support for all languages
- Add Textual Entailment support for all languages
- Work on a unified model for all the languages
- POS support in iNLTK
- Add translations - to and from languages in iNLTK + English
- By Jeremy Howard on Twitter
- By Sebastian Ruder on Twitter
- By Vincent Boucher on LinkedIn
- By Kanimozhi, By Soham, By Imaad on LinkedIn
- iNLTK was trending on GitHub in May 2019