Resources and tools for Indian language Natural Language Processing

Primary language: Python. License: MIT.

Indic NLP Library

This repository is a de-bloated fork of the original Indic NLP Library that integrates the UrduHack submodule and Indic NLP Resources directly. This lets you use Urdu normalization and tokenization without installing urduhack and indic_nlp_resources separately, which can sometimes be a problem because urduhack is TensorFlow-based. This repository is mainly created and maintained for IndicTrans2 and IndicTransTokenizer.

For any queries, please get in touch with the original authors/maintainers of the respective libraries.

Usage:

git clone https://github.com/VarunGumma/indic_nlp_library.git

cd indic_nlp_library
pip install --editable ./
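
After the editable install, a quick sanity check can confirm that the package is visible to the Python interpreter. The snippet below is a minimal sketch, not part of this repository: it assumes the fork keeps the upstream import name `indicnlp`, and the helper `is_importable` is a hypothetical name introduced only for illustration.

```python
import importlib.util

# Assumption: the fork keeps the upstream import name `indicnlp`.
PACKAGE_NAME = "indicnlp"


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located on the current path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_importable(PACKAGE_NAME):
        print(f"{PACKAGE_NAME} is importable")
    else:
        print(f"{PACKAGE_NAME} was not found; try re-running the install step")
```

Run it with the same interpreter (or virtual environment) that was used for `pip install`, since an editable install is only visible to that environment.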

Updates:

  • Integrated urduhack directly into the repository.
  • Renamed the master branch to main.
  • Integrated indic_nlp_resources directly into the repository.
  • De-bloated the repository.