Pinned Repositories
datasets
🤗 The largest hub of ready-to-use NLP datasets for ML models with fast, easy-to-use and efficient data manipulation tools
deepcut
A Thai word tokenization library using a deep neural network
dialogues
This codebase provides a unified interface to several dialogue datasets
en-hi-codemixed-corpus
Repository for the English-Hindi Codemixed to Monolingual English Parallel Corpus
greenlight
A really simple end-user interface for your BigBlueButton server.
HashSet
Hinglish-TOP-Dataset
Contains the largest (10K) human-annotated code-switched semantic parsing dataset and 170K utterances generated using the CST5 augmentation technique. Queries are derived from TOPv2, a multi-domain task-oriented semantic parsing dataset. Tests suggest that with CST5, up to 20x less labeled data can achieve the same semantic parsing performance.
hinglishNorm
A Hindi-English Dataset for Text Normalization
NLP-toolkit-as-a-service
CSE 461 Software Engineering Spring 2020 Course Project
PreCogIIITH-HinglishEval-INLG-2022
prashantkodali's Repositories
prashantkodali/HashSet
prashantkodali/NLP-toolkit-as-a-service
CSE 461 Software Engineering Spring 2020 Course Project
prashantkodali/PreCogIIITH-HinglishEval-INLG-2022
prashantkodali/datasets
🤗 The largest hub of ready-to-use NLP datasets for ML models with fast, easy-to-use and efficient data manipulation tools
prashantkodali/deepcut
A Thai word tokenization library using a deep neural network
prashantkodali/dialogues
This codebase provides a unified interface to several dialogue datasets
prashantkodali/en-hi-codemixed-corpus
Repository for the English-Hindi Codemixed to Monolingual English Parallel Corpus
prashantkodali/greenlight
A really simple end-user interface for your BigBlueButton server.
prashantkodali/Hinglish-TOP-Dataset
Contains the largest (10K) human-annotated code-switched semantic parsing dataset and 170K utterances generated using the CST5 augmentation technique. Queries are derived from TOPv2, a multi-domain task-oriented semantic parsing dataset. Tests suggest that with CST5, up to 20x less labeled data can achieve the same semantic parsing performance.
prashantkodali/hinglishNorm
A Hindi-English Dataset for Text Normalization
prashantkodali/humor-detection-corpus
Humor Detection in English-Hindi Code-Mixed Social Media Content
prashantkodali/magnitude
A fast, efficient universal vector embedding utility package.
prashantkodali/Named-Entity-Recognition
Corpus and a baseline neural network system for Named Entity Recognition in Hindi-English Code-Mixed social media text.
prashantkodali/NFHS-5
NFHS-5: National Family Health Survey (2019-20). CSV fact sheets for key indicators from http://rchiips.org/nfhs/
prashantkodali/prashantkodali
Config files for my GitHub profile.
prashantkodali/prashantkodali.github.io
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
prashantkodali/python-twitter-examples
Examples of using Python for Twitter social data mining, using the python-twitter-tools framework.
prashantkodali/Sarcasm_Detection
prashantkodali/shallowparser
Shallow Parser CMST
prashantkodali/st-annotated-text
A simple component to display annotated text in Streamlit apps.
prashantkodali/Word-Level-Language-Identification-in-English-Telugu-Code-Mixed-Data
prashantkodali/wordcloud--roman--devnagri-mix
Generates a word cloud that mixes words written in Roman and Devanagari script.