UnReaL-TecE LLP
We research, build, invent and aim to realise what may seem like Mission Impossible - language technologies for ALL (and not just 22 or 100) languages of India
India
Pinned Repositories
ComMA
Dataset of 20,000 datapoints in Meitei, Bangla and Hindi, richly annotated with different levels of aggression and bias for the ComMA Project.
crawlers
Crawlers for automatically collecting data from different sources
harmpot
This repository contains the dataset, models and other details about the HarmPot (Measuring Harm Potential of Social Media Content in India) Project.
hindi-politeness
A repository of the social media dataset in Hindi, annotated with politeness levels
life
Linguistic Field Data Management and Analysis System [LiFE]
mscrabble
Repository for Multilingual Scrabble Generator and Games - especially aimed towards endangered languages
propaganda
Repository of the data and tools for propaganda identification in Hindi
SpeeD-IA
Repository for different Speech Datasets and Models for Indo-Aryan languages.
SpeeD-IL
Central Repository for the Speech Datasets and Models in Indian Languages (SpeeD-IL) project. Each language family has a separate, dedicated repository linked to this central repository.
TextGridTools
Read, write, and manipulate Praat TextGrid files with Python
UnReaL-TecE LLP's Repositories
unrealtecellp/life
Linguistic Field Data Management and Analysis System [LiFE]
unrealtecellp/mscrabble
Repository for Multilingual Scrabble Generator and Games - especially aimed towards endangered languages
unrealtecellp/SpeeD-IL
Central Repository for the Speech Datasets and Models in Indian Languages (SpeeD-IL) project. Each language family has a separate, dedicated repository linked to this central repository.
unrealtecellp/ComMA
Dataset of 20,000 datapoints in Meitei, Bangla and Hindi, richly annotated with different levels of aggression and bias for the ComMA Project.
unrealtecellp/crawlers
Crawlers for automatically collecting data from different sources
unrealtecellp/harmpot
This repository contains the dataset, models and other details about the HarmPot (Measuring Harm Potential of Social Media Content in India) Project.
unrealtecellp/hindi-politeness
A repository of the social media dataset in Hindi, annotated with politeness levels
unrealtecellp/propaganda
Repository of the data and tools for propaganda identification in Hindi
unrealtecellp/speech-aggression
Repository of data and scripts for the UGC-UKIERI Project on "Automatic Detection of Verbal Threat in Hindi and English Aggressive Speech"
unrealtecellp/SpeeD-IA
Repository for different Speech Datasets and Models for Indo-Aryan languages.
unrealtecellp/TextGridTools
Read, write, and manipulate Praat TextGrid files with Python