kmi-linguistics/vardial2018

This repository contains the dataset used for the Indo-Aryan Language Identification Shared Task, part of the Evaluation Campaign of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial) at COLING 2018. It has 15k sentences each in Awadhi, Bhojpuri, Braj, Magahi and Hindi.

Apache-2.0
