dialect-identification
There are 26 repositories under the dialect-identification topic.
CAMeL-Lab/camel_tools
A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.
instadeepai/tunbert
TunBERT is the first release of a pre-trained BERT model for the Tunisian dialect, trained on a Tunisian Common-Crawl-based dataset. TunBERT was applied to three downstream NLP tasks: Sentiment Analysis (SA), Tunisian Dialect Identification (TDI), and Reading Comprehension Question-Answering (RCQA).
iabufarha/ArSarcasm
This repository contains the Arabic sarcasm dataset (ArSarcasm).
swshon/dialectID_siam
Dialect identification using a Siamese network
qcri/Arabic_speech_code_switching
The first Dialectal Arabic Code Switching (DACS) corpus from broadcast speech. Annotated at the token level, considering both linguistic and acoustic cues. This dataset is a potential benchmark for DCS in spontaneous speech.
iabufarha/ArSarcasm-v2
ArSarcasm-v2 is an extension to the original ArSarcasm dataset. It was used for the shared task on sarcasm detection and sentiment analysis, which is a part of WANLP 2021.
sinaahmadi/CORDI
Language and Speech Technology for Central Kurdish Varieties (LREC-COLING 2024)
hb20007/greek-dialect-classifier
Classifier that identifies Greek text as Cypriot Greek or Standard Modern Greek
a-coles/SMS-Stylometry
A tool that predicts the English dialect of an SMS message using recurrent neural networks supplemented with data from Google Trends.
AlexYangLi/DMT
VarDial19 shared task: Discriminating between Mainland and Taiwan Variation of Mandarin Chinese (DMT)
abdelrahman-wael/Arabic-Dialect-Classification-Nadi-Shared-Task
Uses AraBert to classify different Arabic dialects. Ranked fourth in the WANLP2020 workshop.
kscanne/canuint
A program that performs statistical classification of Irish-language texts according to their dialect
MohamedSebaie/Arabic_Dialect_Identification_NLP-AIM-Task
Arabic_Dialect_Identification_NLP-AIM-Task
telsahy/capstone-35
Twitter Dialect Datasets and Classifiers (GULF Arabic Corpus)
telsahy/capstone-52
Twitter Dialect Datasets and Classifiers (EG + GULF Arabic Corpus)
disooqi/farspeech-website
Web interface for the far-speech demo, to be presented at INTERSPEECH 2019
disooqi/MADAR-shared-task
This shared task will be the first to target a large set of dialect labels at the city and country levels. The data for the shared task is created or collected under the Multi-Arabic Dialect Applications and Resources (MADAR) project.
eesanoble/Arabic-Dialect-Classifier
An Arabic Tweet Dialect Classifier
telsahy/capstone-34
Twitter Dialect Datasets and Classifiers (EG Arabic Corpus)
giacomocamposampiero/italian-dialects-identification
ITDI shared task at VarDial 2022, the 9th Workshop on NLP for Similar Languages, Varieties and Dialects.
GLaDO8/IViE_corpus_british_dialects_classification
Log-MFSC-based classification of British English dialects from the IViE (Intonational Variation in English) corpus
Salma-Jamal/Arabic-Dialect-Identification
Arabic Dialect Identification on NADI 2020 and QADI datasets
30stomercury/IS19_ComParE_Sub-Challenge
[Interspeech19] Computational Paralinguistics ChallengE (ComParE)
motazsaad/ArbDialectID
Arabic Dialects Identification