code-mixing
There are 22 repositories under the code-mixing topic.
gentaiscool/code-switching-papers
A curated list of research papers and resources on code-switching
microsoft/CodeMixed-Text-Generator
This tool automatically generates grammatically valid synthetic code-mixed data using linguistic theories such as Equivalence Constraint Theory and Matrix Language Theory.
microsoft/LID-tool
A word-level language identification tool that labels the language of each word in code-mixed text, for example text that mixes English with Hindi written in Roman script.
praatibhsurana/Hinglish_Hindi_WSD
A pipeline for transliteration, spell correction, POS tagging, and word sense disambiguation of Hinglish code-mixed data, converting it to Hindi in Devanagari script.
salesforce/adversarial-polyglots
Code for the paper "Code-Mixing on Sesame Street: Dawn of the Adversarial Polyglots" (NAACL-HLT 2021)
aparnadutta/code-mixed-lid
Word-level language identification for Bangla-English code-mixed social media data, using a BiLSTM with subword embeddings.
ash-shar/Code-Switching-and-Swearing-Patterns-on-Twitter
Repository containing code for abusive tweet detection, location detection, and gender detection
LCS2-IIITD/HIT-ACL2021-Codemixed-Representation
Source code for HIT: A Hierarchically Fused Deep Attention Network for Robust Code-mixed Language Representation (accepted at ACL 2021)
mmaguero/josa-corpus
Jopara (Guarani-dominant mixed with Spanish) sentiment analysis corpus
gulabpatel/Code-Mixing
Discusses the evolution of code-mixing algorithms
ir-nlp-csui/id-en-code-mixed
Indonesian-English code-mixed Twitter dataset
ayanc18/PsycholinguisticCodeMixing
Psycholinguistic Analysis of Code Mixing. Speech and Natural Language Processing term project (CS60057), Department of Computer Science and Engineering, Indian Institute of Technology Kharagpur
Lidan0241/language-detection
Language detection in code-switched text for es/en/zh speakers
Bernardbyy/BahasaRojakSentimentAnalysis
Handling out-of-vocabulary (OOV) words in Bahasa Rojak (a Malaysian code-mixed language) and performing sentiment analysis using a downstreamed XLM-R model
carexl8/code-mixed-tweets
Tweet IDs for code-mixed Russian-German and Russian-Hebrew tweets
MuhammedFahd/Depression-Detection-in-Singlish-text
A depression detection system for Sinhala-English code-mixed text posted by users on social media. The frontend was built with Bootstrap, HTML, and jQuery, and the backend with Flask.
Anwarvic/truel_bilingual_nmt
The official code for the "True Bilingual NMT" paper
Nexdata-AI/300-Person-Mandarin-Chinese-and-English-Bilingual-Spontaneous-Monologue-smartphone
300-person Mandarin Chinese and English bilingual spontaneous monologue (smartphone)
vcyrot/Frenglish-Benchmark
A Centralized Frenglish Benchmark from Naturally Occurring Code-Switching and Code-Mixing