code-mixing

There are 31 repositories under the code-mixing topic.

gentaiscool/code-switching-papers
A curated list of research papers and resources on code-switching
324 24 640
microsoft/CodeMixed-Text-Generator
This tool automates the generation of grammatically valid synthetic code-mixed data using linguistic theories such as Equivalence Constraint Theory and Matrix Language Theory.
Language: Jupyter Notebook | 56 7 612
microsoft/LID-tool
A word-level language identification tool that identifies the language of individual words in code-mixed text, e.g. text that mixes Hindi written in Roman script with English.
Language: Python | 55 8 210
praatibhsurana/Hinglish_Hindi_WSD
A pipeline for transliteration, spell correction, POS tagging, and word sense disambiguation of Hinglish code-mixed data into Hindi Devanagari script.
Language: Python | 35 4 18
sumanbanerjee1/Code-Mixed-Dialog
Language: Python | 33 5 07
aparnadutta/code-mixed-lid
Word-level language identification for Bangla-English code-mixed social media data, using a BiLSTM with subword embeddings.
Language: Python | 10 2 01
cisnlp/MaskLID
💬 MaskLID: Code-Switching Language Identification through Iterative Masking -- ACL 2024
Language: Python | 10 8 02
salesforce/adversarial-polyglots
Code for the paper "Code-Mixing on Sesame Street: Dawn of the Adversarial Polyglots" (NAACL-HLT 2021)
Language: Python | 10 4 07
ash-shar/Code-Switching-and-Swearing-Patterns-on-Twitter
Repository containing code for abusive tweet detection, location detection, and gender detection
Language: Python | 7 3 02
LCS2-IIITD/HIT-ACL2021-Codemixed-Representation
This repo contains the source code of HIT: A Hierarchically Fused Deep Attention Network for Robust Code-mixed Language Representation (accepted at ACL 2021)
Language: Python | 6 3 05
mmaguero/josa-corpus
Jopara (Guarani-dominant mixed with Spanish) sentiment analysis corpus
6 1 00
andrianllmm/tagLID
A word-level Language Identification (LID) tool for Tagalog-English (Taglish) text
Language: Python | 2 1 00
gulabpatel/Code-Mixing
Discusses the evolution of code-mixing algorithms
Language: Jupyter Notebook | 2 1 0
ir-nlp-csui/id-en-code-mixed
Indonesian-English code-mixed Twitter dataset
2 0 00
ayanc18/PsycholinguisticCodeMixing
Psycholinguistic Analysis of Code Mixing - Speech and Natural Language Processing Term Project: CS60057. Department of Computer Science and Engineering, Indian Institute of Technology Kharagpur
Language: Python | 1 4 01
Lidan0241/language-detection
A language detection model for code-switched texts in es/en/zh
Language: Jupyter Notebook | 1 1 01
poornagurram/code_mixing_sentiment
Language: Python | 1 0 01
Wei-RongRong2/RojakLanguageSentimentAnalysis
This is a machine learning project focused on analysing and classifying sentiments in code-switched and code-mixed text, specifically targeting the unique linguistic characteristics found in Malaysian conversations.
Language: Jupyter Notebook | 1 1 00
Bernardbyy/BahasaRojakSentimentAnalysis
Handling Bahasa Rojak (Malaysian Code Mixing Language) OOV and performing Sentiment Analysis using downstreamed XLM-R
Language: Jupyter Notebook | 0 1 01
carexl8/code-mixed-tweets
Tweet ids for code-mixed Russian-German and Russian-Hebrew tweets
0 1 00
jessicasaikia/bidirectional-long-short-term-memory-BiLSTM
This repository implements a Bidirectional Long Short-Term Memory (BiLSTM) model for Part-of-Speech (POS) tagging on Assamese-English code-mixed texts.
Language: Python | 0 1 00
jessicasaikia/conditional-random-field-CRF
This repository implements a Conditional Random Field (CRF) model for Part-of-Speech (POS) tagging on Assamese-English code-mixed texts.
Language: Python | 0 1 00
jessicasaikia/hidden-markov-model-HMM
This repository implements a Hidden Markov Model (HMM) for Part-of-Speech (POS) tagging on Assamese-English code-mixed texts.
Language: Python | 0 1 00
jessicasaikia/long-short-term-memory-LSTM
This repository implements a Long Short-Term Memory (LSTM) model for Part-of-Speech (POS) tagging on Assamese-English code-mixed texts.
Language: Python | 0 1 00
jessicasaikia/multilingual-BERT-mBERT
This repository implements a Multilingual BERT (mBERT) model for Part-of-Speech (POS) tagging on Assamese-English code-mixed texts.
Language: Python | 0 1 00
jessicasaikia/rule-based
This repository contains a simple rule-based model for Part-of-Speech (POS) tagging on Assamese-English code-mixed texts.
Language: Python | 0 1 00
MuhammedFahd/Depression-Detection-in-Singlish-text
A depression detection system for Sinhala-English code-mixed text published by users on social media. The frontend was built with Bootstrap, HTML, and jQuery, and the backend with Flask.
0 1 00
vcyrot/Frenglish-Benchmark
A Centralized Frenglish Benchmark from Naturally Occurring Code-Switching and Code-Mixing
0 1 00
Anwarvic/truel_bilingual_nmt
The official code for the "True Bilingual NMT" paper
Language: Python | 1 0
kmi-linguistics/Code-mixing
3 0
Nexdata-AI/300-Person-Mandarin-Chinese-and-English-Bilingual-Spontaneous-Monologue-smartphone
300-Person-Mandarin-Chinese-and-English-Bilingual-Spontaneous-Monologue-smartphone
1 0
