language-identification
There are 153 repositories under the language-identification topic.
googlesamples/mlkit
A collection of sample apps to demonstrate how to use Google's ML Kit APIs on Android and iOS
modelscope/3D-Speaker
A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization
pemistahl/lingua-py
The most accurate natural language detection library for Python, suitable for short text and mixed-language text
pemistahl/lingua-go
The most accurate natural language detection library for Go, suitable for short text and mixed-language text
pemistahl/lingua-rs
The most accurate natural language detection library for Rust, suitable for short text and mixed-language text
pemistahl/lingua
The most accurate natural language detection library for Java and the JVM, suitable for long and short text alike
echogarden-project/echogarden
Cross-platform speech toolset, used from the command-line or as a Node.js library. Includes a variety of engines for speech synthesis, speech recognition, forced alignment, speech translation, voice isolation, language detection and more.
apcode/tensorflow_fasttext
Simple embedding-based text classifier inspired by fastText, implemented in TensorFlow
textpipe/textpipe
Textpipe: clean and extract metadata from text
LlmKira/fast-langdetect
⚡️ 80x faster Fasttext language detection out of the box | Split text by language
vunb/vntk
Vietnamese NLP Toolkit for Node
adbar/simplemma
Simple multilingual lemmatizer for Python, especially useful for speed and efficiency
cisnlp/GlotLID
💬 Language Identification with Support for More Than 2000 Labels -- EMNLP 2023
HPI-DeepLearning/crnn-lid
Code for the paper Language Identification Using Deep Convolutional Recurrent Neural Networks
KrishnaDN/x-vector-pytorch
Implementation of the paper "Spoken Language Recognition using X-vectors" in PyTorch
SpeechFlow-io/Spoken_language_identification
TensorFlow-based spoken language identification
nitotm/efficient-language-detector-js
Fast and accurate natural language detection. Detector written in JavaScript. Nito-ELD, ELD.
DoodleBears/split-lang
✨ Split text by languages (e.g. 你喜欢看アニメ吗 -> 你喜欢看 | アニメ | 吗) for NLP tasks (e.g. parse, TTS). Powered by fasttext and budoux
adbar/py3langid
Faster, modernized fork of the language identification tool langid.py
microsoft/LID-tool
Word-level language identification tool for code-mixed text, i.e. text that mixes words from two languages, such as Hindi written in Roman script mixed with English.
swshon/dialectID_e2e
End to End Dialect Identification using Convolutional Neural Network
nitotm/efficient-language-detector
Fast and accurate natural language detection. Detector written in PHP. Nito-ELD, ELD.
py-lidbox/lidbox
End-to-end spoken language identification out of the box.
currentslab/fastlangid
fastlangid, the only language identification package that supports Cantonese (zh-yue), Simplified Chinese (zh-hans) and Traditional Chinese (zh-hant)
zkmkarlsruhe/language-identification
Spoken Language Identification on Common Voice and AudioSet using Deep Learning
rosette-api/python
Babel Street Analytics Client Library for Python
mbanon/fastspell
Targeted language identifier, based on FastText and Hunspell.
sagorbrur/codeswitch
CodeSwitch is an NLP tool for language identification, POS tagging, named entity recognition, and sentiment analysis of code-mixed data.
UBC-NLP/afrolid
AfroLID, a powerful neural toolkit for African language identification that covers 517 African languages.
hiredscorelabs/seqtolang
Multi-Language Identification
nipunmanral/Spoken-Language-Identification
Implement a GRU/LSTM model using Keras, and train it to classify the languages using MFCC features
smola/language-dataset
Dataset for programming language identification.
dataiku/dss-plugin-nlp-preparation
Dataiku DSS plugin to detect languages, correct misspellings, and clean text data 🧼
floydhub/language-identification-template
Detect the languages from short pieces of text
cisnlp/GlotCC
🕸 GlotCC Dataset and Pipeline -- NeurIPS 2024