dadelani's Stars
facebookresearch/seamless_communication
Foundational Models for State-of-the-Art Speech and Text Translation
xenova/transformers.js
State-of-the-art Machine Learning for the web. Run 🤗 Transformers directly in your browser, with no need for a server!
huggingface/optimum
🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy-to-use hardware optimization tools
THUDM/P-tuning-v2
An optimized deep prompt tuning strategy comparable to fine-tuning across scales and tasks
bazingagin/npc_gzip
Code for Paper: “Low-Resource” Text Classification: A Parameter-Free Classification Method with Compressors
epfLLM/meditron
Meditron is a suite of open-source medical Large Language Models (LLMs).
pbelcak/UltraFastBERT
The repository for the code of the UltraFastBERT paper
Unbabel/COMET
A Neural Framework for MT Evaluation
facebookresearch/belebele
Repo for the Belebele dataset, a massively multilingual reading comprehension dataset.
achernodub/targer
BiLSTM-CNN-CRF tagger
LasseRegin/medical-question-answer-data
Medical question and answer dataset gathered from the web.
ylacombe/finetune-hf-vits
Finetune VITS and MMS using HuggingFace's tools
google-research/mt-metrics-eval
Tools for evaluating the performance of MT metrics on data from recent WMT metrics shared tasks.
FreedomIntelligence/MultilingualSIFT
MultilingualSIFT: Multilingual Supervised Instruction Fine-tuning
MarsPanther/Amharic-English-Machine-Translation-Corpus
Amharic-English machine translation corpus prepared through website crawling and custom preprocessing.
ehsanasgari/1000Langs
Creating super-parallel corpora of more than 1,500 unique languages for NLP research
asahi417/lm-vocab-trimmer
Vocabulary Trimming (VT) is a model compression technique that reduces a multilingual LM vocabulary to a target language by deleting irrelevant tokens. This repository contains a Python library, vocabtrimmer, that removes tokens irrelevant to the target language from a multilingual LM vocabulary.
konstantinjdobler/focus
[EMNLP 2023] Official Code for "FOCUS: Effective Embedding Initialization for Monolingual Specialization of Multilingual Models"
masakhane-io/masakhane-news
MasakhaNEWS: News Topic Classification for African Languages
stefan-it/xlm-v-experiments
Experiments for XLM-V Transformers Integration
castorini/AfriTeVa-keji
mainlp/How-to-distill-your-BERT
Code for the paper: How to Distill your BERT: An Empirical Study on the Impact of Weight Initialisation and Distillation Objectives (ACL 2023)
wulinjuan/Struct-XLM
Code for the EMNLP 2023 paper "Struct-XLM: A Structure Discovery Multilingual Language Model for Enhancing Cross-lingual Transfer through Reinforcement Learning".
cisnlp/Taxi1500
gpengzhi/CrossConST-SR
Code for EMNLP 2023 industry track paper "Learning Multilingual Sentence Representations with Cross-lingual Consistency Regularization"
ZindiAfrica/Natural-Language-Processing-NLP
Machine Translation, ASR, Sentiment Analysis, Classification solutions
parovicm/unified-xlt
ashatilov/zindi_masakhane_pos
Code for Lacuna Masakhane Parts of Speech Classification Challenge
zsquaredz/joint_multilingual_analysis
This repository contains code for the paper "A Joint Matrix Factorization Analysis of Multilingual Representations", which appears in Findings of EMNLP 2023
mo-arvan/scholarly-metadata