low-resource-machine-translation

There are 14 repositories under the low-resource-machine-translation topic.

  • csebuetnlp/banglanmt

    This repository contains the code and data of the paper titled "Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Datasets for Bengali-English Machine Translation" published in Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020), November 16 - November 20, 2020.

    Language: Python
  • cambridgeltl/ContrastiveBLI

    Improving Word Translation via Two-Stage Contrastive Learning (ACL 2022). Keywords: Bilingual Lexicon Induction, Word Translation, Cross-Lingual Word Embeddings.

    Language: Python
  • Kartikaggarwal98/Indian_ParallelCorpus

    Curated list of publicly available parallel corpora for Indian languages

  • yaoyiran/BLI-Reading-List

    A 2024 Reading List for Bilingual Lexicon Induction (BLI) / Word Translation. Frequently Updated.

    Language: Python
  • Pzoom522/L1-Refinement

    Code for "Cross-Lingual Word Embedding Refinement by ℓ1 Norm Optimisation" (NAACL 2021)

    Language: Python
  • cambridgeltl/BLICEr

    Improving Bilingual Lexicon Induction with Cross-Encoder Reranking (Findings of EMNLP 2022). Keywords: Bilingual Lexicon Induction, Word Translation, Cross-Lingual Word Embeddings.

    Language: Python
  • clefourrier/CopperMT

    [ACL 2021, Findings] Cognate Prediction Per Machine Translation

    Language: JavaScript
  • machelreid/afromt

    Code for the EMNLP 2021 Paper "AfroMT: Pretraining Strategies and Reproducible Benchmarks for Translation of 8 African Languages" by Machel Reid, Junjie Hu, Graham Neubig, Yutaka Matsuo

    Language: Python
  • cambridgeltl/prompt4bli

    On Bilingual Lexicon Induction with Large Language Models (EMNLP 2023). Keywords: Bilingual Lexicon Induction, Word Translation, Large Language Models, LLMs.

    Language: Python
  • HenningBuhl/low-resource-machine-translation

    This repository is an open-source collection of various low-resource machine translation experiments.

    Language: Python
  • andrea-cavallo-98/Low-resource-Machine-Translation

    Multilingual finetuning of a machine translation model on low-resource languages. Project for the Deep Natural Language Processing course.

    Language: Jupyter Notebook
  • cambridgeltl/sail-bli

    Self-Augmented In-Context Learning for Unsupervised Word Translation (ACL 2024). Keywords: Bilingual Lexicon Induction, Word Translation, Large Language Models, LLMs.

    Language: Python
  • harshitadd/indicOCR

    Low-Resource OCR

    Language: Jupyter Notebook
  • steventan0110/ParaCrawl

    In-development bitext mining tool for low-resource languages

    Language: Shell