parallel-corpora
There are 17 repositories under the parallel-corpora topic.
bitextor/bitextor
Bitextor generates translation memories from multilingual websites
csebuetnlp/banglanmt
This repository contains the code and data of the paper titled "Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Datasets for Bengali-English Machine Translation" published in Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020), November 16 - November 20, 2020.
tsuruoka-lab/BSD
The Business Scene Dialogue corpus
Kartikaggarwal98/Indian_ParallelCorpus
Curated list of publicly available parallel corpora for Indian languages
timarkh/tsakorpus
Yet another search platform for linguistic corpora.
korenyoni/opus-api
OPUS (opus.nlpl.eu) Python3 API
Giuseppe-Della-Corte/IESTAC
A corpus that can be used to train English-to-Italian End-to-End Speech-to-Text Machine Translation models
tsuruoka-lab/AMI-Meeting-Parallel-Corpus
AMI Meeting Parallel Corpus
rggdmonk/hadal
A simple and efficient tool for mining and aligning sentences with pre-trained models.
gederajeg/constructional-equivalence
Repository of supplementary materials and RStudio project for the paper on a corpus-based approach to measuring constructional equivalence.
czcorpus/ictools
A program for calculating corpora alignments using a pivot language
npedrazzini/parallelbibles
Word-alignment models for Bible translations in 100+ historical and contemporary languages
gederajeg/rob-steal-parallel-corpora
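Repository of R code and data for the analysis in the research project titled "MODEL KAJIAN TERJEMAHAN BERBASIS BANK DATA TERJEMAHAN DIGITAL INGGRIS-INDONESIA DAN IMPLIKASI PEDAGOGISNYA" (A Translation Studies Model Based on an English-Indonesian Digital Translation Data Bank and Its Pedagogical Implications)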
Nexdata-AI/1990000-Groups-Chinese-Czech-Parallel-Corpus-Data
1,990,000 groups of Chinese-Czech parallel corpus data
techiaith/alinio
Code for simplifying text alignment with hunalign, and documentation on how to use LFAligner