speech-translation

There are 55 repositories under the speech-translation topic.

NVIDIA/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Language: Python | 12.6k 210 2.3k 2.6k
PaddlePaddle/PaddleSpeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
Language: Python | 11.3k 185 1.9k 1.9k
espnet/espnet
End-to-End Speech Processing Toolkit
Language: Python | 8.6k 177 2.4k 2.2k
huggingface/speech-to-speech
Speech To Speech: an effort for an open-sourced and modular GPT4-o
Language: Python | 3.6k 45 92389
microsoft/SpeechT5
Unified-Modal Speech-Text Pre-Training for Spoken Language Processing
Language: Python | 1.2k 24 90117
ictnlp/StreamSpeech
StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
Language: Python | 985 13 1575
zhangshaolei1998/Awesome-Simultaneous-Translation
Paper list of simultaneous translation / streaming translation, including text-to-text machine translation and speech-to-text translation.
571 27 1 7
Dadangdut33/Speech-Translate
A realtime speech transcription and translation application using Whisper OpenAI and free translation API. Interface made using Tkinter. Code written fully in Python.
Language: Python | 524 18 6764
double22a/speech_dataset
The dataset of Speech Recognition
394 9 376
echogarden-project/echogarden
Cross-platform speech toolset, used from the command-line or as a Node.js library. Includes a variety of engines for speech synthesis, speech recognition, forced alignment, speech translation, voice isolation, language detection and more.
Language: TypeScript | 256 5 7826
kahne/SpeechTransProgress
Tracking the progress in end-to-end speech translation
255 27 225
MooreThreads/MooER
MooER: Moore-threads Open Omni model for speech-to-speech intERaction. MooER-omni includes a series of end-to-end speech interaction models along with training and inference code, covering but not limited to end-to-end speech interaction, end-to-end speech translation and speech recognition.
Language: Python | 179 8 2212
dqqcasia/awesome-speech-translation
177 13 0 1
bzhangGo/zero
Zero -- A neural machine translation system
Language: Python | 150 5 819
ReneeYe/ConST
code for paper "Cross-modal Contrastive Learning for Speech Translation" (NAACL 2022)
Language: Python | 62 2 8 6
ictnlp/DASpeech
Code for NeurIPS 2023 paper "DASpeech: Directed Acyclic Transformer for Fast and High-quality Speech-to-Speech Translation".
Language: Python | 61 4 6 5
hlt-mt/FBK-fairseq
Repository containing the open source code of works published at the FBK MT unit.
Language: Python | 42 6 7 1
mt-upc/SHAS
SHAS: Approaching optimal Segmentation for End-to-End Speech Translation
Language: Python | 37 6 5 4
ictnlp/STEMM
Code for ACL 2022 main conference paper "STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation".
Language: Python | 36 2 5 7
Rongjiehuang/awesome-speech-to-speech-translation
List of direct speech-to-speech translation papers.
36 5 0 2
ictnlp/DiSeg
Source code for ACL 2023 paper "End-to-End Simultaneous Speech Translation with Differentiable Segmentation"
Language: Python | 33 3 2 2
George0828Zhang/torch_cif
A fast parallel PyTorch implementation of the "CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition" https://arxiv.org/abs/1905.11235.
Language: Python | 32 3 1 3
liamdugan/speech-to-speech
Code for the INTERSPEECH 2023 paper "Learning When to Speak: Latency and Quality Trade-offs for Simultaneous Speech-to-Speech Translation with Offline Models"
Language: Python | 28 2 0 6
ictnlp/ComSpeech
Code for ACL 2024 main conference paper "Can We Achieve High-quality Direct Speech-to-Speech Translation Without Parallel Speech Data?".
Language: Python | 24 6 3 6
mt-upc/ZeroSwot
Pushing the Limits of Zero-shot End-to-End Speech Translation
Language: Python | 24 12 0 3
George0828Zhang/simulst
PyTorch toolkit for streaming speech recognition, speech translation and simultaneous translation based on fairseq.
Language: Python | 22 3 1 3
ReneeYe/XSTNet
This is an implementation of paper "End-to-end Speech Translation via Cross-modal Progressive Training" (Interspeech2021)
Language: Python | 20 2 2 3
VinAIResearch/PhoST
A High-Quality and Large-Scale Dataset for English-Vietnamese Speech Translation (INTERSPEECH 2022)
20 3 0 2
KevKibe/African-Whisper
🚀 Framework for seamless fine-tuning of Whisper model on a multi-lingual dataset and deployment to prod.
Language: Python | 19 2 213
ictnlp/CRESS
Code for ACL 2023 main conference paper "Understanding and Bridging the Modality Gap for Speech Translation".
Language: Python | 17 3 7 2
ictnlp/ITST
Code for EMNLP 2022 main conference paper "Information-Transport-based Policy for Simultaneous Translation"
Language: Python | 14 1 1 2
ictnlp/BT4ST
Code for ACL 2023 main conference paper "Back Translation for Speech-to-text Translation Without Transcripts".
Language: Python | 13 3 3 2
bzhangGo/st_from_scratch
Revisiting End-to-End Speech-to-Text Translation From Scratch
Language: Python | 12 1 0 4
Giuseppe-Della-Corte/IESTAC
A corpus that can be used to train English-to-Italian End-to-End Speech-to-Text Machine Translation models
11 1 0 1
xuchennlp/S2T
The project for speech translation
Language: Python | 11 2 1 2
JeffWang0325/Microsoft-Azure-Cognitive-Services
🖍️ This project combines multiple operations in Microsoft Azure Cognitive Services into one GUI, including QnA Maker, LUIS, Computer Vision, Custom Vision, Face, Form Recognizer, Text To Speech, Speech To Text and Speech Translation. It's very user-friendly for users to implement any operation mentioned above.
Language: C# | 9 2 0 6
