dariadiatlova

voice dl researcher

@deepvk, Saint-Petersburg

dariadiatlova's Stars

Textualize/rich
Rich is a Python library for rich text and beautiful formatting in the terminal.
Language: Python 49.4k 537 1.3k 1.7k
facebookresearch/seamless_communication
Foundational Models for State-of-the-Art Speech and Text Translation
Language: Jupyter Notebook 10.9k 140 357 1.1k
netease-youdao/EmotiVoice
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
Language: Python 7.4k 62 152630
open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
Language: Jupyter Notebook 7k 76 183517
metavoiceio/metavoice-src
Foundational model for human-like, expressive TTS
Language: Python 3.9k 78 127657
Stability-AI/stable-audio-tools
Generative models for conditional audio generation
Language: Python 2.7k 43 96253
ddlBoJack/emotion2vec
[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation
Language: Python 619 15 4343
lucidrains/voicebox-pytorch
Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in Pytorch
Language: Python 608 47 2551
TaoRuijie/ECAPA-TDNN
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when train only in Vox2)
Language: Python 603 4 82114
facebookresearch/textlesslib
Library for Textless Spoken Language Processing
Language: Python 528 16 2451
audeering/w2v2-how-to
How to use our public wav2vec2 dimensional emotion model
Language: Jupyter Notebook 454 9 1647
facebookresearch/SONAR
SONAR, a new multilingual and multimodal fixed-size sentence embedding space, with a full suite of speech and text encoders and decoders.
Language: Python 338 14 1934
X-LANCE/VoiceFlow-TTS
[ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"
Language: Python 306 15 1721
p0p4k/pflowtts_pytorch
Unofficial implementation of NVIDIA P-Flow TTS paper
Language: Python 217 14 4230
jishengpeng/Languagecodec
Language-Codec: Reducing the Gaps Between Discrete Codec Representation and Speech Language Models
Language: Python 208 8 716
keonlee9420/DailyTalk
Official repository of DailyTalk: Spoken Dialogue Dataset for Conversational Text-to-Speech, ICASSP 2023
Language: Python 201 8 313
google-research-datasets/cvss
CVSS: A Massively Multilingual Speech-to-Speech Translation Corpus
181 13 214
corl-team/rebased
Official implementation of the paper "Linear Transformers with Learnable Kernel Functions are Better In-Context Models"
Language: Python 156 5 43
theodorblackbird/lina-speech
lina-speech : linear attention based text-to-speech
Language: Jupyter Notebook 125 12 810
X-LANCE/UniCATS-CTX-vec2wav
[AAAI 2024] Code for CTX-vec2wav in UniCATS
Language: Python 122 10 916
nii-yamagishilab/ZMM-TTS
ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations
Language: C 120 5 69
shang0712/HierTTS
Language: Python 44 7 310
deepvk/NISQA-s
Language: Python 35 2 20
ECNU-Cross-Innovation-Lab/ShiftSER
[ICASSP 2023] Mingling or Misalignment? Temporal Shift for Speech Emotion Recognition with Pre-trained Representations
Language: Python 34 2 22
nivibilla/efficient-vits-finetuning
Finetuning VITS Efficiently
Language: Python 32 4 36
Lallapallooza/fast-audiomentations
⚡ Blazing fast audio augmentation in Python, powered by GPU for high-efficiency processing in machine learning and audio analysis tasks.
Language: Python 31 3 01
EMOsuperb/EMO-SUPERB-submission
EMO-SUPERB submission
Language: Python 28 4 02
HappyColor/Vesper
A Compact and Effective Pretrained Model for Speech Emotion Recognition
Language: Python 27 3 41
msplabresearch/MSP-Podcast_Challenge
MSP-Podcast Challenge Baseline Code
Language: Python 13 2 35
deepvk/muse
🎵 muse: Music Separation
Language: Python 10 3 01
