dariadiatlova

voice dl researcher

@deepvk, Saint-Petersburg

dariadiatlova's Stars

wenet-e2e/speech-synthesis-paper
List of speech synthesis papers.
989 stars · 120 forks
enhuiz/vall-e
An unofficial PyTorch implementation of the audio LM VALL-E
Language: Python · 2.9k stars · 417 forks
borisshapa/inception-v3-numpy
Implementation of the popular network Inception v3 on Numpy. Implementation of the AdaSmooth optimizer. Comparison of optimizers on Cars dataset.
Language: Jupyter Notebook · 1 star
HumanSignal/label-studio
Label Studio is a multi-type data labeling and annotation tool with standardized output format
Language: JavaScript · 18.2k stars · 2.3k forks
microsoft/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Language: Python · 19.5k stars · 2.5k forks
keonlee9420/Comprehensive-Transformer-TTS
A Non-Autoregressive Transformer based Text-to-Speech, supporting a family of SOTA transformers with supervised and unsupervised duration modelings. This project grows with the research community, aiming to achieve the ultimate TTS
Language:Python31841
salute-developers/golos
Language:Python11312
microsoft/CLAP
Learning audio concepts from natural language supervision
Language:Python45735
lingjzhu/CharsiuG2P
Multilingual G2P in 100 languages
Language:Jupyter Notebook27424
deepvk/vitrina
👀 VITRina: VIsual Token Representations
Language:Python103
MasayaKawamura/MB-iSTFT-VITS
Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform
Language:Python41264
diff-usion/Awesome-Diffusion-Models
A collection of resources and papers on Diffusion Models
Language: HTML · 10.8k stars · 929 forks
audeering/opensmile
The Munich Open-Source Large-Scale Multimedia Feature Extractor
Language:C++57075
chrisdonahue/sheetsage
Transcribe music into lead sheets!
Language:Python29264
Howuhh/sac-n-jax
Single-file SAC-N implementation on jax with flax and equinox. 10x faster than pytorch
Language:Python463
huggingface/diffusion-models-class
Materials for the Hugging Face Diffusion Models Course
Language: Jupyter Notebook · 3.5k stars · 379 forks
tsurumeso/vocal-remover
Vocal Remover using Deep Neural Networks
Language: Python · 1.5k stars · 221 forks
magenta/music-spectrogram-diffusion
Language:Jupyter Notebook38027
POZAlabs/ComMU-code
[NeurIPS'22] Official code of "ComMU: Dataset for Combinatorial Music Generation"
Language:Python13926
YatingMusic/remi
"Pop Music Transformer: Beat-based Modeling and Generation of Expressive Pop Piano Compositions", ACM Multimedia 2020
Language:Python54685
keonlee9420/Expressive-FastSpeech2
PyTorch Implementation of Non-autoregressive Expressive (emotional, conversational) TTS based on FastSpeech2, supporting English, Korean, and your own languages.
Language:Python27648
hujinsen/pytorch-StarGAN-VC
Fully reproduces the StarGAN-VC paper. Stable training and better audio quality.
Language:Python24457
suzuki256/dog-dataset
Language:Python401
tinkoff-ai/palbert
Code for the paper "PALBERT: Teaching ALBERT to Ponder", NeurIPS 2022 Spotlight
Language:Python371
microsoft/muzic
Muzic: Music Understanding and Generation with Artificial Intelligence
Language: Python · 4.5k stars · 434 forks
openai/jukebox
Code for the paper "Jukebox: A Generative Model for Music"
Language: Python · 7.8k stars · 1.4k forks
maum-ai/phaseaug
ICASSP 2023 Accepted
Language:Python18914
pyannote/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Language: Jupyter Notebook · 5.9k stars · 754 forks
CompVis/latent-diffusion
High-Resolution Image Synthesis with Latent Diffusion Models
Language: Jupyter Notebook · 11.5k stars · 1.5k forks
openai/guided-diffusion
Language: Python · 6.1k stars · 808 forks
