julius-richter

PhD student at Universität Hamburg working on deep generative models for speech enhancement.

Hamburg, Berlin

julius-richter's Stars

probml/pml-book
"Probabilistic Machine Learning" - a book series by Kevin Murphy
Language: Jupyter Notebook
5.1k 88 660601
haoheliu/AudioLDM
AudioLDM: Generate speech, sound effects, music and beyond, with text.
Language: Python
2.5k 44 111227
state-spaces/s4
Structured state space sequence models
Language: Jupyter Notebook
2.5k 53 139305
gnobitab/RectifiedFlow
Official Implementation of Rectified Flow (ICLR2023 Spotlight)
Language: Python
1.1k 11 2662
baofff/U-ViT
A PyTorch implementation of the paper "All are Worth Words: A ViT Backbone for Diffusion Models".
Language: Jupyter Notebook
957 12 2865
facebookresearch/av_hubert
A self-supervised learning framework for audio-visual speech
Language: Python
865 15 111138
shivammehta25/Matcha-TTS
[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching
Language: Jupyter Notebook
829 17 89105
krantiparida/awesome-audio-visual
A curated list of different papers and datasets in various areas of audio-visual processing
686 18 268
Yuan-ManX/ai-audio-datasets
AI Audio Datasets (AI-ADS) 🎵, including Speech, Music, and Sound Effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio applications.
596 14 245
sp-uhh/sgmse
Score-based Generative Models (Diffusion Models) for Speech Enhancement and Dereverberation
Language: Python
557 12 6477
researchmm/MM-Diffusion
[CVPR'23] MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation
Language: Python
409 6 2223
celebv-text/CelebV-Text
(CVPR 2023) CelebV-Text: A Large-Scale Facial Text-Video Dataset
Language: Python
394 13 2733
ruizhecao96/CMGAN
Conformer-based Metric GAN for speech enhancement
Language: Python
337 9 4760
sihyun-yu/PVDM
Official PyTorch implementation of Video Probabilistic Diffusion Models in Projected Latent Space (CVPR 2023).
Language: Python
314 13 3716
NVIDIA/CleanUNet
Official PyTorch Implementation of CleanUNet (ICASSP 2022)
Language: Python
302 11 051
neillu23/CDiffuSE
Conditional Diffusion Probabilistic Model for Speech Enhancement
Language: Python
224 8 1435
sp-uhh/storm
StoRM: A Diffusion-based Stochastic Regeneration Model for Speech Enhancement and Dereverberation
Language: Python
202 11 2426
x4nth055/gender-recognition-by-voice
Building a Deep learning model that predicts the gender of a speaker using TensorFlow 2
Language: Python
115 7 542
RoySheffer/im2wav
Official implementation of the pipeline presented in "I Hear Your True Colors: Image Guided Audio Generation"
Language: Python
110 3 1310
facebookresearch/EasyComDataset
The Easy Communications (EasyCom) dataset is a world-first dataset designed to help mitigate the *cocktail party effect* from an augmented-reality (AR)-motivated, multi-sensor egocentric world view.
107 10 77
YUCHEN005/NASE
Code for paper "Noise-aware Speech Enhancement using Diffusion Probabilistic Model"
Language: Python
77 3 62
ahaliassos/raven
Official implementation of RAVEn (ICLR 2023) and BRAVEn (ICASSP 2024)
Language: Python
60 9 95
sukun1045/video-physics-sound-diffusion
Language: Python
45 2 63
sp-uhh/sgmse-bbed
TODO
Language: Python
37 2 28
YangangCao/Causal-U-Net
Unofficial PyTorch implementation of "A Causal U-net based Neural Beamforming Network for Real-Time Multi-Channel Speech Enhancement"
Language: Python
33 3 27
hmartelb/avlit
Official source code of the INTERSPEECH 2023 paper: "Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Model" (AVLIT)
Language: Python
19 2 61
taketakeseijin/HarmonicLowering
Implementation of Harmonic Convolution by Harmonic Lowering
Language: Python
17 1 32
sp-uhh/guided-vae-nmf
This is the repository of the paper
Language: Jupyter Notebook
8 3 00
