michael-kuhlmann

PhD Student at Paderborn University
voice conversion, speech synthesis, voice profiling

Paderborn University, Paderborn

michael-kuhlmann's Stars

archinetai/a-unet
A toolbox that provides hackable building blocks for generic 1D/2D/3D UNets, in PyTorch.
Language: Python · 749
archinetai/audio-diffusion-pytorch
Audio generation using diffusion models, in PyTorch.
Language: Python · 1.9k · 163
NVIDIA/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Language: Python · 11k · 2.3k
audiolabs/webMUSHRA
a MUSHRA compliant web audio API based experiment software
Language: JavaScript · 329129
microsoft/P.808
This is an open-source implementation of the ITU P.808 standard for "Subjective evaluation of speech quality with a crowdsourcing approach" (see https://www.itu.int/rec/T-REC-P.808/en). It uses Amazon Mechanical Turk as the crowdsourcing platform. It includes implementations for Absolute Category Rating (ACR), Degradation Category Rating (DCR), and Comparison Category Rating (CCR).
Language: HTML · 20058
s3prl/s3prl
Self-Supervised Speech Pre-training and Representation Learning Toolkit
Language: Python · 2.2k · 479
auspicious3000/contentvec
speech self-supervised representations
Language: Python · 43734
Alexander-H-Liu/dinosr
DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning
Language: Python · 434
RameenAbdal/StyleFlow
StyleFlow: Attribute-conditioned Exploration of StyleGAN-generated Images using Conditional Continuous Normalizing Flows (ACM TOG 2021)
Language: Python · 2.4k · 344
facebookresearch/voxpopuli
A large-scale multilingual speech corpus for representation learning, semi-supervised learning and interpretation
Language: Python · 50150
phizaz/diffae
Official implementation of Diffusion Autoencoders
Language: Jupyter Notebook · 819123
openai/guided-diffusion
Language: Python · 5.9k · 782
DiffEqML/torchdyn
A PyTorch library entirely dedicated to neural differential equations, implicit models and related numerical methods
Language: Jupyter Notebook · 1.3k · 125
stefanwebb/flowtorch
This library would form a permanent home for reusable components for deep probabilistic programming. The library would form and harness a community of users and contributors by focusing initially on complete infra and documentation for how to use and create components.
Language: Jupyter Notebook · 71
suno-ai/bark
🔊 Text-Prompted Generative Audio Model
Language: Jupyter Notebook · 33.9k · 4k
bshall/knn-vc
Voice Conversion With Just Nearest Neighbors
Language: Python · 43164
huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
Language: Python · 130k · 25.7k
huggingface/diffusers
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
Language: Python · 24.2k · 5k
fgnt/meeteval
MeetEval - A meeting transcription evaluation toolkit
Language: Python · 6913
nvidia-riva/riva-asrlib-decoder
Standalone implementation of the CUDA-accelerated WFST Decoder available in Riva
Language: Python · 7823
kan-bayashi/LibriTTSLabel
Alignment files of LibriTTS.
557
facebookresearch/encodec
State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.
Language: Python · 3.3k · 299
bootphon/phonemizer
Simple text to phones converter for multiple languages
Language: Python · 1.2k · 163
dmort27/panphon
Python package and data files for manipulating phonological segments (phones, phonemes) in terms of universal phonological features.
Language: Python · 20841
lingjzhu/CharsiuG2P
Multilingual G2P in 100 languages
Language: Jupyter Notebook · 26825
jaywalnut310/vits
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
Language: Python · 6.5k · 1.2k
xinjli/allosaurus
Allosaurus is a pretrained universal phone recognizer for more than 2000 languages
Language: Python · 53285
IDRnD/VoxTube
The VoxTube dataset official repository
Language: HTML · 571
cvqluu/Angular-Penalty-Softmax-Losses-Pytorch
Angular penalty loss functions in Pytorch (ArcFace, SphereFace, Additive Margin, CosFace)
Language: Python · 47892
google/gin-config
Gin provides a lightweight configuration framework for Python
Language: Python · 2k · 120
