josephsuccar's Stars
xir4n/dtcwt
Python port of the Dual-Tree Complex Wavelet Transform toolbox for MATLAB
HigherOrderCO/Bend
A massively parallel, high-level programming language
andylolu2/simpleGEMM
The simplest fast implementation of matrix multiplication in CUDA.
Rikorose/DeepFilterNet
Noise suppression using deep filtering
mmathew23/improved_edm
Implementation of "Analyzing and Improving the Training Dynamics of Diffusion Models"
Rayhane-mamah/Efficient-VDVAE
Official PyTorch and JAX implementation of "Efficient-VDVAE: Less is more"
mosaicml/diffusion
atong01/conditional-flow-matching
TorchCFM: a Conditional Flow Matching library
facebookresearch/audioseal
Localized watermarking for AI-generated speech audio, with state-of-the-art robustness and a very fast detector
eiz/SynchronousAudioRouter
Low latency application audio routing for Windows
DioxusLabs/dioxus
Fullstack app framework for web, desktop, mobile, and more.
RustAudio/cpal
Cross-platform audio I/O library in pure Rust
LukeMathWalker/zero-to-production
Code for "Zero To Production In Rust", a book on API development using Rust.
state-spaces/mamba
Mamba SSM architecture
alesaccoia/VoiceStreamAI
Near-real-time audio transcription using self-hosted Whisper and WebSocket in Python/JS
open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
huggingface/distil-whisper
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
bytedance/SALMONN
SALMONN: Speech Audio Language Music Open Neural Network
descriptinc/descript-audio-codec
State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.
jeanfeydy/geomloss
Geometric loss functions between point clouds, images and volumes
google/lyra
A Very Low-Bitrate Codec for Speech Compression
francois-rozet/piqa
PyTorch Image Quality Assessment package
snakers4/silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
libAudioFlux/audioFlux
A library for audio and music analysis and feature extraction.
google-research/tuning_playbook
A playbook for systematically maximizing the performance of deep learning models.
zademn/mnist-mlops-learning
In this project I played with MLflow, Streamlit, and FastAPI to create a training and prediction app for MNIST digits
openai/jukebox
Code for the paper "Jukebox: A Generative Model for Music"