oryosu's Stars
ufal/whisper_streaming
Whisper realtime streaming for long speech-to-text transcription and translation
yxlllc/DDSP-SVC
Real-time end-to-end singing voice conversion system based on DDSP (Differentiable Digital Signal Processing)
bshall/knn-vc
Voice Conversion With Just Nearest Neighbors
mmorise/World
A high-quality speech analysis, manipulation and synthesis system
google/oboe
Oboe is a C++ library that makes it easy to build high-performance audio apps on Android.
RustAudio/cpal
Cross-platform audio I/O library in pure Rust
gemelo-ai/vocos
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis
sarulab-speech/UTMOS22
UT-Sarulab MOS prediction system using SSL models
facebookresearch/audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
llvm/torch-mlir
The Torch-MLIR project aims to provide first class support from the PyTorch ecosystem to the MLIR ecosystem.
shiguredo/sora-unity-sdk
WebRTC SFU Sora Unity SDK
astral-sh/rye
A Hassle-Free Python Experience
xiph/opus
Modern audio compression for the internet.
PlayVoice/lora-svc
Singing voice conversion based on Whisper, and LoRA for singing voice cloning
WangHelin1997/MaskSpec
The PyTorch implementation of the paper "Masked Spectrogram Prediction For Self-Supervised Audio Pre-Training"
MWM-io/SpecTNT-pytorch
Unofficial implementation of SpecTNT in PyTorch
rkmt/summarize_arxv
facebookresearch/ImageBind
ImageBind: One Embedding Space to Bind Them All
Vaibhavs10/fast-whisper-finetuning
solidiquis/erdtree
A modern, cross-platform, multi-threaded, and general purpose filesystem and disk-usage utility that is aware of .gitignore and hidden file rules.
AndreyGuzhov/AudioCLIP
Source code for models described in the paper "AudioCLIP: Extending CLIP to Image, Text and Audio" (https://arxiv.org/abs/2106.13043)
suno-ai/bark
🔊 Text-Prompted Generative Audio Model
google/clasp
🔗 Command Line Apps Script Projects
archinetai/audio-ai-timeline
A timeline of the latest AI models for audio generation, starting in 2023!
RVC-Project/Retrieval-based-Voice-Conversion-WebUI
Easily train a good VC model with voice data <= 10 mins!
kamepong/ConvS2S-VC
ggerganov/whisper.cpp
Port of OpenAI's Whisper model in C/C++
archinetai/audio-diffusion-pytorch
Audio generation using diffusion models, in PyTorch.
nnsvs/nnsvs
Neural network-based singing voice synthesis library for research
markowanga/stweet
Advanced Python library to scrape Twitter (tweets, users) via the unofficial API