Pinned Repositories
3D-sound-panorama
Panning a sound object in 3D using HRTF
3dti_AudioToolkit
3D Tune-In Toolkit is a custom open-source C++ library developed within the EU-funded project 3D Tune-In. The Toolkit provides a high level of realism and immersiveness within binaural 3D audio simulations, while allowing for the emulation of hearing aid devices and of different typologies of hearing loss.
A-Convolutional-Recurrent-Neural-Network-for-Real-Time-Speech-Enhancement
A PyTorch implementation of "A Convolutional Recurrent Neural Network for Real-Time Speech Enhancement".
acoustic-model
Acoustic models for: A Comparison of Discrete and Soft Speech Units for Improved Voice Conversion
AEC-Challenge
AEC Challenge
agc
Provides automatic gain control to normalize power levels for real or complex signals
microphoneArray
open-unmix-tensorflow
Open-Unmix: music source separation for TensorFlow
pianotrans
Simple GUI for ByteDance's Piano Transcription with Pedals
PitchNet
An unofficial implementation of the paper titled "PitchNet: Unsupervised Singing Voice Conversion with Pitch Adversarial Network".
xiaozhuo12138's Repositories
xiaozhuo12138/Applio
VITS-based Voice Conversion focused on simplicity, quality and performance.
xiaozhuo12138/AudioLDM2
Text-to-Audio/Music Generation
xiaozhuo12138/audiolm-pytorch
Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in PyTorch
xiaozhuo12138/BABE2
xiaozhuo12138/beat_this
Accurate and general beat tracker
xiaozhuo12138/beatrice-trainer-colab
xiaozhuo12138/ccmusic-database.github.io
This is a multi-functional music data sharing platform for academic research. It contains a variety of music data, such as sound recordings of traditional Chinese musical instruments and annotations for Chinese pop music, all of which are freely available to MIR researchers.
xiaozhuo12138/ChordSync
Code for ChordSync, a conformer-based audio-to-chord synchroniser
xiaozhuo12138/CoMoSVC
CoMoSVC: One-Step Consistency Model Based Singing Voice Conversion & Singing Voice Clone
xiaozhuo12138/DelayCat
DelayCat: feature-based delay line audio plugin
xiaozhuo12138/dry_sing_multi_eval
Five-dimensional a cappella singing evaluation system based on FunASR, covering pronunciation, pitch accuracy, rhythm, fluency, and emotion.
xiaozhuo12138/FCPE
xiaozhuo12138/FreeV
[InterSpeech 24] FreeV: Free Lunch For Vocoders Through Pseudo Inversed Mel Filter
xiaozhuo12138/FSPEN
xiaozhuo12138/FxNorm-automix
FxNorm-Automix: implementation of automatic music mixing systems. We show how wet music data can be repurposed to train a fully automatic mixing system.
xiaozhuo12138/GPT-SoVITS
As little as 1 minute of voice data can be used to train a good TTS model (few-shot voice cloning).
xiaozhuo12138/grok-1
Grok open release
xiaozhuo12138/gtcrn
The official implementation of GTCRN, an ultra-lite speech enhancement model.
xiaozhuo12138/hilcodec
xiaozhuo12138/icefall
xiaozhuo12138/mustango
Mustango: Toward Controllable Text-to-Music Generation
xiaozhuo12138/muzic
Muzic: Music Understanding and Generation with Artificial Intelligence
xiaozhuo12138/Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
xiaozhuo12138/openWakeWord
An open-source audio wake word (or phrase) detection framework with a focus on performance and simplicity.
xiaozhuo12138/POPDG
Data and PopDanceSet are coming soon.
xiaozhuo12138/sherpa-onnx
Speech-to-text, text-to-speech, speaker recognition, and VAD using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter, Object Pascal, Lazarus, Rust
xiaozhuo12138/stream-vc
An unofficial PyTorch implementation of StreamVC (Real-Time Low-Latency Voice Conversion)
xiaozhuo12138/StreamVC
An unofficial PyTorch implementation of "STREAMVC: REAL-TIME LOW-LATENCY VOICE CONVERSION".
xiaozhuo12138/tinyvc
Lightweight voice conversion
xiaozhuo12138/XNNPACK
High-efficiency floating-point neural network inference operators for mobile, server, and Web