Pinned Repositories
ailia-models
The collection of pre-trained, state-of-the-art AI models for ailia SDK
all-in-one
All-In-One Music Structure Analyzer
audio_tagging_onnx
Easy-to-use audio tagging in ONNX
bark
🔊 Text-Prompted Generative Audio Model
beat_tracker
Beat tracker assignment for Music Informatics
DPCRN
Real-time speech enhancement
DTLN
Real-time speech enhancement
filtfilt
Voice blur with filtfilt (forward-pass and backward-pass filtering)
Sentiment-classification
LSTM sentiment classification
speech-music-detection
TensorFlow for the speech-music detection task, accuracy 96%+
zqlsnr's Repositories
zqlsnr/DPCRN
Real-time speech enhancement
zqlsnr/DTLN
Real-time speech enhancement
zqlsnr/speech-music-detection
TensorFlow for the speech-music detection task, accuracy 96%+
zqlsnr/audio_tagging_onnx
Easy-to-use audio tagging in ONNX
zqlsnr/filtfilt
Voice blur with filtfilt (forward-pass and backward-pass filtering)
zqlsnr/ailia-models
The collection of pre-trained, state-of-the-art AI models for ailia SDK
zqlsnr/all-in-one
All-In-One Music Structure Analyzer
zqlsnr/bark
🔊 Text-Prompted Generative Audio Model
zqlsnr/beat_tracker
Beat tracker assignment for Music Informatics
zqlsnr/Bert-VITS2
VITS2 backbone with BERT
zqlsnr/EmotiVoice
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
zqlsnr/GPT-SoVITS
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
zqlsnr/MetaGPT
🌟 The Multi-Agent Framework: Given one line Requirement, return PRD, Design, Tasks, Repo
zqlsnr/OpenVoice
Instant voice cloning by MyShell
zqlsnr/RepCodec
Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization
zqlsnr/Sentiment-classification
LSTM sentiment classification
zqlsnr/shared_debugging_code
zqlsnr/soundstorm-speechtokenizer
Implementation of SoundStorm built upon SpeechTokenizer.
zqlsnr/SpeechTokenizer
This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples are presented on
zqlsnr/BrushNet
[ECCV 2024] The official implementation of paper "BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion"
zqlsnr/CED_audiotagging
Source code for Consistent ensemble distillation for audio tagging
zqlsnr/chorus-detection
A machine learning project for automated chorus detection in songs, featuring a command-line interface (CLI) tool that allows users to input a YouTube link and utilize a pre-trained CRNN model to detect chorus sections from a song on YouTube
zqlsnr/FoleyCrafter
FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds. An AI Foley master that adds vivid, synchronized sound effects to your silent videos 😝
zqlsnr/FunCodec
FunCodec is a research-oriented toolkit for audio quantization and downstream applications, such as text-to-speech synthesis, music generation, etc.
zqlsnr/grok-1
Grok open release
zqlsnr/gtcrn
The official implementation of GTCRN, an ultra-lite speech enhancement model.
zqlsnr/Kolors-TensorRT-libtorch
Kolors with TensorRT and libtorch
zqlsnr/MoneyPrinter
Automate Creation of YouTube Shorts using MoviePy.
zqlsnr/PaddleVideo
Awesome video understanding toolkits based on PaddlePaddle. It supports video data annotation tools, lightweight RGB and skeleton-based action recognition models, and practical applications for video tagging and sport action detection.
zqlsnr/TVSM-dataset