LuffyGT's Stars
fishaudio/fish-speech
Brand new TTS solution
open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
modelscope/DiffSynth-Studio
Enjoy the magic of Diffusion models!
chyok/ten-drops
A ten drops game written in pygame-ce, sourced from the Flask game "Splash Back".
yeyupiaoling/Whisper-Finetune
Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data. Accelerate inference and support Web deployment, Windows desktop deployment, and Android deployment
SpeechColab/Leaderboard
SpeechIO Leaderboard: a large, robust, comprehensive, benchmarking platform for Automatic Speech Recognition.
NextAudioGen/ultimatevocalremover_api
API for a Vocal Remover that uses Deep Neural Networks.
Anjok07/ultimatevocalremovergui
GUI for a Vocal Remover that uses Deep Neural Networks.
k2-fsa/k2
FSA/FST algorithms, differentiable, with PyTorch compatibility.
k2-fsa/sherpa-onnx
Speech-to-text, text-to-speech, speaker recognition, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter, Object Pascal, Lazarus, Rust
RetroCirce/HTS-Audio-Transformer
The official code repo of "HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection"
1c7/chinese-independent-developer
👩🏿💻👨🏾💻👩🏼💻👨🏽💻👩🏻💻**独立开发者项目列表 -- 分享大家都在做什么
modelscope/kws-training-suite
coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
yt-dlp/yt-dlp
A feature-rich command-line audio/video downloader
aldragan0/voice-recognition
Voice-based gender, age and language recognition.
modelscope/FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
mikezzb/song-recognition
Full-stack song recognition application with audio fingerprinting and hum to search (QbSH) modules
ypwhs/CreativeChatGLM
👋 欢迎来到 ChatGLM 创意世界!你可以使用修订和续写的功能来生成创意内容!
mymusise/ChatGLM-Tuning
基于ChatGLM-6B + LoRA的Fintune方案
babysor/MockingBird
🚀AI拟声: 5秒内克隆您的声音并生成任意语音内容 Clone a voice in 5 seconds to generate arbitrary speech in real-time
huggingface/pytorch-image-models
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
openai/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
zhuzilin/whisper-openvino
openvino version of openai/whisper
PaddlePaddle/PaddleSpeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
wenet-e2e/wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit
espnet/espnet
End-to-End Speech Processing Toolkit
shibing624/pycorrector
pycorrector is a toolkit for text error correction. 文本纠错,实现了Kenlm,T5,MacBERT,ChatGLM3,LLaMA等模型应用在纠错场景,开箱即用。
LuffyGT/python-spider
:rainbow:Python3网络爬虫实战:淘宝、京东、网易云、B站、12306、抖音、笔趣阁、漫画小说下载、音乐电影下载等
Jack-Cherish/python-spider
:rainbow:Python3网络爬虫实战:淘宝、京东、网易云、B站、12306、抖音、笔趣阁、漫画小说下载、音乐电影下载等