kakadeguaidao's Stars

meta-llama/llama-recipes
Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supporting a number of candid inference solutions such as HF TGI, VLLM for local or cloud deployment. Demo apps to showcase Meta Llama3 for WhatsApp & Messenger.
Language: Jupyter Notebook 11k 1.6k
FunAudioLLM/CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
Language: Python 3k 284
NaiboWang/EasySpider
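A visual no-code/code-free web crawler/spider. EasySpider (易采集): a visual tool for browser automation testing, data collection, and web crawling that lets you design and run crawling tasks graphically without writing code. Alias: ServiceWrapper, an intelligent service-wrapping system for web applications.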
Language: JavaScript 30.8k 3.7k
descriptinc/audiotools
Object-oriented handling of audio data, with GPU-powered augmentations, and more.
Language: Python 20536
microsoft/generative-ai-for-beginners
18 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/
Language: Jupyter Notebook 57.7k 29.8k
Syllo/nvtop
GPU & Accelerator process monitoring for AMD, Apple, Huawei, Intel, NVIDIA and Qualcomm
Language: C 7.8k 288
modelscope/FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
Language: Python 5.2k 561
amazon-science/chronos-forecasting
Chronos: Pretrained (Language) Models for Probabilistic Time Series Forecasting
Language: Python 2.2k 253
lukas-blecher/LaTeX-OCR
pix2tex: Using a ViT to convert images of equations into LaTeX code.
Language: Python 11.5k 956
ermongroup/cs228-notes
Course notes for CS228: Probabilistic Graphical Models.
Language: SCSS 1.9k 467
heawon-yoon/anim-gaussian
Animatable Gaussian textured Avatar
Language: Python 311
mozillazg/pypinyin-g2pW
Improves the accuracy of pypinyin using g2pW.
Language: Python 707
practical-tutorials/project-based-learning
Curated list of project-based tutorials
188k 24.6k
google-research/timesfm
TimesFM (Time Series Foundation Model) is a pretrained time-series foundation model developed by Google Research for time-series forecasting.
Language: Python 3.3k 263
MasayaKawamura/MB-iSTFT-VITS
Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform
Language: Python 40465
open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
Language: Python 4.3k 370
fishaudio/Bert-VITS2
vits2 backbone with multilingual-bert
Language: Python 7.6k 1.1k
ritheshkumar95/pytorch-vqvae
Vector Quantized VAEs - PyTorch Implementation
Language: Python 812133
wenet-e2e/WeTextProcessing
Text Normalization & Inverse Text Normalization
Language: Python 43164
Skylark0924/Machine-Learning-is-ALL-You-Need
🔥🌟 《Machine Learning 格物志》 ("Machine Learning: Notes on the Investigation of Things"): basic ML + DL + RL code and notes using sklearn, PyTorch, TensorFlow, Keras and, most importantly, from scratch! 💪 This repository is ALL You Need!
Language: Python 38289
ise-uiuc/magicoder
Magicoder: Source Code Is All You Need
Language: Python 1.9k 167
huggingface/parler-tts
Inference and training library for high-quality TTS models.
Language: Python 2.9k 300
dunky11/voicesmith
[WIP] VoiceSmith makes training text to speech models easy.
Language: Python 21732
wwyuan2023/TextParser
TTS Chinese and English text analysis
Language: Python 102
archinetai/audio-ai-timeline
A timeline of the latest AI models for audio generation, starting in 2023!
1.9k 67
TheAlgorithms/C-Plus-Plus
Collection of various algorithms in mathematics, machine learning, computer science and physics implemented in C++ for educational purposes.
Language: C++ 29.7k 7k
lucidrains/vector-quantize-pytorch
Vector (and Scalar) Quantization, in Pytorch
Language: Python 2.2k 183
wenet-e2e/speech-synthesis-paper
List of speech synthesis papers.
977120
PlayVoice/whisper-vits-svc
Core Engine of Singing Voice Conversion & Singing Voice Clone
Language: Python 2.6k 916
vinsis/understanding-neuralnetworks-pytorch
Understanding nuts and bolts of neural networks with PyTorch
333