kakadeguaidao's Stars
meta-llama/llama-recipes
Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supports a number of candidate inference solutions such as HF TGI and vLLM for local or cloud deployment. Demo apps to showcase Meta Llama3 for WhatsApp & Messenger.
FunAudioLLM/CosyVoice
Multilingual large voice generation model with full-stack inference, training, and deployment capabilities.
NaiboWang/EasySpider
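A visual no-code/code-free web crawler/spider. EasySpider (易采集): a visual tool for browser automation testing, data collection, and web crawling that lets you design and run crawler tasks graphically without writing code. Also known as ServiceWrapper, an intelligent service-wrapping system for web applications.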
descriptinc/audiotools
Object-oriented handling of audio data, with GPU-powered augmentations, and more.
microsoft/generative-ai-for-beginners
18 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/
Syllo/nvtop
GPU & Accelerator process monitoring for AMD, Apple, Huawei, Intel, NVIDIA and Qualcomm
modelscope/FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
amazon-science/chronos-forecasting
Chronos: Pretrained (Language) Models for Probabilistic Time Series Forecasting
lukas-blecher/LaTeX-OCR
pix2tex: Using a ViT to convert images of equations into LaTeX code.
ermongroup/cs228-notes
Course notes for CS228: Probabilistic Graphical Models.
heawon-yoon/anim-gaussian
Animatable Gaussian textured Avatar
mozillazg/pypinyin-g2pW
Improves the accuracy of pypinyin using g2pW.
practical-tutorials/project-based-learning
Curated list of project-based tutorials
google-research/timesfm
TimesFM (Time Series Foundation Model) is a pretrained time-series foundation model developed by Google Research for time-series forecasting.
MasayaKawamura/MB-iSTFT-VITS
Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform
open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
fishaudio/Bert-VITS2
vits2 backbone with multilingual-bert
ritheshkumar95/pytorch-vqvae
Vector Quantized VAEs - PyTorch Implementation
wenet-e2e/WeTextProcessing
Text Normalization & Inverse Text Normalization
Skylark0924/Machine-Learning-is-ALL-You-Need
🔥🌟 "Machine Learning 格物志" (Machine Learning study notes): basic ML + DL + RL code and notes using sklearn, PyTorch, TensorFlow, Keras and, most importantly, from scratch! 💪 This repository is ALL You Need!
ise-uiuc/magicoder
Magicoder: Source Code Is All You Need
huggingface/parler-tts
Inference and training library for high-quality TTS models.
dunky11/voicesmith
[WIP] VoiceSmith makes training text to speech models easy.
wwyuan2023/TextParser
TTS Chinese and English text analysis
archinetai/audio-ai-timeline
A timeline of the latest AI models for audio generation, starting in 2023!
TheAlgorithms/C-Plus-Plus
Collection of various algorithms in mathematics, machine learning, computer science and physics implemented in C++ for educational purposes.
lucidrains/vector-quantize-pytorch
Vector (and Scalar) Quantization, in Pytorch
wenet-e2e/speech-synthesis-paper
List of speech synthesis papers.
PlayVoice/whisper-vits-svc
Core Engine of Singing Voice Conversion & Singing Voice Clone
vinsis/understanding-neuralnetworks-pytorch
Understanding nuts and bolts of neural networks with PyTorch