zwong91's Stars
huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
comfyanonymous/ComfyUI
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
keras-team/keras
Deep Learning for humans
RVC-Boss/GPT-SoVITS
One minute of voice data is enough to train a good TTS model (few-shot voice cloning).
gradio-app/gradio
Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
myshell-ai/OpenVoice
Instant voice cloning by MIT and MyShell.
huggingface/diffusers
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.
qdrant/qdrant
Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/
unslothai/unsloth
Finetune Llama 3.3, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 70% less memory
sczhou/CodeFormer
[NeurIPS 2022] Towards Robust Blind Face Restoration with Codebook Lookup Transformer
meta-llama/llama-recipes
Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supporting a number of candid inference solutions such as HF TGI, VLLM for local or cloud deployment. Demo apps to showcase Meta Llama for WhatsApp & Messenger.
KwaiVGI/LivePortrait
Bring portraits to life!
cvat-ai/cvat
Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.
QwenLM/Qwen2.5
Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.
FunAudioLLM/CosyVoice
Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
Vaibhavs10/insanely-fast-whisper
modelscope/FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
Ucas-HaoranWei/GOT-OCR2.0
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
THUDM/GLM-4
GLM-4 series: Open Multilingual Multimodal Chat LMs
apple/coremltools
Core ML tools contain supporting tools for Core ML model conversion, editing, and validation.
FunAudioLLM/SenseVoice
Multilingual Voice Understanding Model
Snowiiii/Pumpkin
Empowering everyone to host fast and efficient Minecraft servers.
Stability-AI/sd3.5
langgenius/dify-sandbox
A lightweight, fast, and secure code execution environment that supports multiple programming languages
thushv89/attention_keras
Keras Layer implementation of Attention for Sequential models
ultralytics/google-images-download
Google/Bing Images Web Downloader
BorisPolonsky/dify-helm
Deploy langgenius/dify, an LLM-based app, on Kubernetes with a Helm chart
langgenius/webapp-text-generator
tyrchen/ava-bot
A simple LLM bot that acts as an assistant