JeffSoong's Stars
RVC-Boss/GPT-SoVITS
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
modelscope/facechain
FaceChain is a deep-learning toolchain for generating your Digital-Twin.
LargeWorldModel/LWM
OpenBMB/MiniCPM
MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.
QwenLM/Qwen-VL
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
allenai/OLMo
Modeling, training, eval, and inference code for OLMo
jamesmh/coravel
Near-zero config .NET library that makes advanced application features like Task Scheduling, Caching, Queuing, Event Broadcasting, and more a breeze!
cocopon/tweakpane
:control_knobs: Compact GUI for fine-tuning parameters and monitoring value changes
QwenLM/Qwen-Agent
Agent framework and applications built upon Qwen>=2.0, featuring Function Calling, Code Interpreter, RAG, and Chrome extension.
hustvl/Vim
[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
X-PLUG/MobileAgent
Mobile-Agent: The Powerful Mobile Device Operation Assistant Family
modelscope/swift
ms-swift: Use PEFT or Full-parameter to finetune 300+ LLMs or 50+ MLLMs. (Qwen2, GLM4v, Internlm2.5, Yi, Llama3.1, Llava-Video, Internvl2, MiniCPM-V-2.6, Deepseek, Baichuan2, Gemma2, Phi3-Vision, ...)
modelscope/modelscope-agent
ModelScope-Agent: An agent framework connecting models in ModelScope with the world
PKU-YuanGroup/MoE-LLaVA
Mixture-of-Experts for Large Vision-Language Models
QwenLM/Qwen-Audio
The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.
NVlabs/FoundationPose
[CVPR 2024 Highlight] FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects
apple/ml-neuman
Official repository of NeuMan: Neural Human Radiance Field from a Single Video (ECCV 2022)
yyyujintang/Awesome-Mamba-Papers
Awesome Papers related to Mamba.
deepseek-ai/DeepSeek-MoE
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
lizhe00/AnimatableGaussians
Code of [CVPR 2024] "Animatable Gaussians: Learning Pose-dependent Gaussian Maps for High-fidelity Human Avatar Modeling"
OpenGVLab/VideoMamba
[ECCV2024] VideoMamba: State Space Model for Efficient Video Understanding
roboterax/humanoid-gym
Humanoid-Gym: Reinforcement Learning for Humanoid Robot with Zero-Shot Sim2Real Transfer https://arxiv.org/abs/2404.05695
ShenhanQian/GaussianAvatars
[CVPR 2024 Highlight] The official repo for "GaussianAvatars: Photorealistic Head Avatars with Rigged 3D Gaussians"
nomic-ai/contrastors
Train Models Contrastively in Pytorch
OpenGVLab/Vision-RWKV
Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures
niuzaisheng/ScreenAgent
ScreenAgent: A Computer Control Agent Driven by Visual Language Large Model (IJCAI-24)
JeffWang987/WorldDreamer
WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens
NationalGAILab/HoT
[CVPR 2024 🔥] Official implementation of the paper "⏳ Hourglass Tokenizer for Efficient Transformer-Based 3D Human Pose Estimation"
ZhengyiLuo/SMPLSim
Simulating SMPL humanoid, supporting PHC/PHC-MJX/PULSE/SimXR code bases.
OpenGVLab/Hulk
An official implementation of "Hulk: A Universal Knowledge Translator for Human-Centric Tasks"