XiangBaoSong

XiangBaoSong's Stars

meta-llama/llama
Inference code for Llama models
Language: Python 57.6k 530 1.1k9.7k
facebookresearch/segment-anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Language: Jupyter Notebook 48.9k 315 6865.8k
zhayujie/chatgpt-on-wechat
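A chatbot built on large models that supports integration with WeChat Official Accounts, WeCom apps, Feishu, DingTalk, and more. Selectable models include GPT3.5/GPT-4o/GPT-o1/DeepSeek/Claude/ERNIE Bot/iFlytek Spark/Tongyi Qianwen/Gemini/GLM-4/Kimi/LinkAI. It can process text, voice, and images, access the operating system and the internet, and supports customized enterprise customer-service bots built on a private knowledge base.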
Language: Python 34.7k 258 1.9k8.8k
hpcaitech/Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
Language: Python 23.3k 193 5382.3k
HKUDS/LightRAG
"LightRAG: Simple and Fast Retrieval-Augmented Generation"
Language: Python 11.9k 106 5081.7k
huggingface/lerobot
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
Language: Python 9.1k 91 202971
Tencent/HunyuanVideo
HunyuanVideo: A Systematic Framework For Large Video Generation Model
Language: Python 8.5k 109 187690
jzhang38/TinyLlama
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
Language: Python 8.2k 110 159505
LargeWorldModel/LWM
Large World Model -- Modeling Text and Video with Millions Context
Language: Python 7.2k 66 72556
gaomingqi/Track-Anything
Track-Anything is a flexible and interactive tool for video object tracking and segmentation, based on Segment Anything, XMem, and E2FGVI.
Language: Python 6.6k 62 140486
QwenLM/Qwen-VL
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
Language: Python 5.5k 49 462418
showlab/Tune-A-Video
[ICCV 2023] Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
Language: Python 4.3k 49 97388
ShoufaChen/DiffusionDet
[ICCV2023 Best Paper Finalist] PyTorch implementation of DiffusionDet (https://arxiv.org/abs/2211.09788)
Language: Python 2.1k 17 115163
hkchengrex/XMem
[ECCV 2022] XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model
Language: Python 1.8k 20 137197
google-deepmind/open_x_embodiment
Language: Jupyter Notebook 1k 22 8072
facebookresearch/home-robot
Mobile manipulation research tools for roboticists
Language: Python 995 31 170134
rhymes-ai/Aria
Codebase for Aria - an Open Multimodal Native MoE
Language: Jupyter Notebook 993 20 4783
robodhruv/visualnav-transformer
Official code and checkpoint release for mobile robot foundation models: GNM, ViNT, and NoMaD.
Language: Python 719 34 5591
vlmaps/vlmaps
[ICRA2023] Implementation of Visual Language Maps for Robot Navigation
Language: Python 430 11 6555
leostrong8/openai-fill-billing
OpenAI account top-up guide
284 2 315
yfeng95/SCARF
Language: Python 256 16 1714
ir413/mvp
Masked Visual Pre-training for Robotics
Language: Python 228 6 1726
bcmi/SLBR-Visible-Watermark-Removal
[ACM MM 2021] Visible Watermark Removal via Self-calibrated Localization and Background Refinement
Language: Python 226 10 4136
OrigamiDream/gato
Unofficial Gato: A Generalist Agent
Language: Python 210 14 430
heyuanYao-pku/Control-VAE
Language: C++ 165 8 1314
benquick123/C-VTON
C-VTON: Context-Driven Image-Based Virtual Try-On Network
Language: Python 150 5 1932
siddhanthaldar/BAKU
Code for BAKU: An Efficient Transformer for Multi-Task Policy Learning
Language: Python 83 3 129
ManifoldRG/NEKO
In-progress implementation of a GATO-style generalist multimodal model capable of image, text, RL, and robotics tasks
Language: Python 48 5 6311
YushuoLi/Gato-A-Generalist-Agent
Minimal code for A Generalist Agent
Language: Python 38 2 16
sfd158/SimAndViewCharacter
3 1 01