yifeisu's Stars
OpenDriveLab/AgiBot-World
World's First Large-scale High-quality Robotic Manipulation Benchmark
Robot-VLAs/RoboVLMs
CleanDiffuserTeam/CleanDiffuser
CleanDiffuser: An Easy-to-use Modularized Library for Diffusion Models in Decision Making
RchalYang/harmonic_mobile_manipulation
microsoft/CogACT
A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation
ARISE-Initiative/robosuite
robosuite: A Modular Simulation Framework and Benchmark for Robot Learning
agilexrobotics/RoboTwin
RoboTwin: Dual-Arm Robot Benchmark with Generative Digital Twins
TencentARC/Moto
Latent Motion Token as the Bridging Language for Robot Manipulation
ChaofanTao/Autoregressive-Models-in-Vision-Survey
A collection of papers on autoregressive models in vision.
wenqsun/DimensionX
DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion
OpenGVLab/InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model with performance approaching GPT-4o.
donydchen/mvsplat360
🎞️ [NeurIPS'24] MVSplat360: Feed-Forward 360 Scene Synthesis from Sparse Views
microsoft/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
OpenGVLab/Ask-Anything
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
NVlabs/VILA
VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.
tdurieux/anonymous_github
Anonymous Github is a proxy server to support anonymous browsing of Github repositories for open-science code and data.
cshizhe/HM3DAutoVLN
Official implementation of Learning from Unlabeled 3D Environments for Vision-and-Language Navigation (ECCV'22).
Genesis-Embodied-AI/RoboGen
A generative and self-guided robotic agent that endlessly proposes and masters new skills.
VinAIResearch/Open3DIS
Open3DIS: Open-vocabulary 3D Instance Segmentation with 2D Mask Guidance (CVPR 2024)
C-water/SDPL_release
SDPL: Shifting-Dense Partition Learning for UAV-view Geo-localization
zju3dv/DiffPano
[NeurIPS2024] DiffPano: Scalable and Consistent Text to Panorama Generation with Spherical Epipolar-Aware Diffusion
frank-xwang/UnSAM
[NeurIPS 2024] Code release for "Segment Anything without Supervision"
HKUST-Aerial-Robotics/VINS-Mono
A Robust and Versatile Monocular Visual-Inertial State Estimator
haosulab/ManiSkill
SAPIEN Manipulation Skill Framework, an open-source GPU-parallelized robotics simulator and benchmark, led by Hillbot, Inc.
tonyzhaozh/act
chernyadev/bigym
Demo-Driven Mobile Bi-Manual Manipulation Benchmark.
carlosferrazza/humanoid-bench
WhoKnowsssss/skill_transformer
deepseek-ai/Janus
Janus-Series: Unified Multimodal Understanding and Generation Models
thu-ml/RoboticsDiffusionTransformer
RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation