yifeisu

yifeisu's Stars

OpenDriveLab/AgiBot-World
World's First Large-scale High-quality Robotic Manipulation Benchmark
Language: Python, 1.1k stars, 76 forks
Robot-VLAs/RoboVLMs
Language:Python2206
CleanDiffuserTeam/CleanDiffuser
CleanDiffuser: An Easy-to-use Modularized Library for Diffusion Models in Decision Making
Language:Jupyter Notebook46339
RchalYang/harmonic_mobile_manipulation
Language:Python6
microsoft/CogACT
A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation
Language:Python1367
ARISE-Initiative/robosuite
robosuite: A Modular Simulation Framework and Benchmark for Robot Learning
Language: Python, 1.4k stars, 446 forks
agilexrobotics/RoboTwin
RoboTwin: Dual-Arm Robot Benchmark with Generative Digital Twins
8
TencentARC/Moto
Latent Motion Token as the Bridging Language for Robot Manipulation
Language:Python621
ChaofanTao/Autoregressive-Models-in-Vision-Survey
A paper collection for autoregressive models in vision.
35412
wenqsun/DimensionX
DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion
Language: Python, 1.1k stars, 66 forks
OpenGVLab/InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.
Language: Python, 6.7k stars, 521 forks
donydchen/mvsplat360
🎞️ [NeurIPS'24] MVSplat360: Feed-Forward 360 Scene Synthesis from Sparse Views
Language:Python1996
microsoft/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Language: Python, 36.2k stars, 4.2k forks
OpenGVLab/Ask-Anything
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
Language: Python, 3.1k stars, 256 forks
NVlabs/VILA
VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.
Language: Python, 2.7k stars, 214 forks
tdurieux/anonymous_github
Anonymous Github is a proxy server to support anonymous browsing of Github repositories for open-science code and data.
Language: TypeScript, 1.5k stars, 58 forks
cshizhe/HM3DAutoVLN
Official implementation of Learning from Unlabeled 3D Environments for Vision-and-Language Navigation (ECCV'22).
Language:Python382
Genesis-Embodied-AI/RoboGen
A generative and self-guided robotic agent that endlessly proposes and masters new skills.
Language:Python84676
VinAIResearch/Open3DIS
Open3DIS: Open-vocabulary 3D Instance Segmentation with 2D Mask Guidance (CVPR 2024)
Language:Python863
C-water/SDPL_release
SDPL: Shifting-Dense Partition Learning for UAV-view Geo-localization
Language:Python12
zju3dv/DiffPano
[NeurIPS2024] DiffPano: Scalable and Consistent Text to Panorama Generation with Spherical Epipolar-Aware Diffusion
25
frank-xwang/UnSAM
[NeurIPS 2024] Code release for "Segment Anything without Supervision"
Language:Jupyter Notebook43928
HKUST-Aerial-Robotics/VINS-Mono
A Robust and Versatile Monocular Visual-Inertial State Estimator
Language: C++, 5.1k stars, 2.1k forks
haosulab/ManiSkill
SAPIEN Manipulation Skill Framework, an open-source GPU-parallelized robotics simulator and benchmark, led by Hillbot, Inc.
Language: Python, 1.1k stars, 197 forks
tonyzhaozh/act
Language:Python867200
chernyadev/bigym
Demo-Driven Mobile Bi-Manual Manipulation Benchmark.
Language:Python13118
carlosferrazza/humanoid-bench
Language:Python42854
WhoKnowsssss/skill_transformer
Language:Python294
deepseek-ai/Janus
Janus-Series: Unified Multimodal Understanding and Generation Models
Language: Python, 1.3k stars, 67 forks
thu-ml/RoboticsDiffusionTransformer
RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation
Language:Python78371