wz0919's Stars
OpenGVLab/InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o's performance.
HVision-NKU/StoryDiffusion
Accepted as a Spotlight paper at NeurIPS 2024
lizhe00/AnimatableGaussians
Code of [CVPR 2024] "Animatable Gaussians: Learning Pose-dependent Gaussian Maps for High-fidelity Human Avatar Modeling"
OpenGVLab/VideoMamba
[ECCV2024] VideoMamba: State Space Model for Efficient Video Understanding
hitcslj/Awesome-AIGC-3D
A curated list of awesome AIGC 3D papers
OpenRobotLab/EmbodiedScan
[CVPR 2024 & NeurIPS 2024] EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
concept-graphs/concept-graphs
Official code release for ConceptGraphs
vlmaps/vlmaps
[ICRA2023] Implementation of Visual Language Maps for Robot Navigation
Vchitect/Vlogger
[CVPR2024] Make Your Dream A Vlog
HL-hanlin/Ctrl-Adapter
Official implementation of Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model (ICLR 2025 Oral)
OpenGVLab/InternVideo2
google-deepmind/perception_test
zd11024/NaviLLM
[CVPR 2024] The code for paper 'Towards Learning a Generalist Model for Embodied Navigation'
GengzeZhou/NavGPT-2
[ECCV 2024] Official implementation of NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models
UMass-Foundation-Model/MultiPLY
Code for MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World
DefaultRui/VLN-VER
[CVPR24] Volumetric Environment Representation for Vision-Language Navigation
MrZihan/GridMM
Official implementation of GridMM: Grid Memory Map for Vision-and-Language Navigation (ICCV'23).
CrystalSixone/VLN-GOAT
Repository for Vision-and-Language Navigation via Causal Learning (Accepted by CVPR 2024)
MrZihan/HNR-VLN
Official implementation of Lookahead Exploration with Neural Radiance Representation for Continuous Vision-Language Navigation (CVPR'24 Highlight).
OpenGVLab/EgoExoLearn
[CVPR 2024] Data and benchmark code for the EgoExoLearn dataset
JeremyLinky/YouTube-VLN
[ICCV'23] Learning Vision-and-Language Navigation from YouTube Videos
MrZihan/Sim2Real-VLN-3DFF
Official implementation of Sim-to-Real Transfer via 3D Feature Fields for Vision-and-Language Navigation (CoRL'24).
jaehong31/RACCooN
(arXiv.2405.18406) RACCooN: A Versatile Instructional Video Editing Framework with Auto-Generated Narratives
jialuli-luka/SELMA
Code and Data for Paper: SELMA: Learning and Merging Skill-Specific Text-to-Image Experts with Auto-Generated Data
OpenRobotLab/OVExp
OVExp: Open Vocabulary Exploration for Object-Oriented Navigation
vlc-robot/polarnet
[CoRL2023] Official PyTorch implementation of PolarNet: 3D Point Clouds for Language-Guided Robotic Manipulation
CrystalSixone/DSRG
Code for A Dual Semantic-Aware Recurrent Global-Adaptive Network For Vision-and-Language Navigation
iSEE-Laboratory/VLN-PRET
CrystalSixone/VLN-MAGIC
This is the official repository for MAGIC: Meta-Ability Guided Interactive Chain-of-Distillation Learning towards Efficient Vision-and-Language Navigation
Zhangzeyu97/CBD
Code for Strong and Controllable Blind Image Decomposition