Pinned Repositories
Embodied_AI_Paper_List
[Embodied-AI-Survey-2024] Paper list and projects for Embodied AI
Image-Retinex
Image Enhancement
MM-2021
[ACM Multimedia 2021] Spatiotemporal Inconsistency Learning for DeepFake Video Detection
multi-task-learning-example-PyTorch
SELFY
Official PyTorch Implementation of Learning Self-Similarity in Space and Time as Generalized Motion for Video Action Recognition, ICCV 2021
SimAM
The official PyTorch implementation of our ICML paper "SimAM: A Simple, Parameter-Free Attention Module for Convolutional Neural Networks".
STEGO
Unsupervised Semantic Segmentation by Distilling Feature Correspondences
TPN
[CVPR 2020] Temporal Pyramid Network for Action Recognition
uncertainties
AM207 project: dissect aleatoric and epistemic uncertainty
VCP
Unofficial implementation of "Video cloze procedure for self-supervised spatio-temporal learning" [AAAI20]
Holmes-GU's Repositories
Holmes-GU/VidMan
Code for "VidMan: Exploiting Intrinsic Dynamics from Video Diffusion Model for Effective Robot Manipulation"
Holmes-GU/arp
Autoregressive Policy for Robot Learning
Holmes-GU/CLOVER
[NeurIPS 2024] CLOVER: Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation
Holmes-GU/copa
Official implementation of CoPa: General Robotic Manipulation through Spatial Constraints of Parts with Foundation Models
Holmes-GU/DeeR-VLA
Official code of paper "DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution"
Holmes-GU/dynamo_ssl
DynaMo: In-Domain Dynamics Pretraining for Visuo-Motor Control
Holmes-GU/Emu
Emu Series: Generative Multimodal Models from BAAI
Holmes-GU/Emu3
Next-Token Prediction is All You Need
Holmes-GU/GPT4Video
Official code for GPT4Video: A Unified Multimodal Large Language Model for Instruction-Followed Understanding and Safety-Aware Generation
Holmes-GU/Guzhihao.cv
Personal resume.
Holmes-GU/hiveformer
Holmes-GU/HPT
Heterogeneous Pre-trained Transformer (HPT) is a scalable policy learner for robotics.
Holmes-GU/Keystate_Online_Imitation
The repo for "KOI: Accelerating Online Imitation Learning via Hybrid Key-state Guidance", CoRL 2024
Holmes-GU/language-table
Suite of human-collected datasets and a multi-task continuous control benchmark for open vocabulary visuolinguomotor learning.
Holmes-GU/LAPA
Holmes-GU/LLARVA
Holmes-GU/MaxMI
A Maximal Mutual Information Criterion for Manipulation Concept Discovery
Holmes-GU/MoDE_Diffusion_Policy
Code for "Efficient Diffusion Transformer Policies with Mixture of Expert Denoisers for Multitask Learning"
Holmes-GU/MS-Bot
The official repo for "Play to the Score: Stage-Guided Dynamic Multi-Sensory Fusion for Robotic Manipulation", CoRL 2024 (ORAL)
Holmes-GU/openr
OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models
Holmes-GU/PALI3
Implementation of PALI3 from the paper "PALI-3 VISION LANGUAGE MODELS: SMALLER, FASTER, STRONGER"
Holmes-GU/PIVOT-R
Holmes-GU/RACER
RACER: Rich Language-Guided Failure Recovery Policies for Imitation Learning
Holmes-GU/Robo_MUTUAL
The official implementation of "Robo-MUTUAL: Robotic Multimodal Task Specification via Unimodal Learning"
Holmes-GU/RT-X
PyTorch implementation of the models RT-1-X and RT-2-X from the paper "Open X-Embodiment: Robotic Learning Datasets and RT-X Models"
Holmes-GU/SDP
SDP
Holmes-GU/skill_transfer
Semantic-Geometric-Physical-Driven Robot Manipulation Skill Transfer via Skill Library and Tactile Representation
Holmes-GU/Thinking-Claude
Let your Claude think
Holmes-GU/transfusion-pytorch
PyTorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from MetaAI
Holmes-GU/VPDD