Pinned Repositories
cloudgripper-push-1k
Code and dataset for "How Physics and Background Attributes Impact Video Transformers in Robotic Manipulation: A Case Study on Pushing" (IROS 2024)
Awesome-Diffusion-Models
A collection of resources and papers on Diffusion Models
Awesome-Video-Diffusion
A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.
Deformable-DETR
Deformable DETR: Deformable Transformers for End-to-End Object Detection.
diffusion_policy
[RSS 2023] Diffusion Policy: Visuomotor Policy Learning via Action Diffusion
dinov2
PyTorch code and models for the DINOv2 self-supervised learning method.
FateZero
[ICCV 2023 Oral] "FateZero: Fusing Attentions for Zero-shot Text-based Video Editing"
mujoco_menagerie
A collection of high-quality models for the MuJoCo physics engine, curated by DeepMind.
peract
Perceiver-Actor: A Multi-Task Transformer for Robotic Manipulation
segment-anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
ShutongJIN's Repositories
ShutongJIN/Awesome-Diffusion-Models
A collection of resources and papers on Diffusion Models
ShutongJIN/Awesome-Video-Diffusion
A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.
ShutongJIN/diffusion_policy
[RSS 2023] Diffusion Policy: Visuomotor Policy Learning via Action Diffusion
ShutongJIN/FateZero
[ICCV 2023 Oral] "FateZero: Fusing Attentions for Zero-shot Text-based Video Editing"
ShutongJIN/shutongjin.github.io
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
ShutongJIN/AVDC
Official repository of Learning to Act from Actionless Video through Dense Correspondences.
ShutongJIN/AVDC_experiments
The official codebase for running the experiments described in the AVDC paper.
ShutongJIN/Awesome-LLM-Robotics
A comprehensive list of papers using large language/multi-modal models for robotics/RL, with code and related websites
ShutongJIN/BLIP
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
ShutongJIN/CLIP
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
ShutongJIN/CLIP_prefix_caption
Simple image captioning model
ShutongJIN/cliport
CLIPort: What and Where Pathways for Robotic Manipulation
ShutongJIN/ControlNet
Let us control diffusion models!
ShutongJIN/ControlVideo
[ICLR 2024] Official pytorch implementation of "ControlVideo: Training-free Controllable Text-to-Video Generation"
ShutongJIN/data4robotics
ShutongJIN/dino
PyTorch code for Vision Transformers training with the Self-Supervised learning method DINO
ShutongJIN/DPL
Dynamic Prompt Learning: Addressing Cross-Attention Leakage for Text-Based Image Editing (NeurIPS 2023)
ShutongJIN/drrobot
Code for "Differentiable Robot Rendering" (CoRL 2024)
ShutongJIN/FedCADO
The implementation of FedCADO, a classifier-assisted diffusion method for one-shot federated learning
ShutongJIN/IC-Light
More relighting!
ShutongJIN/mae
PyTorch implementation of MAE: https://arxiv.org/abs/2111.06377
ShutongJIN/moco
PyTorch implementation of MoCo: https://arxiv.org/abs/1911.05722
ShutongJIN/moco-v3
PyTorch implementation of MoCo v3: https://arxiv.org/abs/2104.02057
ShutongJIN/pytorch-grad-cam
Advanced AI explainability for computer vision. Supports CNNs, Vision Transformers, classification, object detection, segmentation, image similarity, and more.
ShutongJIN/StereoDiffusion
Implementation of StereoDiffusion
ShutongJIN/StructDiffusion
StructDiffusion: Language-Guided Creation of Physically-Valid Structures using Unseen Objects
ShutongJIN/susie
Code for subgoal synthesis via image editing
ShutongJIN/UnseenObjectClustering
Learning RGB-D Feature Embeddings for Unseen Object Instance Segmentation
ShutongJIN/VOT
ShutongJIN/ZeroNVS