Pinned Repositories
360monodepth
Code release for 360monodepth. Our framework performs monocular depth estimation for high-resolution 360° images by aligning and blending perspective depth maps.
3DitScene
3DitScene: Editing Any Scene via Language-guided Disentangled Gaussian Splatting
academic
ACVNet
[CVPR 2022] ACVNet: Attention Concatenation Volume for Accurate and Efficient Stereo Matching
AnyDoor
Official implementation of the paper "AnyDoor: Zero-shot Object-level Image Customization"
attention-mask-control
Code for the paper "Compositional Text-to-Image Synthesis with Attention Map Control of Diffusion Models"
light-weight-face-anti-spoofing
Towards solving the face spoofing problem
resume
Modified version of the 拾迹/张大侠 personal resume template
starter-hugo-academic
VPD
[ICCV 2023] VPD is a framework that applies the high-level and low-level knowledge of a pre-trained text-to-image diffusion model to downstream visual perception tasks.
Steven-Xiong's Repositories
Steven-Xiong/360monodepth
Code release for 360monodepth. Our framework performs monocular depth estimation for high-resolution 360° images by aligning and blending perspective depth maps.
Steven-Xiong/3DitScene
3DitScene: Editing Any Scene via Language-guided Disentangled Gaussian Splatting
Steven-Xiong/AnyDoor
Official implementation of the paper "AnyDoor: Zero-shot Object-level Image Customization"
Steven-Xiong/Awesome-Controllable-Generation
Papers and resources on Controllable Generation using Diffusion Models, including ControlNet, DreamBooth, T2I-Adapter, IP-Adapter.
Steven-Xiong/CustomNet
Steven-Xiong/Depth-Anything
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
Steven-Xiong/Depth-Anything-V2
Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
Steven-Xiong/dust3r
DUSt3R: Geometric 3D Vision Made Easy
Steven-Xiong/EGformer
Steven-Xiong/gaussian-splatting
Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"
Steven-Xiong/GaussianObject
Code for "GaussianObject: Just Taking Four Images to Get A High-Quality 3D Object with Gaussian Splatting"
Steven-Xiong/GLIGEN
Open-Set Grounded Text-to-Image Generation
Steven-Xiong/GroundingDINO
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
Steven-Xiong/Infusion
Official implementation of the paper "InFusion: Inpainting 3D Gaussians via Learning Depth Completion from Diffusion Prior"
Steven-Xiong/InstanceDiffusion
PyTorch implementation for "InstanceDiffusion: Instance-level Control for Image Generation"
Steven-Xiong/InstantID
InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥
Steven-Xiong/invisible-stitch
Invisible Stitch: Generating Smooth 3D Scenes with Depth Inpainting
Steven-Xiong/lambda-eclipse-inference
Official PyTorch implementation of "λ-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion Models by Leveraging CLIP Latent Space"
Steven-Xiong/lang-segment-anything
SAM with text prompt
Steven-Xiong/llama-recipes
Examples and recipes for the Llama 2 model
Steven-Xiong/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Steven-Xiong/LucidDreamer
Official code for the paper "LucidDreamer: Domain-free Generation of 3D Gaussian Splatting Scenes".
Steven-Xiong/MiniGPT-4
Open-source code for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
Steven-Xiong/MMPano
Steven-Xiong/MoMA
MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation
Steven-Xiong/segment-anything-2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Steven-Xiong/stable-diffusion
A latent text-to-image diffusion model
Steven-Xiong/Steven-Xiong.github.io
Steven-Xiong/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Steven-Xiong/zero123plus
Code repository for Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model.