gaomingqi
PhD student in Computer Vision and Deep Learning
University of Warwick | SUSTech, Shenzhen, China
gaomingqi's Stars
facebookresearch/sam2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
MrNeRF/awesome-3D-gaussian-splatting
Curated list of papers and resources focused on 3D Gaussian Splatting, intended to keep pace with the anticipated surge of research in the coming months.
VAST-AI-Research/TripoSR
UX-Decoder/Semantic-SAM
[ECCV 2024] Official implementation of the paper "Semantic-SAM: Segment and Recognize Anything at Any Granularity"
dvlab-research/ControlNeXt
Controllable video and image Generation, SVD, Animate Anyone, ControlNet, ControlNeXt, LoRA
Stability-AI/stable-fast-3d
SF3D: Stable Fast 3D Mesh Reconstruction with UV-unwrapping and Illumination Disentanglement
DAMO-NLP-SG/VideoLLaMA2
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs
HCPLab-SYSU/Embodied_AI_Paper_List
[Embodied-AI-Survey-2024] Paper list and projects for Embodied AI
mbzuai-oryx/groundingLMM
[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.
buaacyw/MeshAnythingV2
From anything to mesh like human artists. Official impl. of "MeshAnything V2: Artist-Created Mesh Generation With Adjacent Mesh Tokenization"
FoundationVision/Groma
[ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization
pytorch-labs/attention-gym
Helpful tools and examples for working with flex-attention
Jyxarthur/flowsam
[ACCV 2024 (Oral)] Official Implementation of "Moving Object Segmentation: All You Need Is SAM (and Flow)" Junyu Xie, Charig Yang, Weidi Xie, Andrew Zisserman
Traffic-X/ViT-CoMer
Official implementation of the CVPR 2024 paper ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions.
weijielyu/Gaga
Gaga: Group Any Gaussians via 3D-aware Memory Bank
zamling/PSALM
[ECCV2024] This is an official implementation for "PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model"
cilinyan/VISA
[ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model
zhang-tao-whu/DVIS_Plus
shirowalker/UCAD
[AAAI-2024] Official code for <Unsupervised Continual Anomaly Detection with Contrastively-learned Prompt>.
heshuting555/DsHmp
[CVPR-2024] Decoupling Static and Hierarchical Motion Perception for Referring Video Segmentation
XuHu0529/SAGS
The official implementation of SAGS (Segment Anything in 3D Gaussians)
htqin/BiMatting
[NeurIPS 2023] This project is the official implementation of our accepted NeurIPS 2023 paper BiMatting: Efficient Video Matting via Binarization.
Tapall-AI/MeViS_Track_Solution_2024
[CVPR 2024 Challenge] 1st Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation
jinlab-imvr/Surgical-SAM-2
ttgeng233/UniAV
Unified Audio-Visual Perception for Multi-Task Video Localization
Kki2Eve/RISNet
Depth-Aware Concealed Crop Detection in Dense Agricultural Scenes, CVPR 2024
zjr2000/REVERIE
[ECCV2024] Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models
yoqim/waveface
Repository for "WaveFace: Authentic Face Restoration with Efficient Frequency Recovery" (CVPR24)
yunlong10/MMComposition
Repo for MMComposition Benchmark
yoqim/waveface_page
Project page for WaveFace