sean-xr's Stars
WalBouss/LeGrad
deepglint/ALIP
[ICCV 2023] ALIP: Adaptive Language-Image Pre-training with Synthetic Caption
hammoudhasan/SynthCLIP
Code base of SynthCLIP: CLIP training with purely synthetic text-image pairs from LLMs and TTIs.
fwalch/tum-thesis-latex
A LaTeX template for TUM Bachelor/Master theses.
hila-chefer/Transformer-MM-Explainability
[ICCV 2021 Oral] Official PyTorch implementation of Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers, a novel method to visualize any Transformer-based network. Includes examples for DETR and VQA.
hila-chefer/Transformer-Explainability
[CVPR 2021] Official PyTorch implementation of Transformer Interpretability Beyond Attention Visualization, a novel method to visualize classifications by Transformer-based networks.
lucidrains/transfusion-pytorch
PyTorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from MetaAI
huggingface/pytorch-image-models
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
facebookresearch/sam2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
ShiArthur03/ShiArthur03
OpenGVLab/InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.
webdataset/webdataset
A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.
ylaxor/clip-like
Train (fine-tune) OpenAI's CLIP-like models on custom image-caption datasets, e.g. the COCO dataset. PyTorch implementation.
facebookresearch/ImageBind
ImageBind: One Embedding Space to Bind Them All
facebookresearch/SLIP
Code release for SLIP: Self-supervision meets Language-Image Pre-training
yuweihao/MambaOut
MambaOut: Do We Really Need Mamba for Vision?
mlfoundations/open_clip
An open source implementation of CLIP.
dongliangcao/Unsupervised-Learning-of-Robust-Spectral-Shape-Matching
SIGGRAPH23: Unsupervised Learning of Robust Spectral Shape Matching
Pointcept/GPT4Point
[CVPR'24 Highlight] GPT4Point: A Unified Framework for Point-Language Understanding and Generation.
NVIDIAGameWorks/kaolin
A PyTorch Library for Accelerating 3D Deep Learning Research
niladridutt/Diffusion-3D-Features
Diffusion 3D Features (Diff3F): Decorating Untextured Shapes with Distilled Semantic Features [CVPR 2024]
developer0hye/PyTorch-Deformable-Convolution-v2
Don't feel pain to use Deformable Convolution
OutofAi/2D-Gaussian-Splatting
A 2D Gaussian Splatting paper for no obvious reasons. Enjoy!
tsunghan-wu/SLD
🔥 [CVPR2024] Official implementation of "Self-correcting LLM-controlled Diffusion Models (SLD)"
wimmerth/back-to-3d-few-shot-keypoints
Repository of the CVPR paper "Back to 3D: Few-Shot 3D Keypoint Detection with Back-Projected Features".
gviga/AB-ZoomOut
szacho/pointcam
Self-supervised adversarial masking for point clouds
muse1998/Source-Free-Domain-Generalization
An open-world scenario domain generalization code base
nstucki/Betti-matching
Colin97/OpenShape_code
Official code of “OpenShape: Scaling Up 3D Shape Representation Towards Open-World Understanding”