Adrien987k's Stars
facebookresearch/xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.
stack-of-tasks/pinocchio
A fast and flexible implementation of Rigid Body Dynamics algorithms and their analytical derivatives
PKU-YuanGroup/Video-LLaVA
【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
DAMO-NLP-SG/Video-LLaMA
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
OpenGVLab/VideoMamba
[ECCV2024] VideoMamba: State Space Model for Efficient Video Understanding
facebookresearch/ijepa
Official codebase for I-JEPA, the Image-based Joint-Embedding Predictive Architecture. First outlined in the CVPR paper, "Self-supervised learning from images with a joint-embedding predictive architecture."
facebookresearch/jepa
PyTorch code and models for V-JEPA self-supervised learning from video.
state-spaces/s4
Structured state space sequence models
antoyang/VidChapters
[NeurIPS 2023 D&B] VidChapters-7M: Video Chapters at Scale
google-deepmind/perception_test
webdataset/webdataset
A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.
tatp22/multidim-positional-encoding
An implementation of 1D, 2D, and 3D positional encoding in PyTorch and TensorFlow
SwinTransformer/Video-Swin-Transformer
This is an official implementation for "Video Swin Transformer".
facebookresearch/active_indexing
Official implementation of "Active Image Indexing"
xvjiarui/VFS
Rethinking Self-Supervised Correspondence Learning: A Video Frame-level Similarity Perspective, in ICCV 2021 (Oral)
microsoft/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
lightly-ai/lightly
A Python library for self-supervised learning on images.
Stability-AI/stablediffusion
High-Resolution Image Synthesis with Latent Diffusion Models
valeoai/sfrik
Official code for "Self-supervised learning with rotation-invariant kernels"
XuelianCheng/LEAStereo
Hierarchical Neural Architecture Search for Deep Stereo Matching (NeurIPS 2020)
antoyang/FrozenBiLM
[NeurIPS 2022] Zero-Shot Video Question Answering via Frozen Bidirectional Language Models
facebookresearch/VICRegL
VICRegL official code base
facebookresearch/msn
Masked Siamese Networks for Label-Efficient Learning (https://arxiv.org/abs/2204.07141)
zhyever/Monocular-Depth-Estimation-Toolbox
Monocular Depth Estimation Toolbox based on MMSegmentation.
CompVis/stable-diffusion
A latent text-to-image diffusion model
lliuz/ARFlow
The official PyTorch implementation of the paper "Learning by Analogy: Reliable Supervision from Transformations for Unsupervised Optical Flow Estimation".
princeton-vl/RAFT
MCG-NJU/VideoMAE
[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
bytedance/ibot
iBOT: Image BERT Pre-Training with Online Tokenizer (ICLR 2022)
facebookresearch/metaseq
Repo for external large-scale work