xhl-video's Stars
LiheYoung/Depth-Anything
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
LargeWorldModel/LWM
Large World Model -- Modeling Text and Video with Million-Length Context
FoundationVision/VAR
[NeurIPS 2024 Best Paper] [GPT beats diffusion] [scaling laws in visual generation] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!
google-research/big_vision
Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.
ytongbai/LVM
MCG-NJU/VideoMAE
[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
apple/ml-aim
This repository provides the code and model checkpoints for the AIMv1 and AIMv2 research projects.
Sense-X/UniFormer
[ICLR 2022] Official implementation of UniFormer
bytedance/ibot
iBOT: Image BERT Pre-Training with Online Tokenizer (ICLR 2022)
facebookresearch/flip
Official Open Source code for "Scaling Language-Image Pre-training via Masking"
UCSC-VLAA/CLIPA
[NeurIPS 2023] This repository includes the official implementation of our paper "An Inverse Scaling Law for CLIP Training"
Beckschen/3D-TransUNet
This is the official repository for the paper "3D TransUNet: Advancing Medical Image Segmentation through Vision Transformers"
ytongbai/ViTs-vs-CNNs
[NeurIPS 2021] Are Transformers More Robust Than CNNs? (PyTorch implementation & checkpoints)
UCSC-VLAA/RobustCNN
[ICLR 2023] This repository includes the official implementation of our paper "Can CNNs Be More Robust Than Transformers?"
UCSC-VLAA/Recap-DataComp-1B
This is the official repository of our paper "What If We Recaption Billions of Web Images with LLaMA-3?"
ggjy/DeLVM
UCSC-VLAA/DMAE
[CVPR 2023] This repository includes the official implementation of our paper "Masked Autoencoders Enable Efficient Knowledge Distillers"
OliverRensu/D-iGPT
[ICML 2024] This repository includes the official implementation of our paper "Rejuvenating image-GPT as Strong Visual Representation Learners"
patil-suraj/vit-vqgan
JAX implementation of ViT-VQGAN
OliverRensu/ARM
This repository is the official implementation of our paper "Autoregressive Pretraining with Mamba in Vision".
nazmul-karim170/UNICON
[CVPR 2022] Official implementation of "UNICON: Combating Label Noise Through Uniform Selection and Contrastive Learning"
OliverRensu/MVG
UCSC-VLAA/CRATE-alpha
This repository includes the official implementation our paper "Scaling White-Box Transformers for Vision"
UCSC-VLAA/EVP
[TMLR'24] This repository includes the official implementation our paper "Unleashing the Power of Visual Prompting At the Pixel Level"
meijieru/fast_advprop
[ICLR 2022]: Fast AdvProp
yuyinzhou/L2B
This repository includes the official implementation of L2B, from our paper "Learning to Bootstrap for Combating Label Noise".
UCSC-VLAA/CLIPS
An Enhanced CLIP Framework for Learning with Synthetic Captions
UCSC-VLAA/FedConv
[TMLR'24] This repository includes the official implementation our paper "FedConv: Enhancing Convolutional Neural Networks for Handling Data Heterogeneity in Federated Learning"
UCSC-VLAA/AdvXL
[CVPR 2024] This repository includes the official implementation of our paper "Revisiting Adversarial Training at Scale"
UCSC-VLAA/Image-Pretraining-for-Video
[ECCV 2022] This repository includes the official implementation of our paper "In Defense of Image Pre-Training for Spatiotemporal Recognition".