Pinned Repositories
A-ViT
Official PyTorch implementation of A-ViT: Adaptive Tokens for Efficient Vision Transformer (CVPR 2022)
Ask-Anything
[VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
CaFo
[CVPR 2023] Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong Few-shot Learners
Chal_SLR
This repo contains the official code of our work SAM-SLR, which won the CVPR 2021 Challenge on Large Scale Signer Independent Isolated Sign Language Recognition.
data_processing_parallel
LineFollower
robot_navigation_deep_rl_gazebo
tiny_codes
ArSL21L
FishEye8K
FishEye8K: A Benchmark and Dataset for Fisheye Camera Object Detection
ganzobtn's Repositories
ganzobtn/A-ViT
Official PyTorch implementation of A-ViT: Adaptive Tokens for Efficient Vision Transformer (CVPR 2022)
ganzobtn/Ask-Anything
[VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
ganzobtn/CaFo
[CVPR 2023] Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong Few-shot Learners
ganzobtn/Chal_SLR
This repo contains the official code of our work SAM-SLR, which won the CVPR 2021 Challenge on Large Scale Signer Independent Isolated Sign Language Recognition.
ganzobtn/data_processing_parallel
ganzobtn/LineFollower
ganzobtn/robot_navigation_deep_rl_gazebo
ganzobtn/tiny_codes
ganzobtn/CLIP
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
ganzobtn/diffusers
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch
ganzobtn/diffusion
Denoising Diffusion Probabilistic Models
ganzobtn/ganzobtn.github.io
ganzobtn/InternVideo
InternVideo: General Video Foundation Models via Generative and Discriminative Learning (https://arxiv.org/abs/2212.03191)
ganzobtn/jepa
PyTorch code and models for V-JEPA self-supervised learning from video.
ganzobtn/llama
Inference code for LLaMA models
ganzobtn/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
ganzobtn/openpose
OpenPose: Real-time multi-person keypoint detection library for body, face, hands, and foot estimation
ganzobtn/pytorch_distributed
ganzobtn/roadr2023
ganzobtn/SignLanguageRetrieval
ganzobtn/SlowFast
PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.
ganzobtn/SS_Vision_Transformer
Official repository for "Self-Supervised Video Transformer" (CVPR'22)
ganzobtn/TinyChatEngine
TinyChatEngine: On-Device LLM Inference Library
ganzobtn/Video-Swin-Transformer
This is an official implementation for "Video Swin Transformers".
ganzobtn/video_data_preprocess
ganzobtn/VideoMamba
VideoMamba: State Space Model for Efficient Video Understanding
ganzobtn/WLASL
WACV 2020 "Word-level Deep Sign Language Recognition from Video: A New Large-scale Dataset and Methods Comparison"
ganzobtn/yolov7
Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors