vit

There are 332 repositories under the vit topic.

  • lukas-blecher/LaTeX-OCR

    pix2tex: Using a ViT to convert images of equations into LaTeX code.

    Language: Python
  • cmhungsteve/Awesome-Transformer-Attention

    An ultimately comprehensive paper list of Vision Transformer/Attention, including papers, codes, and related websites

  • towhee-io/towhee

    Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.

    Language: Python
  • hila-chefer/Transformer-Explainability

    [CVPR 2021] Official PyTorch implementation for Transformer Interpretability Beyond Attention Visualization, a novel method to visualize classifications by Transformer based networks.

    Language: Jupyter Notebook
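The paper above propagates relevance through a Transformer's attention layers. As context, here is a minimal NumPy sketch of the simpler attention-rollout baseline (Abnar & Zuidema, 2020) that such interpretability methods improve on; the function name and toy shapes are illustrative, not code from the repository:

```python
import numpy as np

def attention_rollout(attn_layers):
    """Roll head-averaged attention maps across layers.

    attn_layers: list of (tokens, tokens) row-stochastic attention matrices.
    Mixes in an identity matrix to account for the residual connection,
    renormalizes the rows, and multiplies the layers together.
    """
    tokens = attn_layers[0].shape[0]
    rollout = np.eye(tokens)
    for attn in attn_layers:
        attn = 0.5 * attn + 0.5 * np.eye(tokens)        # residual path
        attn = attn / attn.sum(axis=-1, keepdims=True)  # keep rows summing to 1
        rollout = attn @ rollout
    return rollout

# toy example: 3 layers of random attention over 5 tokens (CLS + 4 patches)
rng = np.random.default_rng(0)
layers = [rng.random((5, 5)) for _ in range(3)]
layers = [a / a.sum(axis=-1, keepdims=True) for a in layers]
relevance = attention_rollout(layers)[0, 1:]  # CLS token's relevance per patch
```

Reading off the CLS row of the rolled-out matrix gives one relevance score per image patch, which can be reshaped into a heatmap; the CVPR 2021 method replaces this plain product with gradient-weighted relevance propagation.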
  • open-compass/VLMEvalKit

    Open-source evaluation toolkit for large vision-language models (LVLMs), supporting 160+ VLMs and 50+ benchmarks

    Language: Python
  • roboflow/inference

    A fast, easy-to-use, production-ready inference server for computer vision supporting deployment of many popular model architectures and fine-tuned models.

    Language: Python
  • BR-IDL/PaddleViT

    🤖 PaddleViT: State-of-the-art Visual Transformer and MLP Models for PaddlePaddle 2.0+

    Language: Python
  • yitu-opensource/T2T-ViT

    [ICCV 2021] Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet

    Language: Jupyter Notebook
  • Yangzhangcst/Transformer-in-Computer-Vision

    A paper list of some recent Transformer-based CV works.

  • sail-sg/Adan

    Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models

    Language: Python
  • v-iashin/video_features

    Extract video features from raw videos using multiple GPUs. We support RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, and TIMM models.

    Language: Python
  • chinhsuanwu/mobilevit-pytorch

    A PyTorch implementation of "MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer"

    Language: Python
  • zgcr/SimpleAICV_pytorch_training_examples

    SimpleAICV: PyTorch training and testing examples.

    Language: Python
  • vatz88/FFCSonTheGo

    FFCS course registration made hassle free for VITians. Search courses and visualize the timetable on the go!

    Language: JavaScript
  • gupta-abhay/pytorch-vit

    An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

    Language: Python
  • PaddlePaddle/PASSL

    PASSL includes image self-supervised learning algorithms such as SimCLR, MoCo v1/v2, BYOL, CLIP, PixPro, SimSiam, SwAV, BEiT, and MAE, as well as foundational vision models such as Vision Transformer, DeiT, Swin Transformer, CvT, T2T-ViT, MLP-Mixer, XCiT, ConvNeXt, and PVTv2.

    Language: Python
  • eeyhsong/EEG-Transformer

    i. A practical application of a Transformer (ViT) to 2-D physiological signal (EEG) classification tasks; it can also be tried on EMG, EOG, ECG, etc. ii. Includes attention over the spatial dimension (channel attention) and the temporal dimension. iii. Common spatial pattern (CSP), an efficient feature-enhancement method, implemented in Python.

    Language: Python
  • megvii-research/RevCol

    Official Code of Paper "Reversible Column Networks" "RevColv2"

    Language: Python
  • qanastek/HugsVision

    HugsVision is an easy-to-use Hugging Face wrapper for state-of-the-art computer vision

    Language: Jupyter Notebook
  • kyegomez/NaViT

    My implementation of "Patch n’ Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution"

    Language: Python
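NaViT's "Patch n' Pack" idea packs patch tokens from images of different resolutions into shared fixed-length sequences instead of padding each image separately. Below is a hypothetical NumPy sketch of that greedy packing with a mask marking real versus padding tokens; the function name and shapes are invented for illustration, and a real implementation also needs per-image position ids so attention does not cross image boundaries:

```python
import numpy as np

def pack_sequences(seqs, max_len, dim):
    """Greedily pack variable-length token sequences into fixed-length rows.

    seqs: list of (n_i, dim) arrays of patch tokens from different images.
    Returns (rows, max_len, dim) packed tokens and a (rows, max_len) 0/1 mask
    distinguishing real tokens from padding.
    """
    rows, masks = [], []
    cur, mask, used = np.zeros((max_len, dim)), np.zeros(max_len, dtype=int), 0
    for seq in seqs:
        n = len(seq)
        if used + n > max_len:  # start a fresh row when this one cannot fit seq
            rows.append(cur)
            masks.append(mask)
            cur, mask, used = np.zeros((max_len, dim)), np.zeros(max_len, dtype=int), 0
        cur[used:used + n] = seq
        mask[used:used + n] = 1
        used += n
    rows.append(cur)
    masks.append(mask)
    return np.stack(rows), np.stack(masks)

# three "images" yielding 6, 3, and 5 patch tokens of width 4
tokens, mask = pack_sequences(
    [np.ones((6, 4)), np.ones((3, 4)), np.ones((5, 4))], max_len=10, dim=4)
```

Here the first two images share one row (6 + 3 = 9 tokens plus 1 pad) and the third starts a new row, so no per-image padding to a common resolution is needed.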
  • xmindflow/Awesome-Transformer-in-Medical-Imaging

    [MedIA Journal] An ultimately comprehensive paper list of Vision Transformer/Attention, including papers, codes, and related websites

  • SkyworkAI/MoH

    MoH: Multi-Head Attention as Mixture-of-Head Attention

    Language: Python
  • zwcolin/EEG-Transformer

    A ViT based transformer applied on multi-channel time-series EEG data for motor imagery classification

    Language: Python
  • implus/mae_segmentation

    Reproduction of semantic segmentation using a masked autoencoder (MAE).

    Language: Python
  • yaoxiaoyuan/mimix

    Mimix: A Text Generation Tool and Pretrained Chinese Models

    Language: Python
  • PaddlePaddle/PLSC

    Paddle large-scale classification tools; supports ArcFace, CosFace, PartialFC, and data parallel + model parallel training. Models include ResNet, ViT, Swin, DeiT, CaiT, FaceViT, MoCo, MAE, ConvMAE, and CAE.

    Language: Python
  • kyegomez/Vit-RGTS

    Open source implementation of "Vision Transformers Need Registers"

    Language: Python
  • hunto/LightViT

    Official implementation for paper "LightViT: Towards Light-Weight Convolution-Free Vision Transformers"

    Language: Python
  • s-chh/PyTorch-Scratch-Vision-Transformer-ViT

    Simple and easy to understand PyTorch implementation of Vision Transformer (ViT) from scratch with detailed steps. Tested on small datasets: MNIST, FashionMNIST, SVHN, CIFAR10, and CIFAR100.

    Language: Python
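The heart of such a from-scratch ViT is the patch embedding: cut the image into 16×16 patches, flatten each one, and project it linearly. A framework-free NumPy sketch of that step (the repository itself uses PyTorch; names here are illustrative, and a random projection stands in for the learned linear layer):

```python
import numpy as np

def patch_embed(img, patch=16, dim=64, seed=0):
    """Turn an (H, W, C) image into a (num_patches, dim) token sequence."""
    h, w, c = img.shape
    assert h % patch == 0 and w % patch == 0, "image must tile evenly into patches"
    # split into a (H/p, p, W/p, p, C) grid, then flatten each patch to one vector
    patches = img.reshape(h // patch, patch, w // patch, patch, c)
    patches = patches.transpose(0, 2, 1, 3, 4).reshape(-1, patch * patch * c)
    proj = np.random.default_rng(seed).standard_normal((patch * patch * c, dim))
    return patches @ proj / np.sqrt(patch * patch * c)

tokens = patch_embed(np.zeros((32, 32, 3)))  # 2x2 grid of 16x16 patches -> (4, 64)
```

In a full model these tokens get a prepended CLS token and learned position embeddings before entering the Transformer encoder.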
  • jaehyunnn/ViTPose_pytorch

    An unofficial implementation of ViTPose [Y. Xu et al., 2022]

    Language: Jupyter Notebook
  • vitjs/vit

    🚀 React application framework inspired by UmiJS

    Language: TypeScript
  • kamalkraj/Vision-Transformer

    Vision Transformer using TensorFlow 2.0

    Language: Python
  • DefTruth/Awesome-SD-Inference

    📖 A small curated list of awesome SD/DiT/ViT/Diffusion inference with distributed/caching/sampling techniques: DistriFusion, PipeFusion, AsyncDiff, DeepCache, Block Caching, etc.

  • rasbt/pytorch-memory-optim

    This repository contains the code for my "Optimizing Memory Usage for Training LLMs and Vision Transformers in PyTorch" blog post.

    Language: Python
  • daniel-code/TubeViT

    An unofficial implementation of TubeViT in "Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning"

    Language: Python
  • zubair-irshad/NeRF-MAE

    [ECCV 2024] PyTorch code for NeRF-MAE: Masked AutoEncoders for Self-Supervised 3D Representation Learning for Neural Radiance Fields

    Language: Python