vit

There are 294 repositories under the vit topic.

  • lukas-blecher/LaTeX-OCR

    pix2tex: Using a ViT to convert images of equations into LaTeX code.

    Language: Python · 11.1k stars
  • cmhungsteve/Awesome-Transformer-Attention

    A comprehensive paper list on Vision Transformers and attention, including papers, code, and related websites

  • towhee-io/towhee

    Towhee is a framework dedicated to making neural data processing pipelines simple and fast.

    Language: Python · 3k stars
  • hila-chefer/Transformer-Explainability

    [CVPR 2021] Official PyTorch implementation of "Transformer Interpretability Beyond Attention Visualization", a novel method to visualize classifications made by Transformer-based networks.

    Language: Jupyter Notebook · 1.7k stars
  • BR-IDL/PaddleViT

    🤖 PaddleViT: State-of-the-art Visual Transformer and MLP Models for PaddlePaddle 2.0+

    Language: Python · 1.2k stars
  • yitu-opensource/T2T-ViT

    [ICCV 2021] Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet

    Language: Jupyter Notebook · 1.1k stars
  • roboflow/inference

    A fast, easy-to-use, production-ready inference server for computer vision supporting deployment of many popular model architectures and fine-tuned models.

    Language: Python · 1.1k stars
  • Yangzhangcst/Transformer-in-Computer-Vision

    A paper list of some recent Transformer-based CV works.

  • sail-sg/Adan

    Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models

    Language: Python · 728 stars
  • open-compass/VLMEvalKit

    Open-source evaluation toolkit for large vision-language models (LVLMs), supporting GPT-4V, Gemini, QwenVLPlus, 50+ HF models, and 20+ benchmarks

    Language: Python · 506 stars
  • chinhsuanwu/mobilevit-pytorch

    A PyTorch implementation of "MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer"

    Language: Python · 475 stars
  • v-iashin/video_features

    Extract video features from raw videos using multiple GPUs. We support RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, and TIMM models.

    Language: Python · 455 stars
  • zgcr/SimpleAICV_pytorch_training_examples

    SimpleAICV: PyTorch training and testing examples.

    Language: Jupyter Notebook · 404 stars
  • vatz88/FFCSonTheGo

    FFCS course registration made hassle-free for VITians. Search courses and visualize the timetable on the go!

    Language: JavaScript · 284 stars
  • gupta-abhay/pytorch-vit

    An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

    Language: Python · 280 stars
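The entry above links to the original ViT paper, whose central idea is to treat an image as a sequence of flattened 16x16 patches that are linearly projected into tokens. A minimal NumPy sketch of that tokenization (illustrative only, not code from gupta-abhay/pytorch-vit; the 768-dimensional projection mirrors ViT-Base, but the matrix here is random rather than learned):

```python
import numpy as np

def patchify(image, patch_size=16):
    """Split an (H, W, C) image into flattened patch_size x patch_size patches."""
    h, w, c = image.shape
    assert h % patch_size == 0 and w % patch_size == 0
    nh, nw = h // patch_size, w // patch_size
    patches = image.reshape(nh, patch_size, nw, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)                # (nh, nw, p, p, c)
    return patches.reshape(nh * nw, patch_size * patch_size * c)

rng = np.random.default_rng(0)
image = rng.random((224, 224, 3))               # one ImageNet-sized input
tokens = patchify(image)                        # (196, 768): 14 * 14 patch tokens
proj = rng.standard_normal((16 * 16 * 3, 768))  # stand-in for the learned projection
embeddings = tokens @ proj                      # (196, 768) tokens fed to the Transformer
print(embeddings.shape)                         # (196, 768)
```

In a full ViT, a learnable class token and position embeddings would be added before the Transformer encoder; this sketch covers only the "image as 16x16 words" step.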
  • PaddlePaddle/PASSL

    PASSL includes self-supervised image algorithms such as SimCLR, MoCo v1/v2, BYOL, CLIP, PixPro, SimSiam, SwAV, BEiT, and MAE, as well as fundamental vision models such as Vision Transformer, DeiT, Swin Transformer, CvT, T2T-ViT, MLP-Mixer, XCiT, ConvNeXt, and PVTv2

    Language: Python · 263 stars
  • megvii-research/RevCol

    Official code of the papers "Reversible Column Networks" and "RevColv2"

    Language: Python · 245 stars
  • eeyhsong/EEG-Transformer

    A practical application of a Transformer (ViT) to 2-D physiological signal (EEG) classification tasks; can also be tried with EMG, EOG, ECG, etc. Includes attention over the spatial dimension (channel attention) and the temporal dimension, plus common spatial pattern (CSP), an efficient feature-enhancement method, implemented in Python.

    Language: Python · 225 stars
  • qanastek/HugsVision

    HugsVision is an easy-to-use HuggingFace wrapper for state-of-the-art computer vision

    Language: Jupyter Notebook · 188 stars
  • yaoxiaoyuan/mimix

    Mimix: A Text Generation Tool and Pretrained Chinese Models

    Language: Python · 147 stars
  • implus/mae_segmentation

    Reproduction of semantic segmentation using a masked autoencoder (MAE)

    Language: Python · 146 stars
  • PaddlePaddle/PLSC

    Paddle large-scale classification tools; supports ArcFace, CosFace, PartialFC, and data parallel + model parallel training. Models include ResNet, ViT, Swin, DeiT, CaiT, FaceViT, MoCo, MAE, ConvMAE, and CAE.

    Language: Python · 144 stars
  • xmindflow/Awesome-Transformer-in-Medical-Imaging

    [MedIA Journal] A comprehensive paper list on Vision Transformers and attention in medical imaging, including papers, code, and related websites

  • hunto/LightViT

    Official implementation for paper "LightViT: Towards Light-Weight Convolution-Free Vision Transformers"

    Language: Python · 134 stars
  • zwcolin/EEG-Transformer

    A ViT-based Transformer applied to multi-channel time-series EEG data for motor imagery classification

    Language: Python · 134 stars
  • kyegomez/NaViT

    My implementation of "Patch n’ Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution"

    Language: Python · 133 stars
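The "Patch n' Pack" idea behind NaViT is to train on arbitrary aspect ratios and resolutions by packing patch tokens from several images into one fixed-length sequence, with per-token image IDs so attention can be masked to stay within each image. A hypothetical NumPy sketch of the packing step (not code from kyegomez/NaViT; `pack_sequences` and its greedy strategy are illustrative assumptions):

```python
import numpy as np

def pack_sequences(token_lists, seq_len, dim):
    """Greedily pack variable-length token sequences into fixed-length rows.

    Returns packed tokens plus per-position image IDs (-1 marks padding);
    a real implementation would use the IDs to build an attention mask so
    tokens only attend within their own image.
    """
    rows, ids = [], []
    cur_tok, cur_id, pos = np.zeros((seq_len, dim)), np.full(seq_len, -1), 0
    for i, toks in enumerate(token_lists):
        n = len(toks)
        assert n <= seq_len, "a single image's tokens must fit in one row"
        if pos + n > seq_len:                    # row full: start a new one
            rows.append(cur_tok); ids.append(cur_id)
            cur_tok, cur_id, pos = np.zeros((seq_len, dim)), np.full(seq_len, -1), 0
        cur_tok[pos:pos + n] = toks
        cur_id[pos:pos + n] = i                  # which image each token came from
        pos += n
    rows.append(cur_tok); ids.append(cur_id)
    return np.stack(rows), np.stack(ids)

# Three "images" with different numbers of patch tokens (e.g. different resolutions).
imgs = [np.ones((4, 8)), np.ones((6, 8)), np.ones((3, 8))]
packed, image_ids = pack_sequences(imgs, seq_len=10, dim=8)
print(packed.shape)   # (2, 10, 8): images 0 and 1 share a row; image 2 starts a new one
```

Greedy first-fit packing wastes some padding at row ends; the paper combines packing with token dropping and factorized position embeddings, which this sketch omits.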
  • vitjs/vit

    🚀 React application framework inspired by UmiJS

    Language: TypeScript · 100 stars
  • kyegomez/Vit-RGTS

    Open source implementation of "Vision Transformers Need Registers"

    Language: Python · 99 stars
  • kamalkraj/Vision-Transformer

    Vision Transformer using TensorFlow 2.0

    Language: Python · 94 stars
  • jaehyunnn/ViTPose_pytorch

    An unofficial implementation of ViTPose [Y. Xu et al., 2022]

    Language: Jupyter Notebook · 92 stars
  • rasbt/pytorch-memory-optim

    Code for the blog post "Optimizing Memory Usage for Training LLMs and Vision Transformers in PyTorch".

    Language: Python · 82 stars
  • ssitvit/Code-Canvas

    A hub for innovation through web development projects

    Language: JavaScript · 80 stars
  • s-chh/PyTorch-Vision-Transformer-ViT-MNIST-CIFAR10

    Simplified PyTorch implementation of Vision Transformer (ViT) for small datasets such as MNIST, FashionMNIST, SVHN, and CIFAR10.

    Language: Python · 77 stars
  • daniel-code/TubeViT

    An unofficial implementation of TubeViT in "Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning"

    Language: Python · 76 stars
  • hunto/image_classification_sota

    Training ImageNet / CIFAR models with SOTA strategies and techniques such as ViT, knowledge distillation (KD), reparameterization (Rep), etc.

    Language: Python · 68 stars
  • uta-smile/TVT

    [WACV 2023] Code for "TVT: Transferable Vision Transformer for Unsupervised Domain Adaptation"

    Language: Python · 65 stars