This is a PyTorch implementation of MobileViT, as specified in "MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer" (arXiv, 2021).
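import torch
from mobilevit import mobilevit_xxs

# Build the XXS variant of MobileViT
net = mobilevit_xxs()
# Dummy batch: 1 image, 3 channels, 256x256 pixels
img = torch.randn(1, 3, 256, 256)
# Forward pass
out = net(img)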
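For inference, it is usually worth switching the network to evaluation mode and disabling gradient tracking. The snippet below is a minimal sketch of that pattern, not part of this repository: the `predict` helper is a hypothetical name, and a tiny stand-in model is used so the example runs without the `mobilevit` module. In practice you would presumably pass `mobilevit_xxs()` instead.

```python
import torch
from torch import nn


def predict(model: nn.Module, batch: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: forward pass in eval mode without gradients."""
    model.eval()  # use inference behavior for dropout / batch norm layers
    with torch.no_grad():  # skip building the autograd graph
        return model(batch)


# Stand-in model (an assumption for illustration only);
# replace with `mobilevit_xxs()` when the `mobilevit` module is available.
stand_in = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, stride=2, padding=1),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 10),
)

out = predict(stand_in, torch.randn(1, 3, 256, 256))
print(out.shape)
```

Note that `model.eval()` leaves the model in evaluation mode; call `model.train()` again before resuming training.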
@article{mehta2021mobilevit,
  title={MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer},
  author={Mehta, Sachin and Rastegari, Mohammad},
  journal={arXiv preprint arXiv:2110.02178},
  year={2021}
}
Code adapted from MobileNetV2 and ViT.