lucidrains/vit-pytorch

The `Total params:` and `Params size (MB)` that `summary` prints for this model differ from timm's `vit_base_patch16_224` model. Theoretically, with the same settings they should match. What is the reason?

Opened this issue · 1 comment

import torch
from vit_pytorch import ViT  # the package is vit_pytorch, not vit
from torchsummary import summary
import timm

v = ViT(
    image_size = 224,
    patch_size = 16,
    num_classes = 1000,
    dim = 768,
    depth = 12,
    heads = 12,
    mlp_dim = 3072,
    dropout = 0.1,
    emb_dropout = 0.1
)

# Use summary to print the model's summary; input shape is (C, H, W)
summary(v, input_size=(3, 224, 224), device='cpu')

# Load timm's ViT-B/16 model
model = timm.create_model('vit_base_patch16_224', pretrained=False)

# Print its summary for comparison
summary(model, input_size=(3, 224, 224), device='cpu')
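As a sanity check on the two `summary` totals, the expected parameter count of timm's `vit_base_patch16_224` can be recomputed analytically from the standard ViT-B/16 architecture (this is my own sketch, not part of the issue; it assumes timm's layout: conv patch embedding, cls token, learned position embedding, qkv projection with bias, and a final LayerNorm before the head):

```python
# Hypothetical helper: recompute ViT-B/16's parameter count from its
# architecture so torchsummary's totals can be checked against it.
def vit_param_count(dim=768, depth=12, mlp_dim=3072, patch=16,
                    channels=3, image=224, num_classes=1000):
    n_patches = (image // patch) ** 2                  # 224/16 = 14 -> 196 patches
    patch_embed = channels * patch * patch * dim + dim # conv projection + bias
    cls_token = dim
    pos_embed = (n_patches + 1) * dim                  # +1 for the cls token
    per_block = (
        2 * dim                      # norm1 (weight + bias)
        + dim * 3 * dim + 3 * dim    # fused qkv projection (timm uses qkv bias)
        + dim * dim + dim            # attention output projection
        + 2 * dim                    # norm2
        + dim * mlp_dim + mlp_dim    # MLP fc1
        + mlp_dim * dim + dim        # MLP fc2
    )
    final_norm = 2 * dim
    head = dim * num_classes + num_classes
    return (patch_embed + cls_token + pos_embed
            + depth * per_block + final_norm + head)

print(vit_param_count())  # 86567656
```

The result, 86,567,656, matches the well-known ViT-B/16 total, so whichever screenshot shows a different number is the model that deviates from the reference layout.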

[screenshot: torchsummary output of the vit-pytorch model]
The first screenshot is the output for this repo's model; the second is for timm's.
[screenshot: torchsummary output of the timm model]
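One likely contributor to the gap (an assumption on my part, not confirmed in this thread): vit-pytorch's `Attention` creates its fused qkv projection with `bias=False`, whereas timm's ViT uses `qkv_bias=True`. Over all transformer blocks, that alone removes a fixed number of parameters:

```python
# Assumed difference: vit-pytorch omits the bias of the fused qkv Linear.
dim, depth = 768, 12
qkv_bias_params = depth * 3 * dim  # one bias per q, k, v channel, per block
print(qkv_bias_params)  # 27648
```

Extra LayerNorms in vit-pytorch's patch embedding would shift the count slightly in the other direction, so the totals are not expected to match exactly even though both models are "ViT-B/16".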