ViT encoder output consistency during inference?
xiao2mo opened this issue · 4 comments
xiao2mo commented
Hi, has the ViT implementation been fully tested for consistency?
I've found that the encoder output is completely different from the reference.
zjersey commented
We have tested the consistency of ViT; you can refer to the inference example to check that your usage is correct.
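A consistency check like the one suggested here usually boils down to comparing the accelerated encoder's output against the reference (e.g. HuggingFace's modeling_vit) element-wise under a tolerance. The following is a minimal sketch of that comparison logic; the function names and the `1e-3` tolerance are illustrative assumptions, not part of LightSeq's API (fp16 kernels typically only match an fp32 reference to roughly 1e-2 or 1e-3):

```python
import numpy as np

def max_abs_diff(ref, out):
    """Largest element-wise absolute difference between two encoder outputs."""
    ref = np.asarray(ref, dtype=np.float64)
    out = np.asarray(out, dtype=np.float64)
    return float(np.max(np.abs(ref - out)))

def is_consistent(ref, out, atol=1e-3):
    """True if the accelerated output agrees with the reference within atol.

    atol is a hypothetical tolerance; pick it based on the precision
    (fp32 vs fp16) your inference engine runs in.
    """
    return max_abs_diff(ref, out) <= atol
```

If `is_consistent` fails by a large margin (outputs "totally different" rather than off in the last few digits), the cause is usually a usage mismatch (wrong weight export, layout, or preprocessing) rather than kernel-level numerical error.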
xiao2mo commented
Could I have your WeChat? I've run into some problems with the OpenAI ViT transform.
Is the ViT you mentioned the modeling_vit in HuggingFace?
It seems the encoder implementation is the same as that of BERT in LightSeq.
In other words, why are self_attention and ffn_add_norm in vit_encoder.cc.cu and bert_encoder.cc.cu identical?
zjersey commented
Yes, it's HuggingFace's modeling_vit.
ViT and BERT have the same structure except for the embedding layer.
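The point above explains why the self_attention and ffn_add_norm kernels can be shared: both models feed a (sequence, hidden) tensor into the same Transformer encoder stack, and only the way that tensor is produced differs. A toy numpy sketch of the two embedding paths (all shapes and weights here are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
hidden = 8

# BERT-style embedding: a lookup table indexed by token ids.
vocab_table = rng.random((100, hidden))
token_ids = np.array([3, 17, 42])
bert_embed = vocab_table[token_ids]          # shape (seq_len=3, hidden)

# ViT-style embedding: flatten non-overlapping image patches,
# then apply a linear projection to the hidden size.
patch, channels = 4, 3
image = rng.random((channels, 8, 8))         # toy 8x8 "image"
# Split the 8x8 image into four 4x4 patches and flatten each one.
patches = (image.reshape(channels, 2, patch, 2, patch)
                .transpose(1, 3, 0, 2, 4)
                .reshape(4, -1))             # shape (num_patches=4, 48)
projection = rng.random((patch * patch * channels, hidden))
vit_embed = patches @ projection             # shape (num_patches=4, hidden)

# Both paths yield a (sequence, hidden) tensor, so the shared
# self-attention + FFN encoder kernels run identically on either.
```

This is why vit_encoder.cc.cu and bert_encoder.cc.cu can reuse the same self_attention and ffn_add_norm implementations: once embeddings are computed, the encoder layers see the same tensor layout.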
Anychnn commented
Hello,