bytedance/lightseq

About ViT encoder output consistency during inference?

xiao2mo opened this issue · 4 comments

Hi, has the ViT implementation been fully tested for output consistency?
I've found that the encoder output is totally different from the original model's.

We have tested the consistency of ViT. You can refer to the inference example to check that your usage is correct.
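
For reference, a minimal sketch of such a consistency check, comparing the LightSeq encoder output against HuggingFace's. The export file name `lightseq_vit.hdf5`, the `lsi.Vit` constructor arguments, and the `infer` method are assumptions based on the Python inference examples; adjust them to your actual export and API:

```python
import numpy as np
import torch
from transformers import ViTModel
import lightseq.inference as lsi

# Reference model from HuggingFace
hf_model = ViTModel.from_pretrained("google/vit-base-patch16-224")
hf_model.eval()

# Hypothetical LightSeq export; positional args are (model_path, max_batch_size)
ls_model = lsi.Vit("lightseq_vit.hdf5", 8)

pixel_values = torch.randn(1, 3, 224, 224)

with torch.no_grad():
    hf_out = hf_model(pixel_values=pixel_values).last_hidden_state.numpy()

ls_out = ls_model.infer(pixel_values.numpy())

# fp16 kernels accumulate rounding error, so compare with a loose tolerance
print("max abs diff:", np.max(np.abs(hf_out - ls_out)))
print("allclose:", np.allclose(hf_out, ls_out, atol=1e-2))
```

If the outputs diverge far beyond fp16 rounding error, the usual culprits are a wrong weight export or mismatched preprocessing, not the kernels themselves.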

Could I have your WeChat? I've run into some problems converting OpenAI's ViT.
Is the ViT you mentioned the modeling_vit in HuggingFace?
It seems that the encoder implementation is the same as that of BERT in LightSeq.
In other words, why are self_attention and ffn_add_norm in vit_encoder.cc.cu and bert_encoder.cc.cu identical?

Yes, it's HuggingFace's modeling_vit.

ViT and BERT have the same structure except for the embedding layer.
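
To illustrate the point: both models turn their input into a (batch, seq_len, hidden) sequence in the embedding layer and then run it through the same kind of self-attention plus feed-forward encoder stack, which is why the encoder kernels can be shared. A minimal PyTorch sketch (not LightSeq code; details such as positional embeddings, the CLS token, and layer-norm placement are omitted):

```python
import torch
import torch.nn as nn

class PatchEmbedding(nn.Module):
    """ViT-style embedding: split the image into patches and project each one."""
    def __init__(self, hidden=768, patch=16, channels=3):
        super().__init__()
        self.proj = nn.Conv2d(channels, hidden, kernel_size=patch, stride=patch)

    def forward(self, images):                # (B, 3, H, W)
        x = self.proj(images)                 # (B, hidden, H/16, W/16)
        return x.flatten(2).transpose(1, 2)   # (B, num_patches, hidden)

class TokenEmbedding(nn.Module):
    """BERT-style embedding: look up a vector for each token id."""
    def __init__(self, hidden=768, vocab=30522):
        super().__init__()
        self.emb = nn.Embedding(vocab, hidden)

    def forward(self, ids):                   # (B, seq_len)
        return self.emb(ids)                  # (B, seq_len, hidden)

# The encoder is identical for both: the same self_attention and ffn_add_norm
# blocks operate on (B, seq_len, hidden), regardless of whether the sequence
# came from image patches or token ids.
layer = nn.TransformerEncoderLayer(d_model=768, nhead=12, batch_first=True)
encoder = nn.TransformerEncoder(layer, num_layers=12)

vit_seq = PatchEmbedding()(torch.randn(2, 3, 224, 224))
bert_seq = TokenEmbedding()(torch.randint(0, 30522, (2, 128)))
print(encoder(vit_seq).shape)   # torch.Size([2, 196, 768])
print(encoder(bert_seq).shape)  # torch.Size([2, 128, 768])
```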