question about Kinetics400 88.7
Liu-arch opened this issue · 2 comments
Liu-arch commented
The reported top-1 accuracy for ViT-L/14 (16x336, 4x3 views) on Kinetics400 is 88.7. However, the log at the corresponding link shows 87.9, not 88.7.
whwu95 commented
Note that the log reports the result of 1x1 view inference (i.e., 1 clip x 1 crop), not the 4x3 view setting (4 clips x 3 crops) used for the 88.7 figure.
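For readers unfamiliar with view counts, here is a minimal sketch of how multi-view inference typically differs from single-view inference. It is not taken from the BIKE code: the function names, the random logits, and the choice to average softmax probabilities (rather than raw logits) are all assumptions made for illustration.

```python
import numpy as np

def softmax(logits, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def multi_view_predict(view_logits):
    """Aggregate per-view scores for ONE video (hypothetical helper).

    view_logits has shape (num_clips, num_crops, num_classes).
    Assumption: views are fused by averaging softmax probabilities.
    """
    probs = softmax(view_logits, axis=-1)
    # Flatten clips x crops into a single "views" axis, then average.
    mean_probs = probs.reshape(-1, probs.shape[-1]).mean(axis=0)
    return int(mean_probs.argmax()), mean_probs

# Placeholder logits for one video: 4 clips x 3 crops x 400 classes.
rng = np.random.default_rng(0)
logits_4x3 = rng.normal(size=(4, 3, 400))

# 1x1 view: only the first clip and first crop are scored.
single_view_pred, single_view_probs = multi_view_predict(logits_4x3[:1, :1])

# 4x3 views: all 12 views are scored and averaged.
multi_view_pred, multi_view_probs = multi_view_predict(logits_4x3)
```

Averaging over more views usually smooths out per-view errors, which is consistent with a 4x3 result being somewhat higher than a 1x1 result for the same checkpoint.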
Liu-arch commented
Dear Author,
Thank you very much for your reply.
I have a few follow-up questions about the diagram below and the code.
Three points confuse me:
1. The category embedding and the attributes embedding seem to be conflated; I cannot see where the code distinguishes them.
2. I cannot find SA in the code.
3. What does logit_scale in train.py mean? Is it related to the category embedding and the attributes embedding?
Thank you very much!