Issues
Why are my segmentation results different from what the paper reports, and even worse than ViT?
#29 opened by YanGe1105 - 1
computing rate reduction in CRATE
#28 opened by heeseokjung - 1
The paper says the final task-specific architecture is a classification head, so why is the demo shown on a segmentation task?
#25 opened by sanwei111 - 1
where is the inference code?
#27 opened by sanwei111 - 0
more pretrained weights
#8 opened by idonashino - 1
Experiment on Diffusion Models
#15 opened by yuzheyao22 - 1
Confusion about the Code Implementation
#18 opened by HenryLau7 - 1
Is there any example for language?
#12 opened by subercui - 1
How CRATE differs from Transformer
#11 opened by moon2yue - 0
Linear projection instead of convolution
#17 opened by LukasMahieu - 2
Can this be applied to languages?
#20 opened by ElrondL - 0
Question about part of the code in attention
#19 opened by 01vanilla - 0
Request for the code for Figures 13 and 14
#16 opened by 01vanilla - 0
pretrained CRATE weight?
#1 opened by EveningLin - 3
KeyError:'model'
#5 opened by yiichu03 - 1
Taking one further step of whitebox approach
#9 opened by ngkel - 1
The white-box explanation of CLS token
#10 opened by ngkel - 0
requirement
#2 opened by suyou5