difference between vision_lstm and vision_lstm2?
dongzhuoyao opened this issue · 1 comment
dongzhuoyao commented
What is the difference between vision_lstm and vision_lstm2?
BenediktAlkin commented
I updated the README to make this information easier to find.
Changes in vision_lstm2 compared to vision_lstm:
- a Conv2d with kernel size 3 instead of a causal Conv1d before q and k
- biases in the LayerNorms and projection layers
- the first and last tokens are concatenated instead of averaged
- the pre-trained models are pre-trained at 192x192 resolution, followed by a short fine-tuning at 224x224
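
To illustrate the first and third points, here is a minimal PyTorch sketch. It is not taken from the repository: the function names, the tensor sizes, the depthwise (`groups=dim`) setting, and the Conv1d kernel size are assumptions made for illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical sizes: batch of 2, 4x4 patch grid, 8 channels.
batch, dim, height, width = 2, 8, 4, 4
tokens = torch.randn(batch, height * width, dim)  # (B, L, D), row-major patch order

# Assumption: both convolutions are depthwise; the thread does not say so.
conv1d = nn.Conv1d(dim, dim, kernel_size=3, groups=dim)
conv2d = nn.Conv2d(dim, dim, kernel_size=3, padding=1, groups=dim)


def causal_conv1d(x: torch.Tensor) -> torch.Tensor:
    """vision_lstm-style sketch: causal Conv1d over the flattened sequence."""
    x = x.transpose(1, 2)                        # (B, D, L)
    x = F.pad(x, (conv1d.kernel_size[0] - 1, 0))  # left-pad so position t sees only <= t
    return conv1d(x).transpose(1, 2)             # (B, L, D)


def spatial_conv2d(x: torch.Tensor) -> torch.Tensor:
    """vision_lstm2-style sketch: 3x3 Conv2d over the 2D patch grid."""
    x = x.transpose(1, 2).reshape(batch, dim, height, width)  # (B, D, H, W)
    return conv2d(x).flatten(2).transpose(1, 2)               # (B, L, D)


def pool_average(x: torch.Tensor) -> torch.Tensor:
    """vision_lstm-style sketch: average the first and last token."""
    return (x[:, 0] + x[:, -1]) / 2              # (B, D)


def pool_concat(x: torch.Tensor) -> torch.Tensor:
    """vision_lstm2-style sketch: concatenate the first and last token."""
    return torch.cat([x[:, 0], x[:, -1]], dim=1)  # (B, 2 * D)


out_1d = causal_conv1d(tokens)
out_2d = spatial_conv2d(tokens)
avg = pool_average(tokens)
cat = pool_concat(tokens)
print(out_1d.shape, out_2d.shape, avg.shape, cat.shape)
```

Note that concatenation doubles the feature dimension fed to the classification head, whereas averaging keeps it at `dim`.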