- Add rotary positional embeddings (RoPE)
- Fix experiment code; update models to work without a separate config
- Test on TPUv3-8
- Run initial training runs comparing DeiT with absolute learned vs. rotary positional embeddings
- Add class-attention layers and LayerScale (CaiT)
- Add CvT
- Add TNT, Twins
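The rotary positional embedding item above can be illustrated with a minimal NumPy sketch. This is only an assumption-laden illustration, not the repo's implementation: it uses the common half-split channel pairing and the standard base of 10000, and the function name `rotary_embedding` is hypothetical. Pairs of channels are rotated by an angle proportional to the token position, so dot products between rotated queries and keys depend only on their relative offset.

```python
import numpy as np

def rotary_embedding(x, base=10000):
    """Apply rotary position embeddings to x of shape (seq_len, dim).

    Channel i is paired with channel i + dim // 2, and each pair is
    rotated by position * theta_i, where theta_i falls geometrically
    with i (as in sinusoidal embeddings).
    """
    seq_len, dim = x.shape
    half = dim // 2
    # Per-pair rotation frequencies: theta_i = base^(-i / half)
    inv_freq = 1.0 / (base ** (np.arange(half) / half))
    # Rotation angle for every (position, pair) combination
    angles = np.outer(np.arange(seq_len), inv_freq)  # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    # 2D rotation applied independently to each channel pair
    return np.concatenate([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], axis=-1)
```

Position 0 is left unrotated, and the attention score between a query at position m and a key at position n depends only on n - m, which is the property that makes RoPE a relative scheme.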
NZ99/self-attention-experiments-vision
A project on replicating, evaluating, and scaling up self-attention-based vision models.