PyTorch implementation of Lambda Network and pretrained Lambda-ResNet

lambda.pytorch

[NEW!] Check out our latest work involution in CVPR'21 that bridges convolution and self-attention operators.


PyTorch implementation of LambdaNetworks: Modeling long-range Interactions without Attention.

Lambda Networks apply the associative law of matrix multiplication to reverse the computation order of self-attention, achieving linear computational complexity in the number of positions for content interactions.
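The reordering can be illustrated with a minimal sketch. It is not the layer implemented in this repository: the tensor sizes and variable names are illustrative assumptions, and it covers only the content interaction (no position lambdas, heads, or batching). It assumes, as in the LambdaNetworks paper, that the softmax is applied to the keys alone across positions, which is what makes the two multiplication orders equivalent.

```python
import torch

torch.manual_seed(0)

# Illustrative sizes (assumptions): n positions, key/query depth k, value depth v.
n, k, v = 1024, 16, 64
Q = torch.randn(n, k, dtype=torch.float64)
K = torch.randn(n, k, dtype=torch.float64).softmax(dim=0)  # keys normalized over positions
V = torch.randn(n, v, dtype=torch.float64)

# Attention-style order: build an n x n interaction map first.
# Cost grows quadratically with the number of positions n.
out_quadratic = (Q @ K.T) @ V

# Lambda-style order: contract keys with values into a k x v matrix first,
# then apply that matrix to every query. Cost grows linearly with n.
content_lambda = K.T @ V            # shape (k, v), independent of n
out_linear = Q @ content_lambda

print(content_lambda.shape)
print(torch.allclose(out_quadratic, out_linear))
```

Both orders give the same output up to floating-point error, but the second never materializes the n x n matrix.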

Similar techniques have been used previously in A2-Net and CGNL. Check out a collection of self-attention modules in another repository dot-product-attention.

Training Configuration

✓ SGD optimizer, initial learning rate 0.1, momentum 0.9, weight decay 0.0001

✓ 130 epochs, batch size 256, 8x Tesla V100 GPUs, cosine LR decay

✓ label smoothing 0.1
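The settings above could be expressed in PyTorch roughly as follows. This is a sketch under assumptions, not the training script used for the released checkpoint: the `nn.Linear` model is a hypothetical stand-in for Lambda-ResNet-50, the cosine schedule is assumed to step once per epoch, and the `label_smoothing` argument of `nn.CrossEntropyLoss` requires PyTorch 1.10 or later.

```python
import torch
from torch import nn

# Hypothetical stand-in model; the real network is Lambda-ResNet-50.
model = nn.Linear(8, 4)

EPOCHS = 130

# SGD with initial LR 0.1, momentum 0.9, weight decay 0.0001.
optimizer = torch.optim.SGD(
    model.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4
)
# Cosine LR decay over the full run (stepping per epoch is an assumption).
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=EPOCHS)
# Cross-entropy loss with label smoothing 0.1.
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)

lrs = []
for epoch in range(EPOCHS):
    lrs.append(optimizer.param_groups[0]["lr"])
    # ... one pass over the training set (batch size 256) would go here ...
    optimizer.step()   # no-op in this sketch because no gradients were computed
    scheduler.step()

final_lr = optimizer.param_groups[0]["lr"]
```

In this sketch the learning rate starts at 0.1, reaches about half of that at the midpoint of training, and decays to approximately zero by the last epoch.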

Pre-trained checkpoints

Architecture      Parameters  FLOPs   Top-1 / Top-5 Acc. (%)  Download
Lambda-ResNet-50  14.995M     6.576G  78.208 / 93.820         model | log

Citation

If you find this repository useful in your research, please cite:

@InProceedings{Li_2021_CVPR,
author = {Li, Duo and Hu, Jie and Wang, Changhu and Li, Xiangtai and She, Qi and Zhu, Lei and Zhang, Tong and Chen, Qifeng},
title = {Involution: Inverting the Inherence of Convolution for Visual Recognition},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2021}
}
@inproceedings{
bello2021lambdanetworks,
title={LambdaNetworks: Modeling long-range Interactions without Attention},
author={Irwan Bello},
booktitle={International Conference on Learning Representations},
year={2021},
}