Wavelet Convolutions for Large Receptive Fields [ECCV 2024]

Shahaf E. Finder, Roy Amoyal, Eran Treister, and Oren Freifeld

WTConv illustration

Requirements

Python 3.12
timm 1.0.7
PyWavelets 1.6.0

How to use

You can import WTConv and use it in your CNN

from wtconv import WTConv2d

conv_dw = WTConv2d(32, 32, kernel_size=5, wt_levels=3)

Or you can use WTConvNeXt through timm's model registry

import wtconvnext

model = create_model(
    "wtconvnext_tiny",
    pretrained=False,
    num_classes=1000
)

Results and Trained Models

ImageNet-1K

name	resolution	acc@1	#params	FLOPs	model
WTConvNeXt-T	224x224	82.5	30M	4.5G	model
WTConvNeXt-S	224x224	83.6	54M	8.8G	model
WTConvNeXt-B	224x224	84.1	93M	15.5G	model

Training and Validating WTConvNeXt

Training WTConvNeXt on ImageNet-1k

You can use this script, taken from the timm library, to train WTConvNeXt-T:

python train.py --model wtconvnext_tiny --drop-path 0.1 \
                --data-dir IMAGENET_PATH \
                --epochs 300 --warmup-epochs 20 \
                --batch-size 64 --grad-accum-steps 64 --sched-on-updates \
                --lr 4e-3 --weight-decay 5e-2 \
                --opt adamw --layer-decay 1.0 \
                --aa rand-m9-mstd0.5-inc1 \
                --reprob 0.25 --mixup 0.8 --cutmix 1.0 \
                --model-ema --model-ema-decay 0.9999 \
                --output checkpoints/wtconvnext_tiny_300/

You can use torchrun to distribute the training, just note that the effective batch size should be 4096 (gpus * batch-size * grad-accum-steps = 4096).
E.q., we've trained the network using a single machine with 4 GPUs, hence set batch-size to 64 and grad-accum-steps to 16.

torchrun --nproc-per-node=4  \
         python train.py --model wtconvnext_tiny --drop-path 0.1 \
                --data-dir IMAGENET_PATH \
                --epochs 300 --warmup-epochs 20 \
                --batch-size 64 --grad-accum-steps 16 --sched-on-updates \
                --lr 4e-3 --weight-decay 5e-2 \
                --opt adamw --layer-decay 1.0 \
                --aa rand-m9-mstd0.5-inc1 \
                --reprob 0.25 --mixup 0.8 --cutmix 1.0 \
                --model-ema --model-ema-decay 0.9999 \
                --output checkpoints/wtconvnext_tiny_300/

Other network sizes:

WTConvNeXt-S

Single GPU

python train.py --model wtconvnext_small --drop-path 0.4 \
                --data-dir IMAGENET_PATH \
                --epochs 300 --warmup-epochs 20 \
                --batch-size 64 --grad-accum-steps 64 --sched-on-updates \
                --lr 4e-3 --weight-decay 5e-2 \
                --opt adamw --layer-decay 1.0 \
                --aa rand-m9-mstd0.5-inc1 \
                --reprob 0.25 --mixup 0.8 --cutmix 1.0 \
                --model-ema --model-ema-decay 0.9999 \
                --output checkpoints/wtconvnext_tiny_300/

Multi-GPU

torchrun --nproc-per-node=4  \
         python train.py --model wtconvnext_small --drop-path 0.1 \
                --data-dir IMAGENET_PATH \
                --epochs 300 --warmup-epochs 20 \
                --batch-size 64 --grad-accum-steps 16 --sched-on-updates \
                --lr 4e-3 --weight-decay 5e-2 \
                --opt adamw --layer-decay 1.0 \
                --aa rand-m9-mstd0.5-inc1 \
                --reprob 0.25 --mixup 0.8 --cutmix 1.0 \
                --model-ema --model-ema-decay 0.9999 \
                --output checkpoints/wtconvnext_tiny_300/

WTConvNeXt-B

Single GPU

python train.py --model wtconvnext_base --drop-path 0.4 \
                --data-dir IMAGENET_PATH \
                --epochs 300 --warmup-epochs 20 \
                --batch-size 64 --grad-accum-steps 64 --sched-on-updates \
                --lr 4e-3 --weight-decay 5e-2 \
                --opt adamw --layer-decay 1.0 \
                --aa rand-m9-mstd0.5-inc1 \
                --reprob 0.25 --mixup 0.8 --cutmix 1.0 \
                --model-ema --model-ema-decay 0.9999 \
                --output checkpoints/wtconvnext_tiny_300/

Multi-GPU

torchrun --nproc-per-node=4  \
         python train.py --model wtconvnext_base --drop-path 0.5 \
                --data-dir IMAGENET_PATH \
                --epochs 300 --warmup-epochs 20 \
                --batch-size 64 --grad-accum-steps 16 --sched-on-updates \
                --lr 4e-3 --weight-decay 5e-2 \
                --opt adamw --layer-decay 1.0 \
                --aa rand-m9-mstd0.5-inc1 \
                --reprob 0.25 --mixup 0.8 --cutmix 1.0 \
                --model-ema --model-ema-decay 0.9999 \
                --output checkpoints/wtconvnext_tiny_300/

Evaluating WTConvNeXt on ImageNet-1k

You can use this script, taken from the timm library, to validate the results:

python validate.py --model wtconvnext_tiny \
                   --data-dir IMAGENET_PATH \
                   --checkpoint WTConvNeXt_tiny_5_300e_ema.pth

Acknowledgement

The code for WTConvNeXt, as well as the training and validating scripts, were adapted from the timm library.

License

This project is released under the MIT license. Please see the LICENSE file for more information.

Citation

If you find this repository helpful, please consider citing:

@inproceedings{finder2024wavelet,
  title     = {Wavelet Convolutions for Large Receptive Fields},
  author    = {Finder, Shahaf E and Amoyal, Roy and Treister, Eran and Freifeld, Oren},
  booktitle = {European Conference on Computer Vision},
  year      = {2024},
}

BGU-CS-VIL/WTConv