Official code for "Neural ePDOs: Spatially Adaptive Equivariant Partial Differential Operator Based Networks"

Neural ePDOs: Spatially Adaptive Equivariant Partial Differential Operator Based Networks

Abstract

Endowing deep learning models with symmetry priors can lead to considerable performance improvements. As an interesting bridge between physics and deep learning, equivariant partial differential operators (PDOs) have recently drawn much attention from researchers. However, to ensure that PDOs are translation equivariant, previous works require the coefficient matrices to be constant and spatially shared because of their linearity, which can lead to sub-optimal feature learning at each position. In this work, we propose a novel nonlinear PDO scheme that is both spatially adaptive and translation equivariant. The coefficient matrices are obtained from local features through a generator rather than being spatially shared. In addition, we establish a new theory for incorporating further equivariances, such as rotation equivariance, into such PDOs. Based on our theoretical results, we efficiently implement the generator with an equivariant multilayer perceptron (EMLP). Since these equivariant PDOs are generated by neural networks, we call them Neural ePDOs. In experiments, we show that our method significantly improves on previous works with a smaller model size on various datasets. In particular, we achieve state-of-the-art performance on the MNIST-rot dataset with only half the parameters of the previous best model.

Requirements

python=3.8.13
tqdm
pytorch=1.10.2
torchvision=0.11.3
cuda=11.3.1
numpy=1.22.3
rbf 

(For installation of rbf, see RBF)

Experiment Settings

MNIST-rot

  • Model: Regular, Quotient
  • Training batch size: 64
  • Weight decay: 1e-4
  • Learning rate adjustment
    1. 2e-3 for epoch [0, 60)
    2. 2e-4 for epoch [60, 120)
    3. 1e-4 for epoch [120, 150)
    4. 5e-5 for epoch [150, 180)
    5. 2.5e-5 for epoch [180, 200]
  • Reduction: 1 (for regular model), 2 (for quotient model)
  • g ($g=\frac{z}{q}$, $z$: number of input fields, $q$: partition number): 4
  • Dropout rate: 0.1
  • s (number of groups, should be exactly divisible by g): 4 for the $C_{16}$ regular model, 1 for the quotient and $D_{16|5}C_{16}$ models
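The learning rate adjustment listed above is a piecewise-constant schedule. As a minimal sketch, it can be written as a lookup over stage boundaries. The names `LR_STAGES` and `lr_at_epoch` are hypothetical and are not taken from `train_mnist.py`, whose actual implementation may differ.

```python
# Piecewise-constant schedule transcribed from the settings list above.
# Each entry is (first epoch of the stage, learning rate for that stage).
# LR_STAGES and lr_at_epoch are illustrative names, not part of the repository.
LR_STAGES = [
    (0, 2e-3),
    (60, 2e-4),
    (120, 1e-4),
    (150, 5e-5),
    (180, 2.5e-5),
]


def lr_at_epoch(epoch: int) -> float:
    """Return the learning rate for a 0-indexed epoch."""
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    lr = LR_STAGES[0][1]
    for start, value in LR_STAGES:
        # Stages are sorted, so the last stage whose start is reached wins.
        if epoch >= start:
            lr = value
    return lr


if __name__ == "__main__":
    for epoch in (0, 59, 60, 119, 120, 150, 180, 200):
        print(epoch, lr_at_epoch(epoch))
```

In a PyTorch training loop, one possible way to apply such a function is to assign its value to each entry of `optimizer.param_groups` at the start of every epoch. This is an assumption about usage, not a description of the official script.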

1. MNIST-rot

  • Neural ePDOs (Regular, FD)
CUDA_VISIBLE_DEVICES=0 python train_mnist.py --model R --dis fd --g 4 --s 4 --reduction 1
  • Neural ePDOs (Regular, Gauss)
CUDA_VISIBLE_DEVICES=0 python train_mnist.py --model R --dis gauss --g 4 --s 4 --reduction 1
  • Neural ePDOs (Quotient, Gauss)
CUDA_VISIBLE_DEVICES=0 python train_mnist.py --model Q --dis gauss --g 4 --s 1 --reduction 2
  • Neural ePDOs (Regular $D_{16|5}C_{16}$, Gauss)
CUDA_VISIBLE_DEVICES=0 python train_mnist.py --model R --dis gauss --flip True --g 4 --s 1 --reduction 1
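The four commands differ only in their flag values, so they can be kept in one table and assembled programmatically. The sketch below is a hypothetical helper, not part of the repository. It uses only the flags that appear in the commands above, and the variant names (`regular_fd` and so on) are made up for illustration.

```python
import shlex

# Flag values copied from the four commands above.
# The dictionary keys are illustrative labels, not names used by the repository.
VARIANTS = {
    "regular_fd": {"model": "R", "dis": "fd", "g": 4, "s": 4, "reduction": 1},
    "regular_gauss": {"model": "R", "dis": "gauss", "g": 4, "s": 4, "reduction": 1},
    "quotient_gauss": {"model": "Q", "dis": "gauss", "g": 4, "s": 1, "reduction": 2},
    "regular_flip_gauss": {
        "model": "R", "dis": "gauss", "flip": "True", "g": 4, "s": 1, "reduction": 1,
    },
}


def build_command(name: str, script: str = "train_mnist.py") -> list:
    """Return the argument list for one variant, without the CUDA prefix."""
    args = ["python", script]
    for flag, value in VARIANTS[name].items():
        args += [f"--{flag}", str(value)]
    return args


if __name__ == "__main__":
    # Print each command; GPU selection is shown as an environment prefix.
    for name in VARIANTS:
        print("CUDA_VISIBLE_DEVICES=0", shlex.join(build_command(name)))
```

To launch a variant rather than print it, the argument list could be passed to `subprocess.run` together with an `env` mapping that sets `CUDA_VISIBLE_DEVICES`.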

Citations

@inproceedings{
he2023neural,
title={Neural e{PDO}s: Spatially Adaptive Equivariant Partial Differential Operator Based Networks},
author={Lingshen He and Yuxuan Chen and Zhengyang Shen and Yibo Yang and Zhouchen Lin},
booktitle={The Eleventh International Conference on Learning Representations},
year={2023},
url={https://openreview.net/forum?id=D1Iqfm7WTkk}
}