The official code repo for the paper "Alternating Differentiation for Optimization Layers" (ICLR 2023).
Alt-Diff is an algorithm that decouples the differentiation of optimization layers in a fast and recursive way. Assuming the optimization layer is a parameterized convex problem with equality and inequality constraints, Alt-Diff differentiates through its KKT conditions by alternately updating the gradients of the primal, dual, and slack variables, in the spirit of ADMM.
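For intuition, here is a minimal NumPy sketch (ours, not the repo's code) of what such an alternating scheme looks like for a quadratic layer min_x 0.5*x'Px + q'x subject to Ax = b and Gx <= h: the primal, slack, and dual variables are updated in an ADMM-style loop, and the Jacobian of the solution with respect to a parameter (here the linear term q) is obtained by differentiating each update in turn. The exact update rules used by Alt-Diff are given in the paper; this sketch only illustrates the overall structure.

```python
import numpy as np

def alt_diff_qp(P, q, A, b, G, h, rho=1.0, n_iter=500):
    """Illustrative alternating scheme for the QP layer
        min_x 0.5 * x'Px + q'x   s.t.   Ax = b, Gx <= h.
    Returns the primal solution x and the Jacobian dx/dq.
    Sketch only -- the repo's implementation may differ."""
    n, m_eq, m_in = P.shape[0], A.shape[0], G.shape[0]
    x, s = np.zeros(n), np.zeros(m_in)          # primal and slack variables
    lam, nu = np.zeros(m_eq), np.zeros(m_in)    # dual variables
    # Jacobians of each variable block with respect to q
    dx, ds = np.zeros((n, n)), np.zeros((m_in, n))
    dlam, dnu = np.zeros((m_eq, n)), np.zeros((m_in, n))

    K_inv = np.linalg.inv(P + rho * (A.T @ A + G.T @ G))  # fixed x-update matrix

    for _ in range(n_iter):
        # primal update (minimizes the augmented Lagrangian over x) and its Jacobian
        x = K_inv @ (-q - A.T @ lam - G.T @ nu + rho * (A.T @ b + G.T @ (h - s)))
        dx = K_inv @ (-np.eye(n) - A.T @ dlam - G.T @ dnu - rho * G.T @ ds)

        # slack update (projection onto the nonnegative orthant) and its Jacobian
        pre = h - G @ x - nu / rho
        s = np.maximum(0.0, pre)
        ds = (pre > 0).astype(float)[:, None] * (-G @ dx - dnu / rho)

        # dual updates and their Jacobians
        lam = lam + rho * (A @ x - b)
        dlam = dlam + rho * (A @ dx)
        nu = nu + rho * (G @ x + s - h)
        dnu = dnu + rho * (G @ dx + ds)

    return x, dx
```

Calling x, dxdq = alt_diff_qp(P, q, A, b, G, h) then gives both the layer's output and the Jacobian needed for backpropagation through the layer.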
We have uploaded numerical experiments and image classification tests on MNIST and CIFAR-10 to demonstrate the efficiency of Alt-Diff.
The numerical_experiment directory contains randomly generated problem parameters that can be adjusted to test different problem dimensions. These experiments compare the runtime and results of different algorithms.
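For reference, a random strictly convex QP instance can be generated along the following lines; this is a generic sketch with hypothetical sizes, not the repo's actual generator:

```python
import numpy as np

def random_qp(n=100, m_eq=20, m_in=50, seed=0):
    """Generate a random, strictly convex, feasible QP (sketch; sizes are hypothetical)."""
    rng = np.random.default_rng(seed)
    L = rng.standard_normal((n, n))
    P = L @ L.T + 0.1 * np.eye(n)        # symmetric positive definite objective
    q = rng.standard_normal(n)
    A = rng.standard_normal((m_eq, n))
    G = rng.standard_normal((m_in, n))
    x0 = rng.standard_normal(n)          # a point used to make the constraints feasible
    b = A @ x0
    h = G @ x0 + rng.random(m_in) + 0.1  # x0 strictly satisfies Gx <= h
    return P, q, A, b, G, h
```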
In the classification directory, you can find examples demonstrating how to integrate Alt-Diff into neural network training.
For training the optimization layer in image classification tasks, execute the following command:
cd classification
python train.py --dataset cifar-10 [MODEL_NAME]
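To illustrate how such a layer can sit inside a network, the sketch below wraps a QP solve in a torch.autograd.Function whose backward pass reuses the Jacobian from the alternating scheme above. All class, module, and file names here are illustrative and not the repo's actual API; see the classification directory for the real implementation.

```python
import torch

# alt_diff_qp is the NumPy sketch shown earlier in this README.
from altdiff_sketch import alt_diff_qp  # hypothetical module name

class QPLayerFn(torch.autograd.Function):
    """Differentiable QP layer (illustrative): x*(q) = argmin 0.5 x'Px + q'x s.t. Ax = b, Gx <= h.
    Only the linear term q is treated as learnable, to keep the sketch short."""

    @staticmethod
    def forward(ctx, q, P, A, b, G, h):
        x_star, dxdq = alt_diff_qp(P, q.detach().cpu().numpy(), A, b, G, h)
        ctx.dxdq = torch.as_tensor(dxdq, dtype=q.dtype)
        return torch.as_tensor(x_star, dtype=q.dtype)

    @staticmethod
    def backward(ctx, grad_out):
        # Chain rule: dL/dq = (dx*/dq)^T dL/dx*; the fixed problem data gets no gradient.
        return ctx.dxdq.T @ grad_out, None, None, None, None, None

class TinyClassifier(torch.nn.Module):
    """Feature extractor -> optimization layer -> linear head (illustrative architecture)."""

    def __init__(self, in_dim, qp_dim, n_classes, P, A, b, G, h):
        super().__init__()
        self.encoder = torch.nn.Linear(in_dim, qp_dim)
        self.head = torch.nn.Linear(qp_dim, n_classes)
        self.qp_data = (P, A, b, G, h)   # fixed problem data in this sketch

    def forward(self, feats):
        P, A, b, G, h = self.qp_data
        qs = self.encoder(feats)         # each sample parameterizes the QP's linear term
        xs = torch.stack([QPLayerFn.apply(q, P, A, b, G, h) for q in qs])
        return self.head(xs)
```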
We strongly recommend running the comparisons with existing methods in a CPU environment: qpth benefits from GPU acceleration, which would otherwise give it an unfair advantage.
2023/3/16: Updated the numerical experiments with the optimization parameters.
2024/1/8: Updated the classification tests to show how the optimization layer is combined with neural networks.
Please note that we will continue to add more detailed code examples to the repository and will add GPU acceleration to Alt-Diff in the future.
If you find this paper useful in your research, please consider citing:
@inproceedings{
sun2023alternating,
title={Alternating Differentiation for Optimization Layers},
author={Haixiang Sun and Ye Shi and Jingya Wang and Hoang Duong Tuan and H. Vincent Poor and Dacheng Tao},
booktitle={The Eleventh International Conference on Learning Representations},
year={2023},
url={https://openreview.net/forum?id=KKBMz-EL4tD}
}