This repository is the official implementation of Integration of Neural Network-Based Symbolic Regression in Deep Learning for Scientific Discovery
Symbolic regression, in which a model discovers an analytical equation describing a dataset as opposed to finding the weights of pre-defined features, is normally implemented using genetic programming. Here, we demonstrate symbolic regression using neural networks, which allows symbolic regression to be performed using gradient-based optimization techniques, i.e., backpropagation. It can be integrated with other deep learning architectures, allowing for end-to-end training of a system that produces interpretable and generalizable results. This repository implements symbolic regression with neural networks and demonstrates its integration with deep learning for multiple tasks, including arithmetic on MNIST digits and extracting the equations of kinematic and differential equation datasets.
- Python 3.5
- TensorFlow 1.13 (SymPy pretty_print does not work 1.14)
- NumPy 1.16 (does not work on 1.17)
- Scipy 1.3
- Matplotlib
All dependencies are in requirements.txt. To install required packages, you can simply run the following code in your shell.
pip install -r requirements.txt
Each task are trained independently. Refer to the paper https://arxiv.org/abs/1912.04825 for a description of each of the tasks.
benchmark_accuracy.py
/benchmark_accuracy_l0.py
:
Run EQL benchmarks on various functions using smoothed L0.5 and relaxed L0 regularization, respectively.
mnist_math.py
: Learn arithmetic operations on MNIST digits.
mnist_math_relu.py
: Same as mnist_math.py
,
but using a conventional neural network with ReLU activation functions instead of the EQL network.
kinematics_data.py
: Generate data for the kinematics task. This must be run before training the model.
kinematics_sr.py
/kinematics_sr_l0.py
: Dynamics encoder combined with a recurrent EQL network for the kinematics task
using smoothed L0.5 and relaxed L0 regularization, respectively.
kinematics_sr.py
implements an unrolled RNN to demonstrate the internal architecture, while
kinematics_sr_l0.py
implements the RNN using the built-in TensorFlow libraries.
kinematics_relu.py
: Same as kinematics_sr.py
but using a conventional neural network with ReLU activation functions instead of the EQL network.
sho_data.py
: Generatedata for the SHO task. This must be run before training the model.
sho_sr.py
/sho_sr_l0.py
: Dynamics encoder combined with a recurrent EQL network for the kinematics task
using smoothed L0.5 and relaxed L0 regularization, respectively.
Both implement the RNN using the built-in TensorFlow libraries.
sho_relu.py
: Same as sho_sr.py
but using a conventional neural network with ReLU activation functions instead of the EQL network.
Samuel Kim, Peter Lu, Srijon Mukherjee, Michael Gilbert, Li Jing, Vladimir Ceperic, Marin Soljacic
If you'd like to contribute, or have any suggestions for these guidelines, you can contact Samuel Kim at samkim (at) mit (dot) edu or open an issue on this GitHub repository.
All content in this repository is licensed under the MIT license.