Deep symbolic optimization (DSO) is a deep learning framework for symbolic optimization tasks. The package dso includes the core symbolic optimization algorithms, as well as support for two particular symbolic optimization tasks: (1) symbolic regression (recovering tractable mathematical expressions from an input dataset) and (2) discovering symbolic policies for reinforcement learning environments. In the code, these tasks are referred to as regression and control, respectively. We also include a simple interface for defining new tasks.
This repository contains code supporting the following publications:
- Petersen et al. 2021. Deep symbolic regression: Recovering mathematical expressions from data via risk-seeking policy gradients. ICLR 2021 (Oral).
- Landajuela et al. 2021. Discovering symbolic policies with deep reinforcement learning. ICML 2021.
- Mundhenk et al. 2021. Symbolic Regression via Neural-Guided Genetic Programming Population Seeding. NeurIPS 2021.
- Landajuela et al. 2021. Improving exploration in policy gradient search: Application to symbolic optimization. Math-AI @ ICLR 2021.
- Petersen et al. 2021. Incorporating domain knowledge into neural-guided search via in situ priors and constraints. AutoML @ ICML 2021.
- Kim et al. 2021. Distilling Wikipedia mathematical knowledge into neural network models. Math-AI @ ICLR 2021.
- Kim et al. 2020. An interactive visualization platform for deep symbolic regression. IJCAI 2020.
The core package has been tested on Python 3.6+ on Unix and OSX. To install the core package (and the default regression task), we highly recommend first creating a Python 3 virtual environment, e.g.
python3 -m venv venv3 # Create a Python 3 virtual environment
source venv3/bin/activate # Activate the virtual environment
Then, from the repository root:
pip install --upgrade setuptools pip
export CFLAGS="-I $(python -c "import numpy; print(numpy.get_include())") $CFLAGS" # Needed on Mac to prevent fatal error: 'numpy/arrayobject.h' file not found
pip install -e ./dso # Install DSO package and core dependencies
The regression task is installed by default. It doesn't require any of the installation options below.
There are a few additional dependencies to run the control task. Install them using:
pip install -e ./dso[control]
To install all dependencies for all tasks, use the all option:
pip install -e ./dso[all]
DSO relies on configuring runs via a JSON file, then launching them from the command line or with a few lines of Python.
After creating your config file, simply run:
python -m dso.run path/to/config.json
After training, results are saved to a timestamped directory in the path given in the "logdir" parameter (default ./log).
The Python interface lets users instantiate and customize DSO models via Python scripts, an interactive Python shell, or an IPython notebook. The core DSO model is dso.core.DeepSymbolicOptimizer. After creating your config file, you can use:
from dso import DeepSymbolicOptimizer
# Create and train the model
model = DeepSymbolicOptimizer("path/to/config.json")
model.train()
After training, results are saved to a timestamped directory in the path given in config["experiment"]["logdir"] (default ./log).
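The config file can also be generated from Python before it is handed to the optimizer. The following is a minimal sketch that uses only the "task" keys documented in this README; the helper name write_regression_config and the temporary file location are hypothetical, and the DSO calls are left as comments so the snippet runs without the package installed.

```python
import json
import tempfile
from pathlib import Path


def write_regression_config(path, dataset, function_set):
    """Write a minimal regression config and return its path as a string."""
    config = {
        "task": {
            "task_type": "regression",
            "dataset": str(dataset),
            "function_set": list(function_set),
        }
    }
    path = Path(path)
    path.write_text(json.dumps(config, indent=2))
    return str(path)


# Hypothetical location for the generated config file.
tmp_dir = Path(tempfile.mkdtemp())
config_path = write_regression_config(
    tmp_dir / "config.json",
    dataset="path/to/my_dataset.csv",
    function_set=["add", "sub", "mul", "div", "sin", "cos"],
)

# With DSO installed, the generated file can be passed to the model:
# from dso import DeepSymbolicOptimizer
# model = DeepSymbolicOptimizer(config_path)
# model.train()
```

Any key left out of the generated file falls back to the defaults shipped with the package.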
A single JSON file is used to configure each run. This file specifies the symbolic optimization task and all hyperparameters.
Each configuration JSON file has a number of top-level keys that control various parts of the DSO framework. The important top-level keys are:
- "experiment" configures the experiment, namely the log directory and random number seed.
- "task" configures the task, e.g. the dataset for symbolic regression, or the Gym environment for the control task. See below for task-specific configuration.
- "training" configures training hyperparameters like "n_samples" (the total number of samples to generate) and "epsilon" (the risk factor used by the risk-seeking policy gradient).
- "controller" configures RNN hyperparameters like "learning_rate" and "num_layers".
- "prior" configures the priors and constraints on the search space.
Any parameters not included in your config file assume default values found in config/config_common.json, config/config_regression.json (for regression runs), and config/config_control.json (for control runs).
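One way to picture how a sparse user config combines with the default files is a recursive dictionary merge. The sketch below is an illustration only, not DSO's actual loading code; the function name merge_with_defaults and the default values shown are hypothetical.

```python
import copy


def merge_with_defaults(defaults, overrides):
    """Return a new dict in which `overrides` recursively replaces `defaults`."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are nested sections: merge them key by key.
            merged[key] = merge_with_defaults(merged[key], value)
        else:
            # Scalars, lists, and new keys simply replace the default.
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical default values, for illustration only.
defaults = {
    "experiment": {"logdir": "./log", "seed": 0},
    "training": {"n_samples": 1000, "epsilon": 0.1},
}
user_config = {"training": {"epsilon": 0.05}}

dense = merge_with_defaults(defaults, user_config)
print(dense["training"])
```

Only "epsilon" changes here; every other parameter keeps its default value.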
Here is a simple example JSON file for the regression task:
{
"task" : {
"task_type" : "regression",
"dataset" : "path/to/my_dataset.csv",
"function_set" : ["add", "sub", "mul", "div", "sin", "cos", "exp", "log"]
}
}
This configures DSO to learn symbolic expressions to fit your custom dataset, using the tokens specified in function_set (see dso/functions.py for a list of supported tokens).
If you want to include optimized floating-point constants in the search space, simply add "const" to the function_set list. Note that constant optimization uses an inner-optimization loop, which leads to much longer runtimes (~hours instead of ~minutes).
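A custom dataset file can be produced with the standard library. This sketch assumes the CSV has no header row and stores the target y in the last column, with the input variables in the preceding columns; that layout is an assumption that should be checked against the regression task code. The helper name write_dataset and the target function are hypothetical.

```python
import csv
import math
import random
import tempfile
from pathlib import Path


def write_dataset(path, n_rows=20, seed=0):
    """Write rows of [x1, x2, y] with y = sin(x1) + x2**2 and return them."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_rows):
        x1, x2 = rng.random(), rng.random()
        y = math.sin(x1) + x2 ** 2
        rows.append([x1, x2, y])
    with open(path, "w", newline="") as f:
        # Assumed layout: no header, inputs first, target in the last column.
        csv.writer(f).writerows(rows)
    return rows


dataset_path = Path(tempfile.mkdtemp()) / "my_dataset.csv"
rows = write_dataset(dataset_path)
```

The resulting path is what would go into the "dataset" entry of the config.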
You can test symbolic regression out of the box with a default configuration, after running setup, with a command such as:
python -m dso.run dso/config/config_regression.json --b Nguyen-7
This will run DSO on the regression task with benchmark Nguyen-7.
Here's a simple example for the control task:
{
"task" : {
"task_type" : "control",
"env" : "MountainCarContinuous-v0",
"function_set" : ["add", "sub", "mul", "div", "sin", "cos", "exp", "log", 1.0, 5.0, 10.0]
}
}
This configures DSO to learn a symbolic policy for MountainCarContinuous-v0, using the tokens specified in function_set (see dso/functions.py for a list of supported tokens).
For environments with multi-dimensional action spaces, DSO requires a pre-trained "anchor" policy. DSO is run once per action dimension, and the "action_spec" parameter is updated each run. For an environment with N action dimensions, "action_spec" is a list of length N. Exactly one element should be null, meaning that is the symbolic action to be learned. Any number of elements can be "anchor", meaning the anchor policy will determine those actions. Any number of elements can be expression traversals (e.g. ["add", "x1", "x2"]), meaning that fixed symbolic policy will determine those actions.
Here's an example workflow for HopperBulletEnv-v0, which has three action dimensions. First, learn a symbolic policy for the first action by running DSO with a config like:
{
"task" : {
"task_type" : "control",
"env" : "HopperBulletEnv-v0",
"function_set" : ["add", "sub", "mul", "div", "sin", "cos", "exp", "log", 1.0, 5.0, 10.0],
"action_spec" : [null, "anchor", "anchor"],
"anchor" : "path/to/anchor.pkl"
}
}
where "path/to/anchor.pkl" is a path to a stable_baselines model. (The environments used in the ICML paper have default values for anchor, so you do not have to specify one.) After running, let's say the best expression has traversal ["add", "x1", "x2"]. To launch the second round of DSO, update the config's action_spec to use the fixed symbolic policy for the first action, learn a symbolic policy for the second action, and use the anchor again for the third action:
"action_spec" : [["add", "x1", "x2"], null, "anchor"]
After running DSO, say the second action's traversal is ["div", "x3", "x4"]. Finally, update the action_spec to:
"action_spec" : [["add", "x1", "x2"], ["div", "x3", "x4"], null]
and rerun DSO. The final result is a fully symbolic policy.
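The round-by-round update of action_spec described above can be sketched as a small helper. This is only an illustration of the bookkeeping, not part of DSO; the function name advance_action_spec is hypothetical, and Python's None stands in for JSON null.

```python
import json


def advance_action_spec(action_spec, learned_traversal):
    """Fix the action that was just learned and select the next one to learn."""
    spec = list(action_spec)
    current = spec.index(None)  # position of the action learned in this round
    spec[current] = list(learned_traversal)
    if "anchor" in spec:
        # The first remaining anchored action becomes the next one to learn.
        spec[spec.index("anchor")] = None
    return spec


round1 = [None, "anchor", "anchor"]
round2 = advance_action_spec(round1, ["add", "x1", "x2"])
round3 = advance_action_spec(round2, ["div", "x3", "x4"])

# json.dumps renders None as null, matching the config file syntax.
print(json.dumps(round2))
print(json.dumps(round3))
```

Once no "anchor" entries remain and the last null has been replaced, every action is determined by a symbolic expression.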
DSO can also be configured to learn a decision tree policy.
This is done by specifying decision_tree_threshold_set in "task", which is a set of thresholds on the values of state variables when making a decision.
In particular, for each threshold tj in decision_tree_threshold_set, StateChecker tokens xi < tj for all state variables xi will be added to the Library.
For example, for MountainCarContinuous-v0, here is an example config:
{
"task" : {
"task_type" : "control",
"env" : "MountainCarContinuous-v0",
"function_set" : ["add", "sub", "mul", "div", "sin", "cos", "exp", "log", 1.0, 5.0, 10.0],
"decision_tree_threshold_set" : [-0.05, 0.0, 0.01]
}
}
In addition to the functions specified in function_set, this will also add x1 < -0.05, x1 < 0.0, x1 < 0.01, x2 < -0.05, x2 < 0.0, and x2 < 0.01 to the Library because MountainCarContinuous-v0 has two state variables.
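The set of added tokens is the cross product of state variables and thresholds. The sketch below only renders the token names as strings to make that enumeration concrete; it does not construct DSO's actual StateChecker objects, and the function name state_checker_names is hypothetical.

```python
def state_checker_names(n_state_vars, thresholds):
    """List one 'xi < tj' name per (state variable, threshold) pair."""
    return [
        f"x{i} < {t}"
        for i in range(1, n_state_vars + 1)
        for t in thresholds
    ]


# MountainCarContinuous-v0 has two state variables.
names = state_checker_names(2, [-0.05, 0.0, 0.01])
for name in names:
    print(name)
```

With N state variables and K thresholds, N * K such tokens are added to the Library.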
With these StateChecker tokens, decision tree policies like "if x1 < -0.05 and x2 < 0.0, the action is exp(x1) + 1.0; otherwise, the action is sin(10 * x2)" can be sampled.
To use the neural-guided genetic programming population seeding method from the NeurIPS 2021 paper, add a "gp_meld" field to your config. The most important setting is "run_gp_meld", which must be true; the other parameters can be tuned as needed.
{
"gp_meld" : {
"run_gp_meld" : true,
"verbose" : false,
"generations" : 20,
"p_crossover" : 0.5,
"p_mutate" : 0.5,
"tournament_size" : 5,
"train_n" : 50,
"mutate_tree_max" : 3,
"parallel_eval" : true
}
}
The regression task supports an additional sklearn-like regressor interface to make it easy to try out deep symbolic regression on your own data:
import numpy as np
from dso import DeepSymbolicRegressor
# Generate some data
np.random.seed(0)
X = np.random.random((10, 2))
y = np.sin(X[:,0]) + X[:,1] ** 2
# Create the model
model = DeepSymbolicRegressor() # Alternatively, you can pass in your own config JSON path
# Fit the model
model.fit(X, y) # Should solve in ~10 seconds
# View the best expression
print(model.program_.pretty())
# Make predictions
model.predict(2 * X)
Each run of DSO saves a timestamped log directory in config["experiment"]["logdir"]. Inside this directory are:
- dso_ExperimentName_0.csv: This file contains batch-wise summary statistics for each epoch. The suffix _0 means the random number seed was 0. (See "Advanced usage" for batch runs with multiple seeds.)
- dso_ExperimentName_0_summary.csv: This file contains summary statistics for the entire training run.
- dso_ExperimentName_0_hof.csv: This file contains statistics of the "hall of fame" (best sequences discovered during training). Edit config["training"]["hof"] to set the number of hall-of-famers to record.
- dso_ExperimentName_0_pf.csv: This file contains statistics of the Pareto front of sequences discovered during training. This is a reward-complexity front.
- config.json: This is a "dense" version of the configuration used for your run. It explicitly includes all parameters.
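Locating the files of the most recent run can be scripted with the standard library. The sketch below assumes the log directory contains one subdirectory per run and relies only on the file name patterns listed above; the helper names, the demo directory, and the CSV column names ("r", "expression") are placeholders for illustration, not the real output schema.

```python
import csv
import tempfile
from pathlib import Path


def latest_run_dir(logdir):
    """Return the most recently modified run directory inside `logdir`."""
    run_dirs = [p for p in Path(logdir).iterdir() if p.is_dir()]
    if not run_dirs:
        raise FileNotFoundError(f"no run directories found in {logdir}")
    return max(run_dirs, key=lambda p: p.stat().st_mtime)


def read_rows(csv_path):
    """Read a CSV file with a header row into a list of dicts."""
    with open(csv_path, newline="") as f:
        return list(csv.DictReader(f))


# Build a stand-in log directory so the example runs without DSO output.
demo_logdir = Path(tempfile.mkdtemp())
demo_run = demo_logdir / "demo_run"
demo_run.mkdir()
(demo_run / "dso_demo_0_hof.csv").write_text("r,expression\n0.9,x1\n")

found = latest_run_dir(demo_logdir)
hof_files = sorted(found.glob("dso_*_hof.csv"))
hof_rows = read_rows(hof_files[0])
```

For a real run, pass the configured log directory instead of demo_logdir and inspect the header of each CSV to see which columns are available.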
DSO's command-line interface supports a multiprocessing-parallelized batch mode to run multiple tasks in parallel. This is recommended for large runs. Batch-mode DSO is launched with:
python -m dso.run path/to/config.json [--runs] [--n_cores_task] [--b] [--seed]
The option --runs (default 1) defines how many independent tasks (with different random number seeds) to perform. The regression task is computationally expedient enough to run multiple tasks in parallel. For the control task, we recommend running with the default --runs=1.
The option --n_cores_task (default 1) defines how many parallel processes to use across the --runs tasks. Each task is assigned a single core, so --n_cores_task should be less than or equal to --runs. (To use multiple cores within a single task, i.e. to parallelize reward computation, see the n_cores_batch configuration parameter.)
The option --seed, if provided, will override the parameter "seed" in your config.
By default, DSO will use the task specification found in the configuration JSON. The option --b (default None) is used to specify the named task(s) via the command line. For example, --b=path/to/mydata.csv runs DSO on the given dataset (regression task), and --b=MountainCarContinuous-v0 runs the environment MountainCarContinuous-v0 (control task). This is useful for running benchmark problems.
For example, to train 100 independent runs of DSR on the Nguyen-1 benchmark using 12 cores, using seeds 500 through 599:
python -m dso.run --b=Nguyen-1 --runs=100 --n_cores_task=12 --seed=500
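The relationship between --runs, --seed, and --n_cores_task in this example can be made concrete with a short sketch. It assumes, based on the "seeds 500 through 599" description, that run i uses seed + i; capping the worker count at the number of runs is an illustration of the "one core per task" rule rather than DSO's own logic. The function name plan_batch is hypothetical.

```python
def plan_batch(runs, seed, n_cores_task):
    """Return the per-run seeds and the number of useful worker processes."""
    seeds = [seed + i for i in range(runs)]
    # Each task occupies a single core, so extra workers would sit idle.
    workers = min(n_cores_task, runs)
    return seeds, workers


seeds, workers = plan_batch(runs=100, seed=500, n_cores_task=12)
print(seeds[0], seeds[-1], workers)
```

Here the 100 tasks are distributed over 12 worker processes, each task running on one core with its own seed.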
In your config file, you can set "multiobject" : true in the task section when you are running the control task. This ignores the action_spec in the config file entirely and instead attempts to learn all of the action dimensions at the same time. We are still improving this feature, so it currently has the following limitations:
- It is "all-or-nothing", meaning you cannot mix and match your action_spec to learn a subset of the actions together and use the anchor or symbolic actions for the rest. Specifically, you can't use "action_spec": [null, "anchor", null, "anchor"] and learn just the two null actions simultaneously. Instead, the entire action_spec is ignored and all 4 actions are learned. We may add this capability in the future.
- One controller sequentially samples all tokens for all actions. This means that one RNN is sampling each action in order. This is likely not ideal, given that RNNs suffer from catastrophic forgetting when sequences grow long enough. In the future, we will work on solutions to this problem.
- When you are training in a single-action environment and set "multiobject" : true, multiobject is ignored and training progresses as normal.
DSO supports adding custom tasks and priors from your own modules.
To add new tasks, the task_type keyword in the config file can be used in the following format: <module>.<source>:<function>, specifying the source implementing a make_task function.
For example:
{
"task" : {
"task_type" : "custom_mod.my_source:make_task"
}
}
Similarly, new priors can be added by specifying the source where the Prior class can be found in the prior group of the config file.
For example:
"prior": {
"uniform_arity" : {
"on" : true
},
"custom_mod.my_source:CustomPrior" : {
"loc" : 10,
"scale" : 5,
"on" : true
}
}
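Both the custom task and the custom prior are referenced with a "module.path:name" string. A generic way to resolve such a string in Python is shown below; this is a sketch of the general technique using importlib, not necessarily how DSO resolves it internally, and the function name resolve_entry_point is hypothetical. A standard library target is used so the example runs without any custom module.

```python
import importlib


def resolve_entry_point(spec):
    """Import 'module.path:name' and return the named attribute."""
    module_path, _, attr_name = spec.partition(":")
    if not module_path or not attr_name:
        raise ValueError(f"expected '<module>:<name>', got {spec!r}")
    module = importlib.import_module(module_path)
    return getattr(module, attr_name)


# Stand-in target; a real config would use e.g. "custom_mod.my_source:make_task".
dumps_fn = resolve_entry_point("json:dumps")
print(dumps_fn({"a": 1}))
```

For this to work with your own code, the module (custom_mod.my_source in the examples above) must be importable from the environment in which DSO is launched.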
To cite this work, please cite according to the most relevant tasks and/or methods.
To cite the regression task, use:
@inproceedings{petersen2021deep,
title={Deep symbolic regression: Recovering mathematical expressions from data via risk-seeking policy gradients},
author={Petersen, Brenden K and Landajuela, Mikel and Mundhenk, T Nathan and Santiago, Claudio P and Kim, Soo K and Kim, Joanne T},
booktitle={Proc. of the International Conference on Learning Representations},
year={2021}
}
To cite the control task, use:
@inproceedings{landajuela2021discovering,
title={Discovering symbolic policies with deep reinforcement learning},
author={Landajuela, Mikel and Petersen, Brenden K and Kim, Sookyung and Santiago, Claudio P and Glatt, Ruben and Mundhenk, Nathan and Pettit, Jacob F and Faissol, Daniel},
booktitle={International Conference on Machine Learning},
pages={5979--5989},
year={2021},
organization={PMLR}
}
To cite the neural-guided genetic programming population seeding method, use:
@inproceedings{mundhenk2021seeding,
title={Symbolic Regression via Neural-Guided Genetic Programming Population Seeding},
author={T. Nathan Mundhenk and Mikel Landajuela and Ruben Glatt and Claudio P. Santiago and Daniel M. Faissol and Brenden K. Petersen},
booktitle={35th Conference on Neural Information Processing Systems (NeurIPS 2021)},
year={2021}
}
LLNL-CODE-647188