Primary language: Julia. License: Apache License 2.0 (Apache-2.0).

SymbolicRegression.jl

Distributed high-performance symbolic regression in Julia.

Check out PySR for a Python frontend.

Cite this software

Quickstart

Install in Julia with:

using Pkg
Pkg.add("SymbolicRegression")

The heart of this package is the EquationSearch function, which takes a 2D array (shape [features, rows]) and attempts to model a 1D array (shape [rows]) using analytic functional forms.
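The [features, rows] layout is easy to get backwards, since tabular data is often stored as [rows, features]. The following is a minimal sketch of preparing inputs in the expected shape using only base Julia. The helper `check_shapes` and the variable names are hypothetical and are not part of the package.

```julia
# Hypothetical sizes: 100 observations (rows) of 5 features each.
n_features, n_rows = 5, 100

# Hypothetical helper (not part of SymbolicRegression.jl): confirm that X has
# one column per observation and y has one entry per observation.
function check_shapes(X::AbstractMatrix, y::AbstractVector)
    size(X, 2) == length(y) ||
        error("X has $(size(X, 2)) columns but y has $(length(y)) entries")
    return true
end

# Data that starts out in tabular layout, shape [rows, features].
X_tabular = randn(Float32, n_rows, n_features)

# Swap the axes so each column is one observation: shape [features, rows].
X = permutedims(X_tabular)

# One target value per observation: shape [rows].
y = 2 .* cos.(X[4, :]) .+ X[1, :] .^ 2 .- 2

shapes_ok = check_shapes(X, y)
```

Here `permutedims` returns a new 5×100 matrix, so `X` can be used wherever an ordinary `Matrix` is expected.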

Run distributed on four processes with:

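using SymbolicRegression

# Synthetic data: 5 features, 100 rows (shape [features, rows])
X = randn(Float32, 5, 100)
# The target depends only on features 1 and 4
y = 2 * cos.(X[4, :]) + X[1, :] .^ 2 .- 2

# Operators the search may combine into candidate expressions
options = SymbolicRegression.Options(
    binary_operators=(+, *, /, -),
    unary_operators=(cos, exp),
    npopulations=20
)

# numprocs=4 runs the search on four processes
hall_of_fame = EquationSearch(X, y, niterations=40, options=options, numprocs=4)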

We can view the equations in the dominating Pareto frontier with:

dominating = calculate_pareto_frontier(X, y, hall_of_fame, options)
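To illustrate what a Pareto frontier over complexity and loss means, here is a minimal sketch in base Julia. It is not the package's implementation of `calculate_pareto_frontier`. The function `pareto_front` and the candidate values are hypothetical, and the sketch assumes a candidate stays on the frontier only if its loss is strictly lower than that of every simpler candidate.

```julia
# Hypothetical candidates as (complexity, loss) pairs; not real search output.
candidates = [(5, 1.4), (1, 3.2), (9, 0.3), (3, 1.1), (7, 0.3)]

# Walk the candidates from simplest to most complex and keep each one whose
# loss is strictly lower than the best loss seen among simpler candidates.
function pareto_front(candidates)
    front = eltype(candidates)[]
    best_loss = Inf
    for (complexity, loss) in sort(candidates; by=first)
        if loss < best_loss
            push!(front, (complexity, loss))
            best_loss = loss
        end
    end
    return front
end

front = pareto_front(candidates)
```

With this ordering the last element of `front` is the most complex and lowest-loss candidate, which matches how `dominating[end]` is used to pick the best equation.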

We can convert the best equation to SymbolicUtils.jl with the following function:

using SymbolicUtils  # provides simplify

eqn = node_to_symbolic(dominating[end].tree, options)
println(simplify(eqn*5 + 3))

We can also print the full Pareto frontier like so:

println("Complexity\tMSE\tEquation")

for member in dominating
    complexity = compute_complexity(member.tree, options)
    loss = member.loss
    # Named `equation` rather than `string` to avoid shadowing Base.string
    equation = string_tree(member.tree, options)

    println("$(complexity)\t$(loss)\t$(equation)")
end

Search options

See https://astroautomata.com/SymbolicRegression.jl/stable/api/#Options