MilesCranmer/PySR

[BUG]: Can't pickle greater: attribute lookup greater on __main__ failed

tbuckworth opened this issue · 3 comments

What happened?

model.fit fails due to a pickle error when using the binary operator "greater".

Here is a minimal example:

import numpy as np
from pysr import PySRRegressor

if __name__ == "__main__":
    x = np.random.uniform(-1, 1, size=100).reshape((50, 2))
    y = x[:, 1] ** 2
    model = PySRRegressor(
        equation_file="symbreg/symbreg.csv",
        niterations=1,
        binary_operators=["greater"],
        elementwise_loss="loss(prediction, target) = (prediction - target)^2",
    )
    model.fit(x, y)

If I replace "greater" with "cond", no error is thrown. I've tried different datasets, etc., but whenever "greater" is used in an equation, this error is thrown.

Python=3.8
pysr=0.17.2

Version

0.17.2

Operating System

Linux

Package Manager

pip

Interface

Script (i.e., python my_script.py)

Relevant log output

Compiling Julia backend...
[ Info: Started!
0.0%┣                                               ┫ 0/15 [00:00<00:00, -0s/it]
Expressions evaluated per second: [.....]. Head worker occupation: 0.0%
Press 'q' and then <enter> to stop execution early.
Hall of Fame:
---------------------------------------------------------------------------------------------------
Complexity  Loss       Score     Equation
1           8.819e-02  1.594e+01  y = 0.34869
---------------------------------------------------------------------------------------------------
20.0%┣█████████▏                                    ┫ 3/15 [00:00<00:01, 15it/s]
Expressions evaluated per second: [.....]. Head worker occupation: 58.2%. This is high, and will prevent efficient resource usage. Increase `ncyclesperiteration` to reduce load on head worker.
Press 'q' and then <enter> to stop execution early.
Hall of Fame:
---------------------------------------------------------------------------------------------------
Complexity  Loss       Score     Equation
1           8.819e-02  1.594e+01  y = 0.34869
9           6.944e-02  2.989e-02  y = greater(0.7405, greater(greater(0.7405, x₁), greater(-0.71281, x₁)))
19          6.928e-02  2.238e-04  y = greater(greater(greater(greater(-0.40062, -1.2073), greater(x₀, x₀)), 0.7405), greater(greater(0.7405, greater(x₁, 0.69509)), greater(-0.71281, x₁)))
---------------------------------------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/titus/PycharmProjects/train-procgen-pytorch/venv/lib/python3.8/site-packages/pysr/sr.py", line 1112, in _checkpoint
    pkl.dump(self, f)
_pickle.PicklingError: Can't pickle greater: attribute lookup greater on __main__ failed

Extra Info

Someone here fixed a similar issue with this advice:

The problem is that you're trying to pickle an object from the module where it's defined. If you move the Nation class into a separate file and import it into your script, then it should work.
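In other words, pickle serializes classes and functions by reference (module plus attribute name), so anything that exists only at runtime in __main__ cannot be looked up again when it is written out. A minimal sketch reproducing the same style of error, independent of PySR (the dynamically created class here is just an illustration, not necessarily what PySR builds internally):

import pickle

# A class created at runtime is recorded as living in __main__ under the name
# "greater", but __main__ has no attribute "greater" for pickle to look up.
Greater = type("greater", (), {})

try:
    pickle.dumps(Greater)
except Exception as e:
    # e.g. Can't pickle <class '__main__.greater'>: attribute lookup greater on __main__ failed
    print(type(e).__name__, e)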

I updated to pysr==0.18.1, but the problem persists.

That is weird; it seems like greater is missing its sympy mapping:

sympy_mappings = {
    "div": lambda x, y: x / y,
    "mult": lambda x, y: x * y,
    "sqrt": lambda x: sympy.sqrt(x),
    "sqrt_abs": lambda x: sympy.sqrt(abs(x)),
    "square": lambda x: x**2,
    "cube": lambda x: x**3,
    "plus": lambda x, y: x + y,
    "sub": lambda x, y: x - y,
    "neg": lambda x: -x,
    "pow": lambda x, y: x**y,
    "pow_abs": lambda x, y: abs(x) ** y,
    "cos": sympy.cos,
    "sin": sympy.sin,
    "tan": sympy.tan,
    "cosh": sympy.cosh,
    "sinh": sympy.sinh,
    "tanh": sympy.tanh,
    "exp": sympy.exp,
    "acos": sympy.acos,
    "asin": sympy.asin,
    "atan": sympy.atan,
    "acosh": lambda x: sympy.acosh(x),
    "acosh_abs": lambda x: sympy.acosh(abs(x) + 1),
    "asinh": sympy.asinh,
    "atanh": lambda x: sympy.atanh(sympy.Mod(x + 1, 2) - 1),
    "atanh_clip": lambda x: sympy.atanh(sympy.Mod(x + 1, 2) - 1),
    "abs": abs,
    "mod": sympy.Mod,
    "erf": sympy.erf,
    "erfc": sympy.erfc,
    "log": lambda x: sympy.log(x),
    "log10": lambda x: sympy.log(x, 10),
    "log2": lambda x: sympy.log(x, 2),
    "log1p": lambda x: sympy.log(x + 1),
    "log_abs": lambda x: sympy.log(abs(x)),
    "log10_abs": lambda x: sympy.log(abs(x), 10),
    "log2_abs": lambda x: sympy.log(abs(x), 2),
    "log1p_abs": lambda x: sympy.log(abs(x) + 1),
    "floor": sympy.floor,
    "ceil": sympy.ceiling,
    "sign": sympy.sign,
    "gamma": sympy.gamma,
    "round": lambda x: sympy.ceiling(x - 0.5),
    "max": lambda x, y: sympy.Piecewise((y, x < y), (x, True)),
    "min": lambda x, y: sympy.Piecewise((x, x < y), (y, True)),
    "cond": lambda x, y: sympy.Piecewise((y, x > 0), (0.0, True)),
    "logical_or": lambda x, y: sympy.Piecewise((1.0, (x > 0) | (y > 0)), (0.0, True)),
    "logical_and": lambda x, y: sympy.Piecewise((1.0, (x > 0) & (y > 0)), (0.0, True)),
    "relu": lambda x: sympy.Piecewise((0.0, x < 0), (x, True)),
}

So you could pass this to extra_sympy_mappings of the PySRRegressor, like

extra_sympy_mappings={"greater": lambda x, y: sympy.Piecewise((1.0, x > y), (0.0, True))}

but ideally we should have it built-in since greater is documented as an available operator.
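For reference, a minimal sketch of the workaround applied to the reproducer from the top of the issue; the Piecewise mapping is the one suggested above and is not (yet) shipped with PySR:

import numpy as np
import sympy
from pysr import PySRRegressor

x = np.random.uniform(-1, 1, size=100).reshape((50, 2))
y = x[:, 1] ** 2

model = PySRRegressor(
    niterations=1,
    binary_operators=["greater"],
    elementwise_loss="loss(prediction, target) = (prediction - target)^2",
    # Workaround: supply a sympy equivalent for "greater" so the fitted
    # equations can be converted and the model checkpoint can be pickled.
    extra_sympy_mappings={
        "greater": lambda x, y: sympy.Piecewise((1.0, x > y), (0.0, True))
    },
)
model.fit(x, y)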

Brilliant! That fixed it, thank you