MilesCranmer/SymbolicRegression.jl

[Feature]: Better behavior with unused variables

MilesCranmer opened this issue · 0 comments

Feature Request

It seems that PySR can struggle in the presence of dummy variables:

[Image: Screenshot 2024-08-12 at 22 01 45]

^From a really nice paper from @yoshitomo-matsubara et al.

This issue tracks that performance problem and how we can eliminate it. I think the node-expansion branch will help a lot once it lands, although it won't be fully general across operators. So I wonder if there's a better way, such as using a quick-and-dirty feature importance measure to choose which variables to sample at each mutation.
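
To make the feature-importance idea concrete, here is a rough sketch in Python with NumPy. It is not SymbolicRegression.jl or PySR code, and every name in it (`feature_weights`, `sample_variable`, `floor`) is hypothetical. It assumes the importance measure is the absolute Spearman rank correlation between each input column and the target, which is only one possible cheap choice, and it assumes a mutation that inserts a variable leaf would draw the variable index from these weights rather than uniformly.

```python
import numpy as np


def feature_weights(X, y, floor=0.05):
    """Cheap importance score: |Spearman correlation| of each column with y.

    `floor` is a hypothetical smoothing term. It keeps every variable
    selectable, because a variable may matter only through interactions
    that a marginal correlation cannot see.
    """
    # Rank-transform so monotone nonlinear relationships are detected too.
    ranks_X = np.argsort(np.argsort(X, axis=0), axis=0).astype(float)
    ranks_y = np.argsort(np.argsort(y)).astype(float)
    ranks_X -= ranks_X.mean(axis=0)
    ranks_y -= ranks_y.mean()
    denom = np.linalg.norm(ranks_X, axis=0) * np.linalg.norm(ranks_y)
    corr = np.abs(ranks_X.T @ ranks_y) / np.where(denom > 0, denom, 1.0)
    weights = corr + floor
    return weights / weights.sum()


def sample_variable(weights, rng):
    """Pick the variable index for a new leaf, biased by importance."""
    return int(rng.choice(len(weights), p=weights))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 5))        # columns 2..4 are dummy variables
    y = 3.0 * X[:, 0] + X[:, 1] ** 3     # target uses only columns 0 and 1
    w = feature_weights(X, y)
    print("sampling weights:", np.round(w, 3))
    print("example draws:", [sample_variable(w, rng) for _ in range(10)])
```

The floor is the main design choice here: with `floor=0` a variable that only enters through, say, a product with another variable could be starved entirely, so the sketch trades some focus for safety. Whether a marginal measure like this is good enough, or whether something like permutation importance on a quick surrogate model is needed, is an open question.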