Initializing with pre-specified population
charishma13 opened this issue · 5 comments
I would like to know how to initialize my population with n members that have a pre-specified structure. For example, I want my initial population to have 15 members, all with the same expression, e.g. 1 + x. Is there a PySR option for this, or is it something that would need to be added? Thank you.
This feature does not yet exist, but it would certainly be nice to add it or simplify existing alternatives. The current strategy is basically to initialise the state manually. Alternatively you could run a search for 1 iteration, and then manipulate the saved state to specify individual members of the population. On the PySR discussions page there are some threads about this too.
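The "run one iteration, then reuse the saved state" workaround can be sketched on the Julia side roughly as follows. This is a minimal sketch, not a confirmed recipe: the toy data is made up, and it assumes a SymbolicRegression.jl release in which `return_state` and `saved_state` are keyword arguments of `equation_search` (in some older releases `return_state` was set through `Options` instead).

```julia
using SymbolicRegression: Options, equation_search

# Toy data, assumed for illustration: 2 features, 50 samples, target 1 + x1.
X = rand(2, 50)
y = 1.0 .+ X[1, :]

options = Options(; binary_operators=[+, -, *, /])

# Step 1: run a single iteration and ask for the search state back
# instead of only the hall of fame.
state = equation_search(
    X, y; niterations=1, options=options, parallelism=:serial, return_state=true
)

# Step 2: `state` bundles the populations and the hall of fame.
# This is the point where individual population members would be replaced
# with hand-built ones before resuming.

# Step 3: resume the search from the (possibly edited) state.
state = equation_search(
    X, y;
    niterations=10,
    options=options,
    parallelism=:serial,
    saved_state=state,
    return_state=true,
)
```

The exact layout of the returned state (for example, how populations are nested for multi-output problems) depends on the installed version, so it is worth inspecting `typeof(state)` before editing it.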
Thank you for the suggestion, @MilesCranmer. I will check the documentation and make the corresponding changes. I would also like to know which Julia file performs the actual initialization of the population in each PySR iteration.
The initialisation function is here: https://github.com/MilesCranmer/SymbolicRegression.jl/blob/master/src/Population.jl#L36-L62
which gets called here:
I am having trouble creating a custom saved_state. The saved_state is a tuple consisting of a population and a hall of fame object, and I am developing a custom implementation of both. So far I have successfully created the PopMember component, following the guidance in the discussion at MilesCranmer/PySR#443. I am now trying to build a population from PopMember instances by calling the Population struct constructor directly, but I am unsure whether this approach will work as intended. The following code fails at the line that constructs the Population.
```julia
using .SymbolicRegression: Node, Options, equation_search, Dataset, PopMember, HallOfFame, Population
using CSV
using DataFrames

# Two constant-only expression trees (both use Float64 constants)
val = Node{Float64}(val=162.0)
xsi = Node{Float64}(val=1.224)
options = Options(binary_operators=[+, -, *, /])

csv_file_path = "water_water.csv"
data = CSV.File(csv_file_path) |> DataFrame

# Each feature becomes one 1 × n row
X1 = reshape(data."Angle", 1, :)
X2 = reshape(data."OH1", 1, :)
X3 = reshape(data."OH2", 1, :)
X4 = reshape(data."H1H2", 1, :)
# Stack the rows vertically so X is 4 × n (features × samples).
# Concatenating horizontally and then reshaping would interleave the features.
X = [X1; X2; X3; X4]

y = data."Energy"
# Assuming y is your target variable
y_min = minimum(y)
y_scaled = (y .- y_min) .* 2625.5002
dataset = Dataset(X, y_scaled)

# Format to PopMember:
member = PopMember(dataset, val, options; deterministic=false)
member1 = PopMember(dataset, xsi, options; deterministic=false)

# The next line raises the error shown below:
population = Population{Float32, Float64, Node{Float32}}([member, member1], 2)
```
```
ERROR: LoadError: TypeError: in Population, in L, expected L<:Real, got a value of type Float64
Stacktrace:
 [1] top-level scope
   @ ~/LU_Exp/popmembers_hof.jl:77
```
Hello @MilesCranmer,
I have managed to populate the Population using the following code: Population{Float64, Float64, Node{Float64}}([member for _ in 1:33], 33)
I would like to ask where initialization begins within the SymbolicRegression.jl framework, particularly with respect to functions such as _main_search_loop, _warmup_search, _initialize_search, and _create_workers. Could you please clarify which function is responsible for invoking the Population struct and starting its initialization?
Our intention is to modify the process starting from the initial population phase, so that PySR searches for equations based on a predefined expression supplied through the saved_state. We have successfully implemented a custom saved_state and are using it in the equation search. However, the hall of fame starts from our specified expression and the search then restarts from a complexity of 1. Could you please advise on how to use saved_state so that the equation search starts from our defined expression?