EpistasisLab/tpot2

Make the cv early stop functions into standalone functions? Or remove?

perib opened this issue · 1 comment

perib commented

Currently, these are built into the evolver class. If we merge the evolver classes, they may be easier to use as standalone functions (#65).

The two strategies for pruning CV evaluation early are found in the evaluate_population_selection_early_stop function of the baseevolver. They are controlled by the following parameters, which may be confusing.

    threshold_evaluation_early_stop = None,
    threshold_evaluation_scaling = .5,
    min_history_threshold = 20,
    selection_evaluation_early_stop = None,
    selection_evaluation_scaling = .5,
    evaluation_early_stop_steps = None,
    final_score_strategy = "mean",

It might be easier to pull these out and turn them into individual functions outside of the class, similar to the Optuna pruning API.
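A rough sketch of what a standalone pruning check could look like, loosely mirroring the report/should_prune split in Optuna. Everything here is hypothetical: `should_prune`, `evaluate_with_pruning`, `score_fold`, and the percentile rule are illustrative assumptions, not the current TPOT2 implementation or the exact semantics of the parameters above.

```python
import math
from statistics import mean


def percentile(values, q):
    """Nearest-rank percentile (q in 0..100) of a non-empty list."""
    ordered = sorted(values)
    rank = max(0, math.ceil(q / 100 * len(ordered)) - 1)
    return ordered[rank]


def should_prune(running_score, history, q=50, min_history=20):
    """Hypothetical standalone check: prune when the running mean falls
    below the q-th percentile of past running means at the same fold.
    Never prunes until `min_history` past scores are available."""
    if len(history) < min_history:
        return False
    return running_score < percentile(history, q)


def evaluate_with_pruning(score_fold, n_folds, history_per_step, q=50, min_history=20):
    """Evaluate folds one at a time and stop early if pruned.

    score_fold(step) is an assumed callback returning the score of one fold.
    history_per_step[step] holds running means seen at that step so far.
    Returns (final_score or None if pruned, list of fold scores)."""
    scores = []
    for step in range(n_folds):
        scores.append(score_fold(step))
        running = mean(scores)
        prune = should_prune(running, history_per_step[step], q, min_history)
        # Record every running mean, pruned or not, so the threshold
        # is not biased toward survivors only.
        history_per_step[step].append(running)
        if prune:
            return None, scores
    return mean(scores), scores
```

Because the pruning rule is a plain function, it could be swapped out or tested without touching the evolver at all.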

That said, this feature may not be useful given that we already support successive halving. We could benchmark each independently and both together. If successive halving alone gives the same performance improvements, we could drop this feature to simplify the code.

perib commented

Another potentially useful CV pruning algorithm could be greedy k-fold CV, as described in this paper:

https://doaj.org/article/32043aab8bf946ec876db3013d500991

It iteratively selects which fold to evaluate next. We would loop through the algorithm until N individuals have completed, keep those N, and remove the rest.
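A minimal sketch of how that loop might look, under a simplified reading of the idea (always spend the next fold on the unfinished individual with the best running mean). This is an interpretation for discussion, not the paper's exact algorithm; `greedy_kfold_select` and the `score_fold(candidate, fold)` callback are hypothetical names.

```python
from statistics import mean


def greedy_kfold_select(candidates, score_fold, n_folds, n_keep):
    """Greedy fold scheduling sketch (higher score is better).

    candidates: list of hashable individual identifiers.
    score_fold(candidate, fold): assumed callback that scores one fold.
    Returns {candidate: mean score} for the first n_keep individuals
    that finish all n_folds; the rest are dropped."""
    # Seed every individual with one fold so each has a partial estimate.
    partial = {c: [score_fold(c, 0)] for c in candidates}
    completed = {}
    n_keep = min(n_keep, len(candidates))
    while len(completed) < n_keep:
        unfinished = [c for c in partial if c not in completed]
        # Greedy choice: most promising individual so far.
        best = max(unfinished, key=lambda c: mean(partial[c]))
        if len(partial[best]) < n_folds:
            partial[best].append(score_fold(best, len(partial[best])))
        if len(partial[best]) == n_folds:
            completed[best] = mean(partial[best])
    return completed
```

With three individuals, three folds, and `n_keep=2`, the weakest individual is only ever scored on its first fold, so the run costs 7 fold evaluations instead of 9.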