kiudee/chess-tuning-tools

Variance Reduction does not sample from the ends of the intervals.

Claes1981 opened this issue · 9 comments

Description

Describe what you were trying to get done.

Run the tuner with the Variance Reduction acquisition function to explore the whole landscape, and thereby cover as large a range of scores/y-values as possible.

Tell us what happened, what went wrong, and what you expected to happen.

In 26 iterations so far, the tuner has never sampled from the ends of the intervals, except during the first 5 n_initial_points iterations, even where no or almost no samples have been evaluated earlier. With Variance Reduction, at least, I expect it to sample from those unexplored areas with high probability.

Plots of the samples, with parameter values on the x-axes. The x-axes cover the whole ranges specified in the JSON config. (You can find the exact values in the log, at "Got the following tuning settings:".) Iterations 16-25 used only the Variance Reduction acquisition function, and iterations 5-15 used a mix of Predictive Variance Reduction Search and Variance Reduction. (Scores on the y-axes versus parameter values should be taken with a large grain of salt, since the other 3 parameter values are not shown in each plot.) The table shows the values of iterations 0 to 25 in order:
[Image: Sample_ranges]

10 rounds/20 games per iteration.
No opening book. (I am still thinking about how best to implement a book for engine2 only.)
One of four different opponents is chosen at random each round.

Data:
Cfish-20100303_pgo_extra_6t_h2048_4opponents_10rounds.npz.zip

Log:
Cfish-20100303_pgo_extra_6t_h2048_4opponents_10rounds.log

Here is a plot at iteration 26 with no re-initialization (only fast-resume) since iteration 12:
[Image: 20210816-055654-26]

It is indeed quite a counter-intuitive property of variance reduction (VR) to favor points in the center. This is because it tries to pick points which reduce the predictive variance at all of the other points.
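To see why, here is a toy 1-D sketch (not the tuner's actual code; it assumes scikit-learn and a fixed RBF kernel) that scores each candidate by the average predictive variance that would remain over a reference grid after observing it. Central candidates are correlated with more of the grid, so they tend to win:

```python
# Toy illustration of variance reduction: score each candidate by the
# mean posterior variance left over the grid after (hypothetically)
# observing it, and pick the candidate that minimizes that score.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
X_train = rng.uniform(-1, 1, size=(5, 1))      # a few observed points
y_train = np.sin(3 * X_train).ravel()          # arbitrary toy objective
grid = np.linspace(-1, 1, 201).reshape(-1, 1)  # reference points

kernel = RBF(length_scale=0.3)
gp = GaussianProcessRegressor(kernel=kernel, optimizer=None)
gp.fit(X_train, y_train)

def mean_variance_after(x_cand):
    """Mean predictive variance over the grid if x_cand were observed.

    The candidate's "observation" is set to the current posterior mean,
    which leaves the mean unchanged and only shrinks the variance.
    """
    X_aug = np.vstack([X_train, np.atleast_2d(x_cand)])
    y_aug = np.append(y_train, gp.predict(np.atleast_2d(x_cand)))
    gp_aug = GaussianProcessRegressor(kernel=kernel, optimizer=None)
    gp_aug.fit(X_aug, y_aug)
    _, std = gp_aug.predict(grid, return_std=True)
    return float(np.mean(std ** 2))

scores = [mean_variance_after(x) for x in grid]
print("VR would pick x =", grid[int(np.argmin(scores))])  # an interior point
```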

If you want a criterion which picks the single point with the highest uncertainty, then use "lcb" in conjunction with:

acq_func_kwargs=dict(alpha=1.96, n_thompson=500),

but with alpha set to "inf" instead of the 1.96 shown.
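For reference, here is a minimal sketch of what that does (assuming NumPy arrays of candidates X with GP posterior mean mu and standard deviation sigma; this is not bayes-skopt's actual code):

```python
# Sketch of the lower confidence bound (LCB) criterion. With a finite
# alpha, minimizing mu - alpha * sigma trades off exploitation (low
# predicted mean) against exploration (high uncertainty); with
# alpha = "inf" it degenerates into picking the single candidate with
# the highest predictive uncertainty ("pure exploration").
import numpy as np

def lcb_next_point(X, mu, sigma, alpha=1.96):
    if alpha == "inf":
        return X[np.argmax(sigma)]            # pure exploration
    return X[np.argmin(mu - alpha * sigma)]   # standard LCB
```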

I am planning to make these parameters configurable.

Ah, thank you very much for the explanation. Yes, it would be interesting to see what "lcb" could do.

I only recently started to use lcb a lot, since it is able to correct a bad model fit. I am thinking about implementing acquisition function mixtures, something like

"acq_function": "{0.9: pvrs(n_thompson=1000), 0.05: lcb(alpha='inf'), 0.05: mean}",

where it would randomly choose between several acquisition functions according to the weights.
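Nothing like this exists yet; purely as a hypothetical sketch of the selection step, with the proposed string syntax simplified to a plain dict of weights:

```python
# Hypothetical sketch of the proposed mixture: once per tuning
# iteration, draw one acquisition function name according to the
# weights. The dict is a stand-in for the string spec proposed above.
import random

mixture = {"pvrs": 0.9, "lcb": 0.05, "mean": 0.05}

def sample_acq_function(mixture):
    names = list(mixture)
    weights = [mixture[name] for name in names]
    return random.choices(names, weights=weights, k=1)[0]

acq = sample_acq_function(mixture)  # e.g. "pvrs" on ~90% of iterations
```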

Yes, something similar to the upstream gp_hedge. I have had about the same thought, but have not had the time to try to implement it yet. To begin with, though, I was only thinking of coding some simple random choice between pvrs, mes, and ts. Your template seems to offer much more advanced fine-tuning of the acquisition.

Until then, it seems there is currently no check in the Chess Tuning Tools code against passing "lcb" as an acquisition function? Since "lcb" should be supported by Bayes-skopt, I guess I can just try to pass "lcb" to the --acq-function option, and change alpha directly in the code.

Yes, you can pass in "lcb", but to get "pure exploration" you also need to set alpha="inf" which is currently hardcoded to 1.96.

Thanks, I will try to set alpha to "inf" on the line you showed above, while exploring the landscape with "lcb".

I created a command line option to set the lcb alpha parameter:
Claes1981@1e6cc8a

(Or more accurately: I mostly copy-pasted your code for acquisition_function_samples.)

Actually, I have a problem passing inf or "inf" on the command line. In both cases it seems this line never becomes true. (I placed a breakpoint below the line, which never triggered, but printing alpha on the line above outputs inf in both cases.)
I wonder if it has something to do with alpha being a float while it is compared to a string on that line...
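The suspected pitfall can be reproduced in isolation (a guess at the cause; normalize_alpha is a hypothetical helper, not code from either repository):

```python
# A float-typed CLI option converts "inf" to float("inf"), which
# prints as inf but never compares equal to the *string* "inf".
import math

alpha = float("inf")
print(alpha)           # -> inf (looks identical to the string)
print(alpha == "inf")  # -> False, so the string comparison never fires

def normalize_alpha(value):
    """Map any spelling of infinity to the literal string "inf"."""
    if isinstance(value, str) and value.lower() == "inf":
        return "inf"
    value = float(value)
    return "inf" if math.isinf(value) else value
```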

Update: I worked around it with commit Claes1981@0dad57e, and it seems to be working now.

> Yes, something similar to the upstream gp_hedge. I have had about the same thought, but have not had the time to try to implement it yet. To begin with, though, I was only thinking of coding some simple random choice between pvrs, mes, and ts. Your template seems to offer much more advanced fine-tuning of the acquisition.

I made it possible to randomly select an acquisition function each iteration:
Claes1981@59b9258
I am hoping it can somehow combine the good parts of all the different acquisition functions at once. (Possibly it also combines the bad parts of each acquisition function...)
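For reference, the idea boils down to something like this minimal sketch (not the actual code in the commit above):

```python
# Uniformly pick one acquisition function at the start of each
# iteration; each criterion then gets a share of the iterations.
import random

ACQ_FUNCTIONS = ["pvrs", "mes", "ts"]

def pick_acq_function():
    return random.choice(ACQ_FUNCTIONS)
```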