flowersteam/explauto

progress computation in discretized interest model

Closed this issue · 5 comments

I was wondering if the computation of progress in the discretized interest model is the intended one here.
Indeed, np.cov computes the covariance between [1, 2, 3, 4, 5] and the last competences, e.g. [0.8, 0.6, 0.7, 0.9, 0.95].
An issue with that behavior arises at the beginning of exploration, when few cells have been sampled and few points have been sampled within those cells.
np.cov in fact keeps increasing even when progress is constant:

np.cov(range(3), [0.5, 0.6, 0.7])[0,1] ≈ 0.1
np.cov(range(4), [0.5, 0.6, 0.7, 0.8])[0,1] ≈ 0.167
np.cov(range(5), [0.5, 0.6, 0.7, 0.8, 0.9])[0,1] ≈ 0.25

which leads to exploring about n points in each randomly chosen cell, with n corresponding roughly to the window size (the covariance equals the constant slope times the variance of the indices, which grows with the window length).
That can be seen, for instance, in the scatter plots of the notebook about curiosity.
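
As a quick sanity check (a minimal sketch assuming NumPy; the 0.1 slope and loop bounds are illustrative), the covariance keeps growing with the window length even though the increase per step is constant:

    import numpy as np

    for n in range(3, 8):
        comps = [0.5 + 0.1 * i for i in range(n)]  # constant progress of 0.1 per step
        print(n, np.cov(range(n), comps)[0, 1])    # grows as 0.1 * n * (n + 1) / 12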

I tried another behavior in another branch here, where the initial guessed progress of each unsampled cell is a constant (e.g. 10), and progress is computed as the mean of the last 5 competences in the cell minus the mean of the 5 before them (the 10th-to-last through the 6th-to-last), as sketched below.
In that case, there is an exploration bias that pushes the learner to explore each cell at least once before progress-based sampling takes over.
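
For reference, a minimal sketch of that alternative (the function name, window size, and initial value are illustrative, not the branch's exact code):

    import numpy as np

    def cell_progress(competences, window=5, init_progress=10.0):
        # Unsampled or barely sampled cell: an optimistic constant ensures
        # it gets explored at least once.
        if len(competences) < 2 * window:
            return init_progress
        # Mean of the last `window` competences minus the mean of the
        # `window` before them: a derivative-like measure of progress.
        recent = np.mean(competences[-window:])
        previous = np.mean(competences[-2 * window:-window])
        return recent - previous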

What do you think?
@clement-moulin-frier
@pierre-rouanet

Hi Sébastien, yes, what you propose is what we were doing in older Matlab code, when regions were pre-determined.
However, when regions are dynamically built (like in IAC/RIAC/SAGG-RIAC), the initial progress value of a new region was either

  1. a new value computed on the history of points of the mother region falling in this region, if there are enough points to compute a meaningful progress;
  2. the same progress value as the mother region if there are not enough points (and a constant value if the mother region did not already have a progress measure), as sketched below.
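
In code, that initialization rule could look like this (a sketch only; the names and the point threshold are assumptions, not the actual IAC/RIAC implementation):

    def init_region_progress(inherited_competences, mother_progress,
                             min_points=10, default_progress=10.0):
        if len(inherited_competences) >= min_points:
            # 1. enough inherited points: compute a meaningful progress
            half = len(inherited_competences) // 2
            older = sum(inherited_competences[:half]) / half
            recent = sum(inherited_competences[half:]) / (len(inherited_competences) - half)
            return recent - older
        if mother_progress is not None:
            # 2. too few points: inherit the mother region's progress
            return mother_progress
        # mother region had no progress measure yet: fall back to a constant
        return default_progress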


What you propose is fine by me (and important, since the whole exploration history is very dependent on the first events). Thanks!

OK, but did you have arguments or intuitions for this behavior that forces sampling a few points, or should I implement a behavior that really computes a derivative, since the current behavior might seem strange to people using it?
For the derivative behavior, there is still the question of the guessed initial progress.
If it is high, all regions will be sampled before a discrimination based on real progress takes place (which is, I think, a good approach if not too many cells are defined).
If it is low, interesting regions might be explored very late.
As a reminder, in both implementations there is a softmax smoothing to choose the exploring region.
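
For context, softmax region choice looks roughly like this (a sketch; the temperature parameter and the use of absolute progress as interest are assumptions, not necessarily the library's exact implementation):

    import numpy as np

    def choose_region(progresses, temperature=1.0):
        # Interest = |progress|; the softmax turns interests into probabilities,
        # so low-progress regions are still sampled occasionally.
        interests = np.abs(np.asarray(progresses, dtype=float))
        probs = np.exp(interests / temperature)
        probs /= probs.sum()
        return np.random.choice(len(progresses), p=probs)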

This issue is actually what initially led me to propose a recursive region splitting mechanism: initially you have very few cells,
and they multiply only in interesting areas. Having too many cells initially makes the idea of learning progress as bad as searching
for novelty, I think (so maybe this is not a big problem in the library; we could have a tutorial comparing many cells versus
recursive region splitting).
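
A toy illustration of that idea in one dimension (purely illustrative; real IAC/RIAC splits along the dimension that best separates progress, not blindly at the midpoint):

    class Region:
        """Toy 1-D region that splits in two once it holds enough points,
        so the partition refines only where sampling actually happens."""
        def __init__(self, low, high, max_points=50):
            self.low, self.high = low, high
            self.max_points = max_points
            self.points = []
            self.children = None

        def add(self, x):
            if self.children is not None:
                # Route the point to the matching child region.
                self.children[x > self.mid].add(x)
                return
            self.points.append(x)
            if len(self.points) > self.max_points:
                # Split at the midpoint and redistribute the points.
                self.mid = (self.low + self.high) / 2.0
                self.children = (Region(self.low, self.mid, self.max_points),
                                 Region(self.mid, self.high, self.max_points))
                for p in self.points:
                    self.children[p > self.mid].add(p)
                self.points = []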


I implemented this progress computation in DiscreteProgress in #70.