Skipping on powers of 2
Timeroot opened this issue · 5 comments
I believe your "skip" function should be altered to actually skip one less than it currently does. As a simple benchmark, if I take a vector of floats and convert each to boolean by testing with >=0.5, I would like it to be the case that a set of 2^n samples covers each binary vector exactly once. This works if I skip 2^n - 1
times:
s = SobolSeq(4); for u=1 : 15;next!(s);end
sort(map(bs ->
sum(map(x->x[2] >= 0.5 ? x[1] : 0, zip(2 .^ [0 1 2 3], bs))),
map(x->next!(s), 1:16)
))
returns all numbers from 0 to 15 exactly once. If I instead run
s = SobolSeq(4);for u=1:16;next!(s);end
sort(map(bs ->
sum(map(x->x[2] >= 0.5 ? x[1] : 0, zip(2 .^ [0 1 2 3], bs))),
map(x->next!(s), 1:16)
))
then I get a repeat of 12, and no instance of 8. But skipping the 2^n - 1 seems to produce correct results for all cases I tried.
The advice of "skip the largest power of 2 below your desired one" seems quite bad, actually. Suppose I only want to produce 15 values of 4-bits -- I would hope to get no repeats, and one missing; maybe something less perfect than that. But it's actually quite terrible!
julia> s = SobolSeq(4);skip(s,15)
julia> sort(map(bs ->
sum(map(x->x[2] >= 0.5 ? x[1] : 0, zip(2 .^ [0 1 2 3], bs))),
map(x->next!(s), 1:15)
))
15-element Array{Int64,1}:
4
5
5
6
6
7
7
8
8
9
9
10
10
11
11
That is to say, every vector produced has (bit 2 XOR bit 3) true! That should only be true in half of them! This is the result of only skipping 8. If you skip 15, as I suggest, then you get better results. (And, again: this seems true at various higher numbers as well.)
Sounds good to me.
Joe & Kuo have some later (2008) notes which say:
I'm suspect that there was better advice at some point along the line that got garbled/oversimplified?