fl00r/pickup

Statistical problems: low-weight elements not considered evenly

robbat2 opened this issue · 0 comments

I think there is a statistical bias in your code when picking unique items (uniq: true).

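Take the weights [1, 1, 1, 1, 9996].
When unique items are picked repeatedly, the four equal low-weight elements should come up with equal probability. With 4 of the 5 items picked each time, :e is selected in practically every pick, so the remaining three slots should be shared evenly by :a through :d, about 75% of picks each. Instead, their frequencies differ widely.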

Changing the random seed
Output:

Data:
{:a=>1, :b=>1, :c=>1, :d=>1, :e=>9996}
Stats (instances):
{:a=>14.605853941460586,
 :b=>20.830541694583054,
 :c=>17.70882291177088,
 :d=>21.854781452185478,
 :e=>25.0}
Stats (counts):
{:a=>58.424, :b=>83.323, :c=>70.836, :d=>87.42, :e=>100.001}

Sample code:

Random.srand(1)
require 'pickup'
require 'pp'  # pp needs an explicit require on Ruby older than 2.5
data = { 
    :a => 1,
    :b => 1,
    :c => 1,
    :d => 1,
    :e => 10**4-4,
}
stats = data.keys.map { |ad| [ad, 0] }.to_h
pickup = Pickup.new(data, uniq: true)
TESTS = 10*data.values.inject(0, :+) 
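# Note: 0..TESTS is inclusive, so this loop runs TESTS + 1 times,
# which is why :e shows 100.001 rather than 100.0 under "Stats (counts)".
for i in 0..TESTS do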
    picks = pickup.pick(4)
    picks.each { |ad| stats[ad] += 1 } 
end
total_appearances = stats.values.inject(0, :+)
fractional_stats_instances = stats.to_a.map { |a| [a[0], 100.0 * a[1]/total_appearances] }.to_h
fractional_stats_counts = stats.to_a.map { |a| [a[0], 100.0 * a[1]/TESTS] }.to_h
print("Stats (instances):\n")
pp(fractional_stats_instances)
print("Stats (counts):\n")
pp(fractional_stats_counts)
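
For comparison, here is a minimal sketch of the behaviour I would expect, written without the pickup gem. It assumes plain sequential weighted sampling without replacement: draw one item in proportion to its weight, remove it from the pool, repeat. The helper name weighted_sample_unique, the local Random instance, and the trial count are my own choices for illustration; this is not a description of how pickup works internally.

```ruby
require 'pp'

# Hypothetical reference helper (not part of pickup): picks `count` distinct
# keys, one at a time, each in proportion to its weight among the keys that
# are still in the pool.
def weighted_sample_unique(weights, count, rng)
  pool = weights.dup
  picked = []
  count.times do
    break if pool.empty?
    total = pool.values.inject(0, :+)
    target = rng.rand(total)   # value in 0...total
    chosen = pool.keys.last    # fallback, only relevant for float rounding
    running = 0
    pool.each do |key, weight|
      running += weight
      if target < running
        chosen = key
        break
      end
    end
    picked << chosen
    pool.delete(chosen)
  end
  picked
end

data = { :a => 1, :b => 1, :c => 1, :d => 1, :e => 10**4 - 4 }
rng = Random.new(1)
trials = 20_000
counts = Hash.new(0)

trials.times do
  weighted_sample_unique(data, 4, rng).each { |key| counts[key] += 1 }
end

# Percentage of trials in which each key was picked.
percent = data.keys.map { |key| [key, 100.0 * counts[key] / trials] }.to_h
pp percent
```

With this scheme the four weight-1 keys are interchangeable, so each should land close to 75% of trials, while :e is picked in essentially every trial. That is the distribution I would expect from Pickup.new(data, uniq: true).pick(4) as well.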