Statistical problems: low-weight elements not considered evenly
robbat2 opened this issue · 0 comments
I think there is a statistical problem with your code.
Weights: [1,1,1,1,9996]
When picking unique items repeatedly, the low-weight elements all have the same weight, so they should come up with equal probability.
Changing the random seed
Output:
Data:
{:a=>1, :b=>1, :c=>1, :d=>1, :e=>9996}
Stats (instances):
{:a=>14.605853941460586,
 :b=>20.830541694583054,
 :c=>17.70882291177088,
 :d=>21.854781452185478,
 :e=>25.0}
Stats (counts):
{:a=>58.424, :b=>83.323, :c=>70.836, :d=>87.42, :e=>100.001}
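For reference, here is a rough sketch of what I would expect instead. It assumes that "unique" picking means successive draws, each proportional to weight among the items not yet picked; that model is my assumption, not something taken from the gem's code. `inclusion_probabilities` is a hypothetical helper that enumerates every ordered draw and sums exact probabilities with `Rational`.

```ruby
# Exact inclusion probabilities for drawing k unique items, where each draw
# picks among the remaining items with probability proportional to weight.
# Assumption: this "successive draws" model is the intended semantics.
def inclusion_probabilities(weights, k)
  probs = Hash.new(Rational(0))
  walk = lambda do |remaining, chosen, prob|
    if chosen.size == k
      # A complete pick: credit its probability to every key it contains.
      chosen.each { |key| probs[key] += prob }
      return
    end
    total = remaining.values.inject(0, :+)
    remaining.each do |key, weight|
      rest = remaining.reject { |other, _| other == key }
      walk.call(rest, chosen + [key], prob * Rational(weight, total))
    end
  end
  walk.call(weights, [], Rational(1))
  probs
end

data = { a: 1, b: 1, c: 1, d: 1, e: 10**4 - 4 }
expected = inclusion_probabilities(data, 4)
expected.each { |key, prob| puts format('%s: %.3f%%', key, 100.0 * prob.to_f) }
```

Under this model :a through :d are each included in about 75% of picks and :e in essentially all of them, so Stats (counts) should sit near 75 and Stats (instances) near 18.75 for the low-weight keys. With roughly 100,000 picks the binomial standard error is about 0.14 percentage points, so the 58 to 87 range above is far outside sampling noise.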
Sample code:
Random.srand(1)
require 'pickup'
data = {
  :a => 1,
  :b => 1,
  :c => 1,
  :d => 1,
  :e => 10**4 - 4,
}
stats = data.keys.map { |ad| [ad, 0] }.to_h
pickup = Pickup.new(data, uniq: true)
TESTS = 10*data.values.inject(0, :+)
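# Note: 0..TESTS is inclusive, so this loop runs TESTS + 1 times,
# which is why :e shows 100.001 rather than 100.0 in the output.
for i in 0..TESTS do
  picks = pickup.pick(4)
  picks.each { |ad| stats[ad] += 1 }
end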
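total_appearances = stats.values.inject(0, :+)
# Share of all picked items that were this key (each pick contributes 4 items).
fractional_stats_instances = stats.to_a.map { |a| [a[0], 100.0 * a[1]/total_appearances] }.to_h
# Percentage of picks that included this key.
fractional_stats_counts = stats.to_a.map { |a| [a[0], 100.0 * a[1]/TESTS] }.to_h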
print("Stats (instances):\n")
pp(fractional_stats_instances)
print("Stats (counts):\n")
pp(fractional_stats_counts)
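As a comparison point, below is a minimal stdlib-only baseline that does not use the gem at all. `weighted_sample_unique` is a hypothetical helper that makes k successive weighted draws without replacement; I am assuming that is the behaviour `uniq: true` is meant to approximate. The trial count and seed are arbitrary.

```ruby
# Baseline: k successive weighted draws without replacement, stdlib only.
# weighted_sample_unique is a hypothetical helper, not part of any gem.
def weighted_sample_unique(weights, k, rng)
  pool = weights.dup
  Array.new(k) do
    total = pool.values.inject(0, :+)
    target = rng.rand(total) # integer in 0...total
    # Walk the cumulative weights until the target falls inside a key's span.
    chosen = pool.each_key.find { |key| (target -= pool[key]) < 0 }
    pool.delete(chosen)
    chosen
  end
end

data = { a: 1, b: 1, c: 1, d: 1, e: 10**4 - 4 }
rng = Random.new(1)
trials = 20_000
counts = Hash.new(0)
trials.times do
  weighted_sample_unique(data, 4, rng).each { |key| counts[key] += 1 }
end

# Percentage of picks that included each key, comparable to "Stats (counts)".
baseline = data.keys.map { |key| [key, 100.0 * counts[key] / trials] }.to_h
p baseline
```

With this baseline the four low-weight keys should land within a point or so of each other, which is the behaviour I expected from the gem.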