These are some models I made for the coin flip cheaters
There are two types of coins in the problem.
- fair coin (probability of heads is 1/2)
- biased coin (probability of heads is 3/4)
Using Bayes' theorem, one can find the probability of the coin to be fair.
Let
where
If we label right, we get 1 point and 15 flipping chances. Otherwise we lose 30 flipping chances. Let's consider only the flipping chances. If we flip once, we lose 1 flipping chance. We may gain or lose flipping chances by labeling. Thus, we can evaluate the labeling process by below formula.
For my model, I used
From above, we can calculate the expected reward. If there is a way that maximizes the expected reward, we can take that as a strategy.
The no belief model tests a coin until the 'generally' expected reward is at its maximum. What I meant by the term generally is that it does not consider how many times we got heads.
Suppose that
For
The weak belief model tests a coin while the expected reward increases. Which means that we compare below two values.
- $$ \mathbb{E}[reward(n, x)] $$
- $$ P(\text{The next coin is head}) \times reward(n + 1, x + 1) + P(\text{The next coin is tail}) \times reward(n + 1, x) $$
If the first value is greater than the second value, the reward expected to be decreased, thus we label the coin instantly. Otherwise, the reward expected to be increased, thus we flip once more.
To calculate the second value, we use the law of total probability, and use
We can calculate
The fanatic model is similar to the weak belief model, except that it considers the 'current' label. As mentioned above, the coin can be labeled by using
This model strongly depends on the posterior observation data, not considering the prior probability.
The belief model is also simliar to the weak belief model, except that it considers
There is huge uncertainty whether the coin is fair or biased, if the model met the end condition for the first time. Therefore calm models do not end testing if the end condition (the reward is expected to be decreased) is first met. When the end condition is again met, they end testing. If the remaining fund is 0 or it is 14th flip, it ends testing, since there is no more chances to flip.
If there is huge uncertainty, it would be better to minimize the uncertainty. Thus, indecisive models uses significance level to label the coin. If fairness of coin is enoughly small or large that the model is certain that it is fair or biased, it label the coin.
As mentioned above, indecisive model uses significance level to check the end condition. Let
The model is certain that the coin is either fair or biased. Empirically significant level of 0.3 worked fine.
For computing expected next reward, other than considering only for the next step, indecisive models consider every combinations that can occur. When considering every combinations, it computes expected reward if given fairness(n + k, x + l) (k for future flips, l for future heads) is signficant, and search further if given fairness is not significant.
Below is the cheat sheet built using fanatic model. F for Fair, C for Cheat, T for further testing.
T | |||||||||||||||||
F | T | ||||||||||||||||
F | T | C | |||||||||||||||
F | F | T | C | ||||||||||||||
F | F | F | T | C | |||||||||||||
F | F | F | T | C | C | ||||||||||||
F | F | F | F | T | C | C | |||||||||||
F | F | F | F | T | T | C | C | ||||||||||
F | F | F | F | F | T | C | C | C | |||||||||
F | F | F | F | F | F | T | C | C | C | ||||||||
F | F | F | F | F | F | T | C | C | C | C | |||||||
F | F | F | F | F | F | F | T | C | C | C | C | ||||||
F | F | F | F | F | F | F | F | T | C | C | C | C | |||||
F | F | F | F | F | F | F | F | T | C | C | C | C | C | ||||
F | F | F | F | F | F | F | F | F | C | C | C | C | C | C | |||
F | F | F | F | F | F | F | F | F | F | C | C | C | C | C | C |
Below is the cheat sheet built using basic indecisive model. F for Fair, C for Cheat, T for further testing.
T | |||||||||||||||||
T | T | ||||||||||||||||
F | T | T | |||||||||||||||
F | F | T | C | ||||||||||||||
F | F | T | T | C | |||||||||||||
F | F | F | T | C | C | ||||||||||||
F | F | F | F | T | C | C | |||||||||||
F | F | F | F | T | T | C | C | ||||||||||
F | F | F | F | F | T | C | C | C | |||||||||
F | F | F | F | F | T | T | C | C | C | ||||||||
F | F | F | F | F | F | T | T | C | C | C | |||||||
F | F | F | F | F | F | F | T | C | C | C | C | ||||||
F | F | F | F | F | F | F | T | T | C | C | C | C | |||||
F | F | F | F | F | F | F | F | T | C | C | C | C | C | ||||
F | F | F | F | F | F | F | F | F | C | C | C | C | C | C | |||
F | F | F | F | F | F | F | F | F | F | C | C | C | C | C | C |
By the way, the game is well designed. The 'generally' expected rewards are calculated as below.
The expected rewards are all negative, thus somehow the game would end eventually. If we modify
Some expected rewards are positive. If we do only few tests, the flipping chance is expected to be gained. Therefore, the game would be super long, or even endless, and become boring. Fix
Again, some expected rewards are positive.
Still, we can modify some parameters to make different games. We can lower