“Detecting Coin Cheaters in the Game of Blobs: Evaluating a Frequentist Hypothesis Test”

- Frequentist hypothesis testing is a method of making decisions with limited data
- This video explores how it can be used to build a test for detecting coin cheaters in the game of blobs
- The goals for the test are: 1) low chance of wrongly accusing fair players, 2) high chance of catching cheaters, 3) using the smallest number of coins possible
- The first test accuses a player who gets heads on five out of five flips
- When tried on a sample of players, it wrongly accused fewer than 5% of fair players but caught only a small share of the cheaters.
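
The five-out-of-five rule can be checked with a quick simulation. This is a minimal sketch, assuming cheater coins land heads 75% of the time (the figure these notes say was assumed); the function name `accusation_rate`, the player count, and the seed are illustrative choices and do not come from the video.

```python
import random

def accusation_rate(p_heads, n_players=10_000, n_flips=5, seed=0):
    """Fraction of simulated players who get heads on every flip."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    accused = 0
    for _ in range(n_players):
        heads = sum(rng.random() < p_heads for _ in range(n_flips))
        if heads == n_flips:  # test rule: accuse on five out of five heads
            accused += 1
    return accused / n_players

fair_rate = accusation_rate(0.5)      # fair players wrongly accused
cheater_rate = accusation_rate(0.75)  # assumed 75%-heads cheaters caught
print(f"fair accused: {fair_rate:.1%}, cheaters caught: {cheater_rate:.1%}")
```

For comparison, the exact rates under these assumptions are 0.5^5, about 3.1%, for fair players and 0.75^5, about 23.7%, for cheaters, so the simulated rates should land close to those values.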

Unfair Coins and the Binomial Distribution

- The binomial distribution gives the probability of each possible number of heads in a fixed number of coin flips
- The formula follows the same pattern for two flips and three flips
- An unfair coin that is biased towards heads changes these probabilities, shifting them towards outcomes with more heads
- The binomial distribution can be displayed as a bar graph, and a test rule, such as accusing a player who gets five out of five heads, can be marked on it.
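
The probabilities behind those bar graphs can be computed directly. The sketch below uses the standard binomial formula, C(n, k) * p^k * (1 - p)^(n - k); the helper name `binomial_pmf` and the 75% heads rate for the unfair coin are assumptions made for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k heads in n flips when P(heads) = p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n = 5
for label, p in [("fair", 0.5), ("unfair", 0.75)]:
    # One probability per possible head count, from 0 heads up to n heads
    probs = [binomial_pmf(k, n, p) for k in range(n + 1)]
    print(label, [round(x, 3) for x in probs])
```

Each printed list is one bar graph: the fair coin's bars are symmetric around the middle, while the unfair coin's bars pile up on the high head counts.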

“Achieving the Perfect Balance: Learn How to Catch Cheaters in Online Games”

- This video explains a test that can detect cheaters in an online game. The goal is to ensure less than 5% of innocent players are accused and at least 80% of cheaters are caught
- The result depends on which trade-offs the tester is willing to make. To meet both goals, each blob flips a coin 23 times and is accused if it gets 16 or more heads
- With 23 flips, a lower threshold pushes the false positive rate above 5%, while a higher threshold drops the share of cheaters caught below 80%. With this test, about 80% of cheaters get 16 or more heads.
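
One way to arrive at such a test is to search for the smallest number of flips that satisfies both goals at once. This is a sketch under the assumptions stated in these notes (fair coins at 50% heads, cheater coins at 75% heads); the function names and the `n_max` search cap are hypothetical.

```python
from math import comb

def tail_prob(k, n, p):
    """P(at least k heads in n flips) for a coin with P(heads) = p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def smallest_test(p_fair=0.5, p_cheat=0.75, max_fpr=0.05, min_power=0.80, n_max=100):
    """Return (flips, threshold) for the first flip count meeting both goals."""
    for n in range(1, n_max + 1):
        for k in range(n + 1):
            if tail_prob(k, n, p_fair) < max_fpr:
                # k is the lowest threshold that keeps false accusations
                # under the cap, so it catches the most cheaters for this n.
                if tail_prob(k, n, p_cheat) >= min_power:
                    return n, k
                break  # even the best threshold fails; try more flips
    return None

print(smallest_test())
```

Under these assumptions the search lands on 23 flips with a threshold of 16, in line with the test described above.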

Uncovering the Flaws of Frequentist Hypothesis Testing: An Unfair Coin Test Case

- Frequentist hypothesis testing is a framework used to answer yes/no questions
- This framework requires a model of each possible answer, a test that tells the two possibilities apart, and enough data to reach a reliable conclusion
- In this example, a model was created around two kinds of coins: one fair, and one unfair
- It was assumed that the cheater coins would come up heads 75% of the time, but in reality they came up heads only 60% of the time
- Because the test was not designed for this situation, it caught only about a quarter of the cheaters, although it still accused fewer than 5% of innocent players
- Even though this initial test failed, frequentist hypothesis testing is still used in real scientific studies.
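
The effect of the wrong assumption can be seen by evaluating the same rule (23 flips, accuse on 16 or more heads) against each kind of coin. This is a minimal sketch; the `tail_prob` helper and the dictionary labels are illustrative names.

```python
from math import comb

def tail_prob(k, n, p):
    """P(at least k heads in n flips) for a coin with P(heads) = p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

N_FLIPS, THRESHOLD = 23, 16  # the test rule from the notes above

coins = {"fair": 0.5, "assumed cheater": 0.75, "actual cheater": 0.6}
# Probability that a player with each coin reaches the threshold and is accused
rates = {label: tail_prob(THRESHOLD, N_FLIPS, p) for label, p in coins.items()}

for label, rate in rates.items():
    print(f"{label}: accused {rate:.1%} of the time")
```

The false accusation rate depends only on the fair coin, so it stays under 5% either way. What changes is the catch rate, which falls from roughly 80% for a 75% coin to roughly a quarter for a 60% coin.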