by Charlton Rose
We’re pregnant, and we don’t know the gender of our baby (codenamed “Sharky 2.1”). However, a bag of coins, assembled by an accomplice, does. Inside the bag are 2 coins representing the correct gender and 1 coin representing the wrong gender.[1] Each night, we draw a coin from the bag at random, learn its identity,[2] and return it to the bag. As a result, each night, we have a slightly better clue about the gender of our child – but we may never be 100% sure.
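To make the nightly ritual concrete, here is a minimal Python sketch of the bag and a draw with replacement. The function names, the `"boy"`/`"girl"` labels, and the choice of "boy" as the true answer are assumptions made purely for illustration.

```python
import random

def make_bag(true_gender):
    """Build the three-coin bag: two coins for the true gender, one for the other."""
    other = "girl" if true_gender == "boy" else "boy"
    return [true_gender, true_gender, other]

def draw(bag, rng):
    """Pick one coin at random; the bag is left unchanged (drawing with replacement)."""
    return rng.choice(bag)

rng = random.Random(2013)   # fixed seed so the run is repeatable
bag = make_bag("boy")       # hypothetical: pretend the accomplice knows it's a boy
history = [draw(bag, rng) for _ in range(10)]
print(history)
```

Each draw names the true gender with probability 2/3, which is why a single night's coin is only a clue and not an answer.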
We wondered what we could assert, statistically, about the probability of our child being a certain gender, based on our draw history. To satisfy this curiosity, I constructed the following derivation. This is our “gender estimation function.”
Consider the following events:
$B$ = It’s a boy!
$G$ = It’s a girl!
$D_{b,g}$ = A random drawing of $b + g$ coins, with replacement, produces $b$ boy coins and $g$ girl coins.
If we use the notation $P(X)$ to indicate the probability that event $X$ occurs, then we can express
$P(B)$ = the probability it's a boy
$P(G)$ = the probability it's a girl
$P(D_{b,g})$ = the probability that a random drawing of $b + g$ coins, with replacement, produces $b$ boy coins and $g$ girl coins
Another useful notation, $P(X \mid Y)$, denotes the probability that $X$ occurs, given that $Y$ occurs. Thus, as we successively draw coins, we are interested in determining
$P(B \mid D_{b,g})$ = the probability it's a boy, given that when we draw $b + g$ coins, we draw $b$ boy coins and $g$ girl coins
Lest we cause a misunderstanding that we are boy-focused, we proclaim that we are also aware that
$$P(G \mid D_{b,g}) = 1 - P(B \mid D_{b,g}),$$
based on a crazy notion[3] that $P(B) + P(G) = 1$.
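A useful theorem, known as Bayes’ theorem, proposes that[4]
$$P(X \mid Y) = \frac{P(Y \mid X)\,P(X)}{P(Y)}.$$
Thus, we can say
$$P(B \mid D_{b,g}) = \frac{P(D_{b,g} \mid B)\,P(B)}{P(D_{b,g})}.$$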
We’re going to tackle this formula using two somewhat tricky substitutions.
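$P(D_{b,g})$ is difficult to evaluate directly, because we don’t know the distribution of coins in the purse. However, to our rescue comes the law of total probability, which tells us
$$P(D_{b,g}) = P(D_{b,g} \mid B)\,P(B) + P(D_{b,g} \mid G)\,P(G).$$
This is true as long as we know that $P(B) + P(G) = 1$.[5] Substituting, then, we can arrive at
$$P(B \mid D_{b,g}) = \frac{P(D_{b,g} \mid B)\,P(B)}{P(D_{b,g} \mid B)\,P(B) + P(D_{b,g} \mid G)\,P(G)}.$$
Now, if we are willing to assume that $P(B) = P(G)$, we can simplify this to
$$P(B \mid D_{b,g}) = \frac{P(D_{b,g} \mid B)}{P(D_{b,g} \mid B) + P(D_{b,g} \mid G)}.$$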
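As a quick sanity check on that simplification, the sketch below computes the posterior both ways for made-up likelihood values. The function names and the numbers 0.3 and 0.1 are hypothetical and serve only to show that equal priors cancel out.

```python
from math import isclose

def posterior_boy(like_boy, like_girl, prior_boy=0.5):
    """Bayes' theorem, with the evidence expanded by the law of total probability."""
    prior_girl = 1.0 - prior_boy  # assumes boy and girl are the only outcomes
    evidence = like_boy * prior_boy + like_girl * prior_girl
    return like_boy * prior_boy / evidence

def posterior_boy_equal_priors(like_boy, like_girl):
    """Simplified form that holds only when both priors are equal."""
    return like_boy / (like_boy + like_girl)

# Hypothetical likelihoods, chosen only for illustration.
full = posterior_boy(0.3, 0.1)
simple = posterior_boy_equal_priors(0.3, 0.1)
print(full, simple)  # both are approximately 0.75
```

With any prior other than 0.5, the two functions disagree, which is why the equal-prior assumption has to be stated explicitly.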
Next, $P(D_{b,g} \mid B)$ can be resolved by observing that the number of boy coins in $D_{b,g}$ follows a binomial distribution.
A “binomial distribution is the discrete probability distribution of the number of successes in a sequence of $n$ independent yes/no experiments, each of which yields success with probability $p$.”[6] The probability mass function for a binomial distribution is
$$P(K = k) = \binom{n}{k}\,p^k\,(1 - p)^{n - k},$$
where
$K$ is the variable that follows a binomial distribution and indicates the number of successes,
$k$ is the exact number of successes sought,
$n$ is the number of trials, and
$p$ is the probability of success.
The notation $\binom{n}{k}$ is called the "binomial coefficient." It is read, "$n$ choose $k$," and is evaluated as $\frac{n!}{k!\,(n - k)!}$, but this detail won’t matter by the time we’re done.
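The probability mass function above can be sketched in a few lines of Python. The name `binomial_pmf` is an assumption for illustration; `math.comb` (Python 3.8+) computes the binomial coefficient.

```python
from math import comb, isclose

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Example: probability of exactly 2 successes in 3 trials when p = 2/3.
example = binomial_pmf(2, 3, 2 / 3)
print(example)  # approximately 0.4444, i.e. 4/9

# The probabilities over every possible k should add up to 1.
total = sum(binomial_pmf(k, 3, 2 / 3) for k in range(4))
print(total)
```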
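In our application, we can define drawing a boy coin as the successful event,[7] and then note that
$$n = b + g, \qquad k = b, \qquad p = \tfrac{2}{3} \text{ if it's a boy}.$$
Thus, we can infer that
$$P(D_{b,g} \mid B) = \binom{b+g}{b} \left(\tfrac{2}{3}\right)^{b} \left(\tfrac{1}{3}\right)^{g}.$$
Using similar logic, with $p = \tfrac{1}{3}$ if it's a girl, we can also infer the other side of the coin (so to speak):
$$P(D_{b,g} \mid G) = \binom{b+g}{b} \left(\tfrac{1}{3}\right)^{b} \left(\tfrac{2}{3}\right)^{g}.$$
Substituting both of these expressions into our current expression for $P(B \mid D_{b,g})$, we get
$$P(B \mid D_{b,g}) = \frac{\binom{b+g}{b} \left(\tfrac{2}{3}\right)^{b} \left(\tfrac{1}{3}\right)^{g}}{\binom{b+g}{b} \left(\tfrac{2}{3}\right)^{b} \left(\tfrac{1}{3}\right)^{g} + \binom{b+g}{b} \left(\tfrac{1}{3}\right)^{b} \left(\tfrac{2}{3}\right)^{g}}.$$
By letting the binomial coefficient cancel out, we can simplify this to
$$P(B \mid D_{b,g}) = \frac{\left(\tfrac{2}{3}\right)^{b} \left(\tfrac{1}{3}\right)^{g}}{\left(\tfrac{2}{3}\right)^{b} \left(\tfrac{1}{3}\right)^{g} + \left(\tfrac{1}{3}\right)^{b} \left(\tfrac{2}{3}\right)^{g}}.$$
Further simplification (multiplying the numerator and denominator by $3^{b+g}$) gives us
$$P(B \mid D_{b,g}) = \frac{2^{b}}{2^{b} + 2^{g}}.$$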
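The gender estimation function can be sketched in Python as follows. This is one possible rendering under the assumptions used in the derivation (a 2-to-1 bag and equal priors); the function names are hypothetical. Exact fractions are used so the long form and the simplified form can be compared without rounding error.

```python
from fractions import Fraction
from math import comb

def likelihood(b, g, p_boy_coin):
    """Binomial probability of drawing b boy coins and g girl coins."""
    return comb(b + g, b) * p_boy_coin**b * (1 - p_boy_coin)**g

def p_boy_long_form(b, g):
    """Posterior from the two binomial likelihoods, assuming equal priors."""
    like_if_boy = likelihood(b, g, Fraction(2, 3))
    like_if_girl = likelihood(b, g, Fraction(1, 3))
    return like_if_boy / (like_if_boy + like_if_girl)

def p_boy(b, g):
    """Simplified form: 2^b / (2^b + 2^g)."""
    return Fraction(2**b, 2**b + 2**g)

print(p_boy(5, 3))            # 4/5
print(p_boy_long_form(5, 3))  # 4/5
print(p_boy(4, 4))            # 1/2
```

Only the difference between the two tallies matters: 5 boy coins and 3 girl coins give the same estimate as 2 boy coins and 0 girl coins.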
Now we have everything we need to determine gender probabilities based on our growing tally of observed boy coins and girl coins. As the total number of draws grows large, it is reasonable to expect that $b \approx 2g$ or $g \approx 2b$. When these ratios hold, the value of $P(B \mid D_{b,g})$ asymptotically approaches 1.0 and 0.0, respectively – suggesting that a near-perfect certainty about our baby’s gender will develop over time. Certainly, after 9 months of drawing, we’ll know for sure.
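To watch that certainty develop, here is a small simulation sketch. It assumes, hypothetically, that the baby is a boy, that the bag holds two boy coins and one girl coin, and that we draw once a night for 270 nights; the function names and the seed are arbitrary choices for illustration.

```python
import random

def estimate_boy(b, g):
    """Probability it's a boy after b boy coins and g girl coins (equal priors assumed)."""
    return 1.0 / (1.0 + 2.0 ** (g - b))  # same value as 2^b / (2^b + 2^g)

def simulate(nights, rng):
    """Draw nightly from a bag with 2 boy coins and 1 girl coin; return the tallies and estimates."""
    bag = ["boy", "boy", "girl"]  # hypothetical truth: it's a boy
    b = g = 0
    estimates = []
    for _ in range(nights):
        if rng.choice(bag) == "boy":
            b += 1
        else:
            g += 1
        estimates.append(estimate_boy(b, g))
    return b, g, estimates

b, g, estimates = simulate(270, random.Random(2013))
print(b, g)
print(estimates[6], estimates[29], estimates[-1])  # after 1 week, 1 month, 9 months
```

Because the estimate depends only on the lead of one tally over the other, a lead of just 10 coins already puts the estimate above 0.999.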
© 2013 Charlton Rose. All rights reserved.