# the “Sharky 2.1” gender estimation function

by Charlton Rose

Some people can't wait to learn the gender of their unborn child. Sharky, however, appreciates the thrill of delayed gratification. So, rather than discovering the gender in a single, revelatory moment, he invented a game that allowed him and his wife to discover, with incremental levels of certainty, the gender of their child. This document describes the insane process.

We’re pregnant, and we don’t know the gender of our baby (codenamed “Sharky 2.1”). However, a bag of coins, assembled by an accomplice, does. Inside the bag are 2 coins representing the correct gender and 1 coin representing the wrong gender.1 Each night, we draw a coin from the bag at random, learn its identity,2 and return it to the bag. As a result, each night, we have a slightly better clue about the gender of our child – but we may never be 100% sure.
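The nightly game is easy to simulate. Here is a minimal Python sketch (the bag contents and names are ours, chosen for illustration; we arbitrarily assume the true gender is a boy, so the bag holds two boy coins and one girl coin):

```python
import random
from collections import Counter

def draw_nightly(bag, nights, rng):
    """Draw one coin per night, with replacement (the coin goes back in the bag)."""
    return [rng.choice(bag) for _ in range(nights)]

rng = random.Random(42)
bag = ["boy", "boy", "girl"]  # 2 coins for the true gender, 1 for the wrong one
draws = draw_nightly(bag, 1000, rng)
tally = Counter(draws)
print(tally["boy"] / len(draws))  # should hover near 2/3
```

Over many draws, the fraction of boy coins settles near 2/3 when the child is in fact a boy; that slowly accumulating signal is what the derivation below turns into a probability.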

We wondered what we could assert, statistically, about the probability of our child being a certain gender, based on our draw history. To satisfy this curiosity, I constructed the following derivation. This is our “gender estimation function.”

Consider the following events:

• $B$ = It’s a boy!
• $G$ = It’s a girl!
• ${S}_{b,g}$ = A random drawing of $b+g$ coins, with replacement, produces $b$ boy coins and $g$ girl coins.

If we use the notation $P\left(X\right)$ to indicate the probability that event $X$ occurs, then we can express

• $P\left(B\right)$ = the probability it's a boy
• $P\left(G\right)$ = the probability it's a girl
• $P\left({S}_{b,g}\right)$ = the probability that a random drawing of $b+g$ coins, with replacement, produces $b$ boy coins and $g$ girl coins

Another useful notation, $P\left(X|Y\right)$, denotes the probability that $X$ occurs, given that $Y$ occurs. Thus, as we successively draw coins, we are interested in determining

 $P\left(B|{S}_{b,g}\right)$ = the probability it's a boy, given that when we draw $b+g$ coins, we draw $b$ boy coins and $g$ girl coins

Lest we cause a misunderstanding that we are boy-focused, we proclaim that we are also aware that

$P\left(G|{S}_{b,g}\right)=1-P\left(B|{S}_{b,g}\right)$

based on a crazy notion3 that $P\left(B\right)+P\left(G\right)=1$.

A useful theorem, known as Bayes’ theorem, proposes that4

$P\left(X|Y\right)=\frac{P\left(Y|X\right)\cdot P\left(X\right)}{P\left(Y\right)}$

Thus, we can say

$P\left(B|{S}_{b,g}\right)=\frac{P\left({S}_{b,g}|B\right)\cdot P\left(B\right)}{P\left({S}_{b,g}\right)}$

We’re going to tackle this formula using two somewhat tricky substitutions.

1. $P\left({S}_{b,g}\right)$ is difficult to evaluate directly, because we don’t know the distribution of coins in the bag. However, to our rescue comes the law of total probability, which tells us

$P\left({S}_{b,g}\right)=P\left(B\right)\cdot P\left({S}_{b,g}|B\right)+P\left(G\right)\cdot P\left({S}_{b,g}|G\right)$

This is true as long as we know that $P\left(B\right)+P\left(G\right)=1$.5 Substituting, then, we can arrive at

$P\left(B|{S}_{b,g}\right)=\frac{P\left({S}_{b,g}|B\right)\cdot P\left(B\right)}{P\left(B\right)\cdot P\left({S}_{b,g}|B\right)+P\left(G\right)\cdot P\left({S}_{b,g}|G\right)}$

Now, if we are willing to assume that $P\left(B\right)=P\left(G\right)$, we can simplify this to

$P\left(B|{S}_{b,g}\right)=\frac{P\left({S}_{b,g}|B\right)\cdot P\left(B\right)}{P\left(B\right)\cdot P\left({S}_{b,g}|B\right)+P\left(B\right)\cdot P\left({S}_{b,g}|G\right)}=\frac{P\left({S}_{b,g}|B\right)}{P\left({S}_{b,g}|B\right)+P\left({S}_{b,g}|G\right)}$
2. Next, $P\left({S}_{b,g}|B\right)$ can be resolved by observing that the number of boy coins drawn in ${S}_{b,g}$ follows a binomial distribution.

A binomial distribution is “the discrete probability distribution of the number of successes in a sequence of $n$ independent yes/no experiments, each of which yields success with probability $p$.”6 The probability mass function for a binomial distribution is

$P\left(X=k\right)=\left(\genfrac{}{}{0}{}{n}{k}\right){p}^{k}{\left(1-p\right)}^{n-k}$

where

• $X$ is the variable that follows a binomial distribution and indicates the number of successes,
• $k$ is the exact number of successes sought,
• $n$ is the number of trials, and
• $p$ is the probability of success.

The notation $\left(\genfrac{}{}{0}{}{n}{k}\right)$ is called the "binomial coefficient." It is read, "$n$ choose $k$," and is evaluated as $\frac{n!}{k!\left(n-k\right)!}$, but this detail won’t matter by the time we’re done.
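The PMF is easy to evaluate numerically. A small Python sketch (the function name is ours), using `math.comb` for the binomial coefficient:

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# e.g., the chance of exactly 4 successes in 6 trials with p = 2/3:
print(binom_pmf(4, 6, 2/3))  # 15 * (2/3)**4 * (1/3)**2 = 240/729 ≈ 0.329
```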

In our application, we can define $B$ as the successful event,7 and then note that

• $k=b$,
• $n=b+g$, and
• $p=\frac{2}{3}$ when solving for $P\left({S}_{b,g}|B\right)$ (since $B$ is given).

Thus, we can infer that

$P\left({S}_{b,g}|B\right)=\left(\genfrac{}{}{0}{}{b+g}{b}\right){\left(\frac{2}{3}\right)}^{b}{\left(1-\frac{2}{3}\right)}^{b+g-b}=\left(\genfrac{}{}{0}{}{b+g}{b}\right){\left(\frac{2}{3}\right)}^{b}{\left(\frac{1}{3}\right)}^{g}$

Using similar logic, we can also infer the other side of the coin (so to speak):

$P\left({S}_{b,g}|G\right)=\left(\genfrac{}{}{0}{}{b+g}{b}\right){\left(\frac{1}{3}\right)}^{b}{\left(\frac{2}{3}\right)}^{g}$

Substituting both of these expressions into our current expression for $P\left(B|{S}_{b,g}\right)$, we get

$P\left(B|{S}_{b,g}\right)=\frac{\left(\genfrac{}{}{0}{}{b+g}{b}\right){\left(\frac{2}{3}\right)}^{b}{\left(\frac{1}{3}\right)}^{g}}{\left(\genfrac{}{}{0}{}{b+g}{b}\right){\left(\frac{2}{3}\right)}^{b}{\left(\frac{1}{3}\right)}^{g}+\left(\genfrac{}{}{0}{}{b+g}{b}\right){\left(\frac{1}{3}\right)}^{b}{\left(\frac{2}{3}\right)}^{g}}$

By letting the binomial coefficient $\left(\genfrac{}{}{0}{}{b+g}{b}\right)$ cancel out, we can simplify this to

$P\left(B|{S}_{b,g}\right)=\frac{{\left(\frac{2}{3}\right)}^{b}{\left(\frac{1}{3}\right)}^{g}}{{\left(\frac{2}{3}\right)}^{b}{\left(\frac{1}{3}\right)}^{g}+{\left(\frac{1}{3}\right)}^{b}{\left(\frac{2}{3}\right)}^{g}}$

Further simplification (multiplying the numerator and denominator by ${3}^{b+g}$) gives us

$P\left(B|{S}_{b,g}\right)=\frac{{2}^{b}}{{2}^{b}+{2}^{g}}$
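As a quick sanity check (the numbers are ours, chosen for illustration): after drawing $b=3$ boy coins and $g=1$ girl coin, the formula gives

$P\left(B|{S}_{3,1}\right)=\frac{{2}^{3}}{{2}^{3}+{2}^{1}}=\frac{8}{10}=0.8$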

Now we have everything we need to determine gender probabilities based on our growing tally of observed boy coins and girl coins. As the total number of draws grows large, it is reasonable to expect that $b\approx 2g$ or $g\approx 2b$. When these ratios hold, the value of $P\left(B|{S}_{b,g}\right)$ asymptotically approaches 1.0 and 0.0, respectively – suggesting that a near-perfect certainty about our baby’s gender will develop over time. Certainly, after 9 months of drawing, we’ll know for sure.
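The final formula is a one-liner in code. A minimal Python sketch (the function names are ours), with a cross-check against the full Bayes computation from the binomial likelihoods:

```python
from math import comb

def p_boy(b, g):
    """P(B | S_{b,g}) = 2**b / (2**b + 2**g), assuming equal priors P(B) = P(G)."""
    return 2**b / (2**b + 2**g)

def p_boy_bayes(b, g):
    """Same posterior, computed directly from the binomial likelihoods."""
    like_b = comb(b + g, b) * (2/3)**b * (1/3)**g  # P(S_{b,g} | B)
    like_g = comb(b + g, b) * (1/3)**b * (2/3)**g  # P(S_{b,g} | G)
    return like_b / (like_b + like_g)

print(p_boy(5, 2))  # 2**5 / (2**5 + 2**2) = 32/36 ≈ 0.889
```

One observation worth making: the answer depends only on the difference $b-g$, since ${2}^{b}/\left({2}^{b}+{2}^{g}\right)=1/\left(1+{2}^{g-b}\right)$, so each net extra boy coin doubles the odds in favor of a boy.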

1. The coins are \$1 coins, identical in every respect except for the printed year. The year indicates the gender.
2. Actually, we are letting the poor employees at Wendy’s, Arctic Circle, etc. observe the coin and use that observation to secretly fulfill our conditional dessert order.
3. The proof of this idea is left as an exercise for the reader. Obviously, we have ruled out alien babies.
4. Wikipedia, “Bayes' theorem” (http://en.wikipedia.org/wiki/Bayes%27_theorem). Retrieved 2013-04-09.
5. Were this not true, we would have never tried to conceive!
6. Wikipedia, “Binomial distribution” (http://en.wikipedia.org/wiki/Binomial_distribution). Retrieved 2013-04-10.
7. Not that one would be any less successful than the other, but hey, the math forced us to pick one.

© 2013 Charlton Rose. All rights reserved.