The Birthday Problem
The birthday problem (or birthday paradox) is a classic question in probability. It asks: in a group of people, what is the probability that at least two individuals share the same birthday? Surprisingly, the probability reaches 50% with just 23 people in the group. I thought it would be cool to implement a simulation of the birthday problem. Here's what I did.
Explanation
Used directly, the naive definition says we just need to count the number of ways to assign birthdays to people such that there are two or more people who share a birthday.
But this counting problem is hard, since it could be Emma and Steve who share a birthday, or Steve and Naomi, or all three of them, or the three of them could share a birthday while two others in the group share a different birthday,
or various other possibilities.
Instead, we may find that taking the complement may be easier. Let’s denote:
\( n \) as the number of people in the room,
\( P(\text{no match}) \): the probability that all \( n \) people have unique birthdays, and
\( P(\text{at least one match}) \): the probability that at least two people share a birthday.
We start by calculating \( P(\text{no match}) \) and then use it to find \( P(\text{at least one match}) \).
Calculating \( P(\text{no match}) \)
Let's count how many ways we can assign birthdays to \( k \) people such that no two people share a birthday. The number of possibilities can be expresseed as:
\[ P(\text{no match}) = \frac{365 \cdot 364 ...(365 - k +1)}{365^k} \text{ or } \prod_{k=0}^{n-1} ( 1 - \frac{k}{365}) \], and the complement (at least one match) is: \[ P(\text{at least one match}) = 1 - \prod_{k=0}^{n-1} (1 - \frac{k}{365}) \] This is sort of a cumbersome calculation. Luckily, we can estimate the Probability of not matching for small \( k \) as the Taylor approximation of \( e^{-x} \) \( \approx e^\frac{-k}{365} \). This allows us to convert our product into a sum. \[ P(\text{no match}) = e^{-\frac{1}{365} \cdot \sum_{k=0}^{n-1} k} \implies e^{-\frac{(n-1)n}{2 \cdot 365}} \] \[ \therefore P(\text{match}) = 1 - e^{-\frac{(n-1)n}{2 \cdot 365}} \]
Surprisingly, at around \( k = 23 \) people we find that our probability of success becomes about \( \approx 0.5 \). We can verify this by actually simulating the situation in python.
Code
import random
def birthdaymatch():
birthdays = set()
count = 0
for i in range(367):
birthday = random.randint(1, 365)
count += 1
if birthday in birthdays:
return count
birthdays.add(birthday)
results = [birthdaymatch() for _ in range(100)]
print('Average number of people until match:', sum(results) / len(results))
---------------------------------------------------------------------------
Average amount of people until match: 23.64
Summary
When you run this simulation, you should find that the average number of people required to find a shared birthday is close to 23, confirming the mathematical result. The birthday problem is a great example of how probability can challenge our intuition, and it’s also a fun way to explore simulations in Python. If you’d like to try running the code, experiment with different numbers of simulations and group sizes to see how the probability behaves.