# Birthday problem

 This page uses content from Wikipedia. The current wikipedia article is at Birthday Problem. The original RosettaCode article was extracted from the wikipedia article № 296054030 of 21:44, 12 June 2009 . The list of authors can be seen in the page history. As with Rosetta Code, the pre 5 June 2009 text of Wikipedia is available under the GNU FDL. (See links for details on variance)
Birthday problem is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

In probability theory, the birthday problem, or birthday paradox This is not a paradox in the sense of leading to a logical contradiction, but is called a paradox because the mathematical truth contradicts naïve intuition: most people estimate that the chance is much lower than 50%. pertains to the probability that in a set of randomly chosen people some pair of them will have the same birthday. In a group of at least 23 randomly chosen people, there is more than 50% probability that some pair of them will both have been born on the same day. For 57 or more people, the probability is more than 99%, and it reaches 100% when the number of people reaches 366 (by the pigeon hole principle, ignoring leap years). The mathematics behind this problem leads to a well-known cryptographic attack called the birthday attack.

Using simulation, estimate the number of independent people required in a groups before we can expect a better than even chance that at least 2 independent people in a group share a common birthday. Furthermore: Simulate and thus estimate when we can expect a better than even chance that at least 3, 4 & 5 independent people of the group share a common birthday. For simplicity assume that all of the people are alive...

Suggestions for improvement
• Estimating the error in the estimate to help ensure the estimate is accurate to 4 decimal places.
• Converging to the ${\displaystyle n}$th solution using a root finding method, as opposed to using an extensive search.
• Kudos (κῦδος) for finding the solution by proof (in a programming language) rather than by construction and simulation.

## ALGOL 68

Works with: ALGOL 68 version Revision 1
Works with: ALGOL 68G version Any - tested with release algol68g-2.6.

File: Birthday_problem.a68<lang algol68>#!/usr/bin/a68g --script #

1. -*- coding: utf-8 -*- #

REAL desired probability := 0.5; # 50% #

REAL upb year = 365 + 1/4 # - 3/400 but alive, ignore those born prior to 1901 #, INT upb sample size = 100 000,

```   upb common = 5 ;
```

FORMAT name int fmt = \$g": "g(-0)"; "\$,

```      name real fmt = \$g": "g(-0,4)"; "\$,
name percent fmt = \$g": "g(-0,2)"%; "\$;
```

printf((

``` name real fmt,
"upb year",upb year,
name int fmt,
"upb common",upb common,
"upb sample size",upb sample size,
\$l\$
```

));

INT required common := 1; # initial value # FOR group size FROM required common WHILE required common <= upb common DO

``` INT sample with no required common := 0;
TO upb sample size DO
# generate sample #
[group size]INT sample;
FOR i TO UPB sample DO sample[i] := ENTIER(random * upb year) + 1 OD;
FOR birthday i TO UPB sample DO
INT birthday = sample[birthday i];
INT number in common := 1;
# special case = 1 #
IF number in common >= required common THEN
found required common
FI;
FOR birthday j FROM birthday i + 1 TO UPB sample DO
IF birthday = sample[birthday j] THEN
number in common +:= 1;
IF number in common >= required common THEN
found required common
FI
FI
OD
OD  # days in year #;
sample with no required common +:= 1;
found required common: SKIP
OD # sample size #;
REAL portion of years with required common birthdays =
(upb sample size - sample with no required common) / upb sample size;
print(".");
IF portion of years with required common birthdays > desired probability THEN
printf((
\$l\$,
name int fmt,
"required common",required common,
"group size",group size,
# "sample with no required common",sample with no required common, #
name percent fmt,
"%age of years with required common birthdays",portion of years with required common birthdays*100,
\$l\$
));
required common +:= 1
FI
```

OD # group size #</lang>Output:

```upb year: 365.2500; upb common: 5; upb sample size: 100000;
.
required common: 1; group size: 1; %age of years with required common birthdays: 100.00%;
......................
required common: 2; group size: 23; %age of years with required common birthdays: 50.71%;
.................................................................
required common: 3; group size: 88; %age of years with required common birthdays: 50.90%;
...................................................................................................
required common: 4; group size: 187; %age of years with required common birthdays: 50.25%;
...............................................................................................................................
required common: 5; group size: 314; %age of years with required common birthdays: 50.66%;
```

## Lasso

<lang Lasso>if(sys_listunboundmethods !>> 'randomgen') => { define randomgen(len::integer,max::integer)::array => { #len <= 0 ? return local(out = array) loop(#len) => { #out->insert(math_random(#max,1)) } return #out } } if(sys_listunboundmethods !>> 'hasdupe') => { define hasdupe(a::array,threshold::integer) => { with i in #a do => { #a->find(#i)->size > #threshold-1 ? return true }

return false } } local(threshold = 2) local(qty = 22, probability = 0.00, samplesize = 10000) while(#probability < 50.00) => {^ local(dupeqty = 0) loop(#samplesize) => { local(x = randomgen(#qty,365)) hasdupe(#x,#threshold) ? #dupeqty++ } #probability = (#dupeqty / decimal(#samplesize)) * 100

'Threshold: '+#threshold+', qty: '+#qty+' - probability: '+#probability+'\r' #qty += 1 ^}</lang>

Output:
```Threshold: 2, qty: 22 - probability: 47.810000
Threshold: 2, qty: 23 - probability: 51.070000

Threshold: 3, qty: 86 - probability: 48.400000
Threshold: 3, qty: 87 - probability: 49.200000
Threshold: 3, qty: 88 - probability: 52.900000

Threshold: 4, qty: 184 - probability: 48.000000
Threshold: 4, qty: 185 - probability: 49.800000
Threshold: 4, qty: 186 - probability: 49.600000
Threshold: 4, qty: 187 - probability: 48.900000
Threshold: 4, qty: 188 - probability: 50.700000

Threshold: 5, qty: 308 - probability: 48.130000
Threshold: 5, qty: 309 - probability: 48.430000
Threshold: 5, qty: 310 - probability: 48.640000
Threshold: 5, qty: 311 - probability: 49.370000
Threshold: 5, qty: 312 - probability: 49.180000
Threshold: 5, qty: 313 - probability: 49.540000
Threshold: 5, qty: 314 - probability: 50.000000
```

## PL/I

<lang PL/I>*process source attributes xref;

```bd: Proc Options(main);
/*--------------------------------------------------------------------
* 04.11.2013 Walter Pachl
* Take samp samples of groups with gs persons and check
*how many of the groups have at least match persons with same birthday
*-------------------------------------------------------------------*/
Dcl (float,random) Builtin;
Dcl samp Bin Fixed(31) Init(1000000);
Dcl arr(0:366) Bin Fixed(31);
Dcl r Bin fixed(31);
Dcl i Bin fixed(31);
Dcl ok Bin fixed(31);
Dcl g  Bin fixed(31);
Dcl gs Bin fixed(31);
Dcl match Bin fixed(31);
Dcl cnt(0:1) Bin Fixed(31);
Dcl lo(6) Bin Fixed(31) Init(0,21,85,185,311,458);
Dcl hi(6) Bin Fixed(31) Init(0,25,89,189,315,462);
Dcl rf Bin Float(63);
Dcl hits  Bin Float(63);
Dcl arrow Char(3);
Do match=2 To 6;
Put Edit(' ')(Skip,a);
Put Edit(samp,' samples. Percentage of groups with at least',
match,' matches')(Skip,f(8),a,f(2),a);
Put Edit('Group size')(Skip,a);
Do gs=lo(match) To hi(match);
cnt=0;
Do i=1 To samp;
ok=0;
arr=0;
Do g=1 To gs;
rf=random();
r=rf*365+1;
arr(r)+=1;
If arr(r)=match Then Do;
/* Put Edit(r)(Skip,f(4));*/
ok=1;
End;
End;
cnt(ok)+=1;
End;
hits=float(cnt(1))/samp;
If hits>=.5 Then arrow=' <-';
Else arrow=;
Put Edit(gs,cnt(0),cnt(1),100*hits,'%',arrow)
(Skip,f(10),2(f(7)),f(8,3),a,a);
End;
End;
End;</lang>
```

Output:

``` 1000000 samples. Percentage of groups with at least 2 matches
Group size                               3000000      500000 samples
21 556903 443097  44.310%        44.343%      44.347%
22 524741 475259  47.526%        47.549%      47.521%
23 492034 507966  50.797% <-     50.735% <-   50.722% <-
24 462172 537828  53.783% <-     53.815% <-   53.838% <-
25 431507 568493  56.849% <-     56.849% <-   56.842% <-

1000000 samples. Percentage of groups with at least 3 matches
Group size
85 523287 476713  47.671%        47.638%      47.631%
86 512219 487781  48.778%        48.776%      48.821%
87 499874 500126  50.013% <-     49.902%      49.903%
88 488197 511803  51.180% <-     51.127% <-   51.096% <-
89 478044 521956  52.196% <-     52.263% <-   52.290% <-

1000000 samples. Percentage of groups with at least 4 matches
Group size
185 511352 488648  48.865%        48.868%      48.921%
186 503888 496112  49.611%        49.601%      49.568%
187 497844 502156  50.216% <-     50.258% <-   50.297% <-
188 490490 509510  50.951% <-     50.916% <-   50.946% <-
189 482893 517107  51.711% <-     51.645% <-   51.655% <-

1000000 samples. Percentage of groups with at least 5 matches
Group size
311 508743 491257  49.126%        49.158%      49.164%
312 503524 496476  49.648%        49.631%      49.596%
313 498244 501756  50.176% <-     50.139% <-   50.095% <-
314 494032 505968  50.597% <-     50.636% <-   50.586% <-
315 489821 510179  51.018% <-     51.107% <-   51.114% <-

1000000 samples. Percentage of groups with at least 6 matches
Group size
458 505225 494775  49.478%        49.498%      49.512%
459 501871 498129  49.813%        49.893%      49.885%
460 497719 502281  50.228% <-     50.278% <-   50.248% <-
461 493948 506052  50.605% <-     50.622% <-   50.626% <-
462 489416 510584  51.058% <-     51.029% <-   51.055% <-```

extended to verify REXX results:

``` 1000000 samples. Percentage of groups with at least 7 matches
Group size
621 503758 496242  49.624%
622 500320 499680  49.968%
623 497047 502953  50.295% <-
624 493679 506321  50.632% <-
625 491240 508760  50.876% <-

1000000 samples. Percentage of groups with at least 8 matches
Group size
796 504764 495236  49.524%
797 502537 497463  49.746%
798 499488 500512  50.051% <-
799 496658 503342  50.334% <-
800 494773 505227  50.523% <-

1000000 samples. Percentage of groups with at least 9 matches
Group size
983 502613 497387  49.739%
984 501665 498335  49.834%
985 498606 501394  50.139% <-
986 497453 502547  50.255% <-
987 493816 506184  50.618% <-

1000000 samples. Percentage of groups with at least10 matches
Group size
1179 502910 497090  49.709%
1180 500906 499094  49.909%
1181 499079 500921  50.092% <-
1182 496957 503043  50.304% <-
1183 494414 505586  50.559% <-```

## Python

Note: the first (unused), version of function equal_birthdays() uses a different but equally valid interpretation of the phrase "common birthday". <lang python> from random import randint

def equal_birthdays(sharers=2, groupsize=23, rep=100000):

```   'Note: 4 sharing common birthday may have 2 dates shared between two people each'
g = range(groupsize)
sh = sharers - 1
eq = sum((groupsize - len(set(randint(1,365) for i in g)) >= sh)
for j in range(rep))
return (eq * 100.) / rep
```

def equal_birthdays(sharers=2, groupsize=23, rep=100000):

```   'Note: 4 sharing common birthday must all share same common day'
g = range(groupsize)
sh = sharers - 1
eq = 0
for j in range(rep):
group = [randint(1,365) for i in g]
if (groupsize - len(set(group)) >= sh and
any( group.count(member) >= sharers for member in set(group))):
eq += 1
return (eq * 100.) / rep
```

group_est = [2] for sharers in (2, 3, 4, 5):

```   groupsize = group_est[-1]+1
while equal_birthdays(sharers, groupsize, 100) < 50.:
# Coarse
groupsize += 1
for groupsize in range(int(groupsize - (groupsize - group_est[-1])/4.), groupsize + 999):
# Finer
eq = equal_birthdays(sharers, groupsize, 250)
if eq > 50.:
break
for groupsize in range(groupsize - 1, groupsize +999):
# Finest
eq = equal_birthdays(sharers, groupsize, 50000)
if eq > 50.:
break
group_est.append(groupsize)
print("%i independent people in a group of %s share a common birthday. (%5.1f)" % (sharers, groupsize, eq))</lang>
```
Output:
```2 independent people in a group of 23 share a common birthday. ( 50.9)
3 independent people in a group of 87 share a common birthday. ( 50.0)
4 independent people in a group of 188 share a common birthday. ( 50.9)
5 independent people in a group of 314 share a common birthday. ( 50.6)```

## REXX

The method used is to find the average number of people to share a common birthday, and then use the floor of that
value   (less the group size)   as a starting point to find a new group size with an expected size that exceeds 50%
common birthdays of the required size.

### version 1

<lang rexx>/*REXX programs solves "birthday problem" via random number simulations.*/ parse arg grps samp seed . /*get optional arguments from CL.*/ if grps== | grps==',' then grps=5 /*Not specified? Use the default*/ if samp== | samp==',' then samp=100000 /*" " " " " */ if seed\==',' & seed \== then call random ,,seed /*repeatability? */ diy =365 /*or: diy=365.25*/ /*the number of Days In a Year. */ diyM=diy*100 /*this expands the RANDOM range. */

```                                      /* [↓] get a rough estimate for %*/
do   g=2  to grps;     s=0        /*perform through 2──►group size.*/
do  samp;            @.=0       /*perform some number of trials. */
do j=1                    /*do until G dup birthdays found.*/
day=random(1,diyM) % 100  /*expand random number generation*/
@.day=@.day+1             /*record the  # common birthdays.*/
if @.day==g  then leave   /*'nuff G hits have occurred ··· */
end   /*j*/
s=s+j                           /*add # of birthday hits to sum. */
end         /*samp*/
```
```    start.g=s/samp%1-g                /*define where the try-outs start*/
end           /*g*/
```

___=' ' /*padding for easier eyeballing. */ say ___ ___ ___ 'sample size is ' samp /*show sample size of this run. */ say say ___ 'required' ___ 'group' ___ '% with required' say ___ ' common ' ___ ' size' ___ 'common birthdays' say ___ '────────' ___ '─────' ___ '────────────────'

```                                      /* [↓] where the try-outs happen.*/
do   g=2  to grps                  /*perform through 2──►group size.*/
do try=start.g;    s=0           /*perform try-outs until avg>50%.*/
do  samp;        @.=0          /*perform some number of trials. */
do j=1  for try           /*do until K dup birthdays found.*/
day=random(1,diyM) % 100  /*expand random number generation*/
@.day=@.day+1             /*record the  # common birthdays.*/
if @.day\==g then iterate /*not 'nuff G hits occurred? ··· */
s=s+1                     /*another common birthday found. */
leave                     /* ··· and stop looking for more.*/
end   /*j*/
end        /*samp*/
```
```     if s/samp>.5  then leave         /*if the average is > 50%, stop. */
end          /*try*/
```
```   say ___ center(g,8) ___  right(try,5)  ___ center((s/samp*100)'%',16)
end            /*g*/
/*stick a fork in it, we're done.*/</lang>
```

output when using the input of:   10

```                  sample size is  100000

required       group       %  with required
common         size       common birthdays
────────       ─────       ────────────────
2              23           50.70900%
3              88           51.23200%
4             187           50.15100%
5             314           50.77800%
6             460           50.00600%
7             623           50.64800%
8             798           50.00700%
9             985           50.13400%
10           1181           50.22200%
```

### version 2

<lang rexx> /*--------------------------------------------------------------------

```* 04.11.2013 Walter Pachl translated from PL/I
* Take samp samples of groups with gs persons and check
*how many of the groups have at least match persons with same birthday
*-------------------------------------------------------------------*/
samp=100000
lo='0 21 85 185 311 458'
hi='0 25 89 189 315 462'
Do match=2 To 6
Say ' '
Say samp' samples . Percentage of groups with at least',
match ' matches'
Say 'Group size'
Do gs=word(lo,match) To word(hi,match)
cnt.=0
Do i=1 To samp
ok=0
arr.=0
Do g=1 To gs
r=random(1,365)
arr.r=arr.r+1
If arr.r=match Then
ok=1
End
cnt.ok=cnt.ok+1
End
hits=cnt.1/samp
If hits>=.5 Then arrow=' <-'
Else arrow=
Say format(gs,10) cnt.0 cnt.1 100*hits||'%'||arrow
End
End</lang>
```

Output:

```100000 samples . Percentage of groups with at least 2  matches
Group size
21 55737 44263 44.26300%
22 52158 47842 47.84200%
23 49141 50859 50.85900% <-
24 46227 53773 53.77300% <-
25 43091 56909 56.90900% <-

100000 samples . Percentage of groups with at least 3  matches
Group size
85 52193 47807 47.80700%
86 51489 48511 48.51100%
87 50146 49854 49.85400%
88 48790 51210 51.2100% <-
89 47771 52229 52.22900% <-

100000 samples . Percentage of groups with at least 4  matches
Group size
185 50930 49070 49.0700%
186 50506 49494 49.49400%
187 49739 50261 50.26100% <-
188 49024 50976 50.97600% <-
189 48283 51717 51.71700% <-

100000 samples . Percentage of groups with at least 5  matches
Group size
311 50909 49091 49.09100%
312 50441 49559 49.55900%
313 49912 50088 50.08800% <-
314 49425 50575 50.57500% <-
315 48930 51070 51.0700% <-

100000 samples . Percentage of groups with at least 6  matches
Group size
458 50580 49420 49.4200%
459 49848 50152 50.15200% <-
460 49975 50025 50.02500% <-
461 49316 50684 50.68400% <-
462 49121 50879 50.87900% <-```