Birthday problem: Difference between revisions
Walterpachl (talk | contribs) m (→{{header|REXX}}: oops) |
m (→{{header|REXX}}: optimized for speed because of large loop) |
||
Line 100: | Line 100: | ||
The method used: using simulation, this program finds (and records) when a pseudo-random number is used (in a large sample size trial) and finds a number of people (group size of independent people) which has been reached that share a common birthday with a probability that exceeds 50%. This whole number is then averaged with the multiple finds and then interpolation is used in the result shown. |
The method used: using simulation, this program finds (and records) when a pseudo-random number is used (in a large sample size trial) and finds a number of people (group size of independent people) which has been reached that share a common birthday with a probability that exceeds 50%. This whole number is then averaged with the multiple finds and then interpolation is used in the result shown. |
||
<lang rexx>/*REXX programs solves "birthday problem" via random number simulations.*/ |
<lang rexx>/*REXX programs solves "birthday problem" via random number simulations.*/ |
||
parse arg |
parse arg grps samp seed . /*get optional arguments from CL */ |
||
if |
if grps=='' | grps==',' then grps=5 /*Not specified? Use the default*/ |
||
if samp=='' | samp==',' then samp=100000 /*" " " " " */ |
if samp=='' | samp==',' then samp=100000 /*" " " " " */ |
||
if seed\==',' & seed \=='' then call random ,,seed /*repeatability? */ |
if seed\==',' & seed \=='' then call random ,,seed /*repeatability? */ |
||
!.=0 /*used for tracking pop averages.*/ |
|||
diy=365 /*or: diy=365.25*/ /*the number of Days In a Year. */ |
|||
⚫ | |||
do |
do k=2 to grps; s=0 /*perform through 2──►group size.*/ |
||
do |
do samp /*perform some number of trials. */ |
||
overH=0 /*require that we need over 50%. */ |
|||
@.=0 /*initialize birthday occurrences*/ |
@.=0 /*initialize birthday occurrences*/ |
||
do j=1 /* |
do j=1 /*do until K dup birthdays found.*/ |
||
day=random(1, |
day=random(1,diy*100) % 100 /*expand random number generation*/ |
||
@.day=@.day+1 /* |
@.day=@.day+1 /*record the of a particular Bday*/ |
||
if @.day==k then |
if @.day==k then leave /*when K hits have occurred ...*/ |
||
if overH then leave /*we reached > 50%.*/ |
|||
⚫ | |||
⚫ | |||
end |
|||
end /*j*/ |
end /*j*/ |
||
s=s+j /*add number of people to sum. */ |
|||
end /*samp*/ |
|||
⚫ | |||
end /*k*/ |
|||
pad=' ' /*padding for easier eyeballing. */ |
pad=' ' /*padding for easier eyeballing. */ |
||
Line 130: | Line 127: | ||
say pad '────────' pad '─────' pad '────────────────' |
say pad '────────' pad '─────' pad '────────────────' |
||
do g= |
do g=2 to grps /*show all the groups simulated. */ |
||
⚫ | |||
bias=1/(1+(overH==overH)-(g==1))*100/1 /*bias: used when groupSize=1*/ |
|||
⚫ | |||
⚫ | |||
⚫ | |||
end /*g*/ |
end /*g*/ |
||
/*stick a fork in it, we're done.*/</lang> |
/*stick a fork in it, we're done.*/</lang> |
||
Line 143: | Line 139: | ||
common size common birthdays |
common size common birthdays |
||
──────── ───── ──────────────── |
──────── ───── ──────────────── |
||
1 1 100 |
|||
2 24 50.0274786 |
2 24 50.0274786 |
||
3 88 50.0083309 |
3 88 50.0083309 |
Revision as of 20:07, 4 November 2013
This page uses content from Wikipedia. The current wikipedia article is at Birthday Problem. The original RosettaCode article was extracted from the wikipedia article № 296054030 of 21:44, 12 June 2009 . The list of authors can be seen in the page history. As with Rosetta Code, the pre 5 June 2009 text of Wikipedia is available under the GNU FDL. (See links for details on variance) |
In probability theory, the birthday problem, or birthday paradox This is not a paradox in the sense of leading to a logical contradiction, but is called a paradox because the mathematical truth contradicts naïve intuition: most people estimate that the chance is much lower than 50%. pertains to the probability that in a set of randomly chosen people some pair of them will have the same birthday. In a group of at least 23 randomly chosen people, there is more than 50% probability that some pair of them will both have been born on the same day. For 57 or more people, the probability is more than 99%, and it reaches 100% when the number of people reaches 366 (by the pigeon hole principle, ignoring leap years). The mathematics behind this problem leads to a well-known cryptographic attack called the birthday attack.
- Task
Using simulation, estimate the number of independent people required in a groups before we can expect a better than even chance that at least 2 independent people in a group share a common birthday. Furthermore: Simulate and thus estimate when we can expect a better than even chance that at least 3, 4 & 5 independent people of the group share a common birthday. For simplicity assume that all of the people are alive...
- Suggestions for improvement
- Estimating the error in the estimate to help ensure the estimate is accurate to 4 decimal places.
- Converging to the th solution using a root finding method, as opposed to using an extensive search.
- Kudos (κῦδος) for finding the solution by proof (in a programming language) rather than by construction and simulation.
- See also
ALGOL 68
File: Birthday_problem.a68<lang algol68>#!/usr/bin/a68g --script #
- -*- coding: utf-8 -*- #
REAL desired probability := 0.5; # 50% #
REAL upb year = 365 + 1/4 # - 3/400 but alive, ignore those born prior to 1901 #, INT upb sample size = 100 000,
upb common = 5 ;
FORMAT name int fmt = $g": "g(-0)"; "$,
name real fmt = $g": "g(-0,4)"; "$, name percent fmt = $g": "g(-0,2)"%; "$;
printf((
name real fmt, "upb year",upb year, name int fmt, "upb common",upb common, "upb sample size",upb sample size, $l$
));
INT required common := 1; # initial value # FOR group size FROM required common WHILE required common <= upb common DO
INT sample with no required common := 0; TO upb sample size DO # generate sample # [group size]INT sample; FOR i TO UPB sample DO sample[i] := ENTIER(random * upb year) + 1 OD; FOR birthday i TO UPB sample DO INT birthday = sample[birthday i]; INT number in common := 1; # special case = 1 # IF number in common >= required common THEN found required common FI; FOR birthday j FROM birthday i + 1 TO UPB sample DO IF birthday = sample[birthday j] THEN number in common +:= 1; IF number in common >= required common THEN found required common FI FI OD OD # days in year #; sample with no required common +:= 1; found required common: SKIP OD # sample size #; REAL portion of years with required common birthdays = (upb sample size - sample with no required common) / upb sample size; print("."); IF portion of years with required common birthdays > desired probability THEN printf(( $l$, name int fmt, "required common",required common, "group size",group size, # "sample with no required common",sample with no required common, # name percent fmt, "%age of years with required common birthdays",portion of years with required common birthdays*100, $l$ )); required common +:= 1 FI
OD # group size #</lang>Output:
upb year: 365.2500; upb common: 5; upb sample size: 100000; . required common: 1; group size: 1; %age of years with required common birthdays: 100.00%; ...................... required common: 2; group size: 23; %age of years with required common birthdays: 50.71%; ................................................................. required common: 3; group size: 88; %age of years with required common birthdays: 50.90%; ................................................................................................... required common: 4; group size: 187; %age of years with required common birthdays: 50.25%; ............................................................................................................................... required common: 5; group size: 314; %age of years with required common birthdays: 50.66%;
REXX
The method used: using simulation, this program finds (and records) when a pseudo-random number is used (in a large sample size trial) and finds a number of people (group size of independent people) which has been reached that share a common birthday with a probability that exceeds 50%. This whole number is then averaged with the multiple finds and then interpolation is used in the result shown. <lang rexx>/*REXX programs solves "birthday problem" via random number simulations.*/ parse arg grps samp seed . /*get optional arguments from CL */ if grps== | grps==',' then grps=5 /*Not specified? Use the default*/ if samp== | samp==',' then samp=100000 /*" " " " " */ if seed\==',' & seed \== then call random ,,seed /*repeatability? */ !.=0 /*used for tracking pop averages.*/ diy=365 /*or: diy=365.25*/ /*the number of Days In a Year. */
/* [↑] Feb 29th's B-day ≡ Mar 1.*/ do k=2 to grps; s=0 /*perform through 2──►group size.*/ do samp /*perform some number of trials. */ @.=0 /*initialize birthday occurrences*/ do j=1 /*do until K dup birthdays found.*/ day=random(1,diy*100) % 100 /*expand random number generation*/ @.day=@.day+1 /*record the of a particular Bday*/ if @.day==k then leave /*when K hits have occurred ...*/ end /*j*/ s=s+j /*add number of people to sum. */ end /*samp*/ !.k=s /*save the total number of people*/ end /*k*/
pad=' ' /*padding for easier eyeballing. */ say ' sample size is ' samp /*show sample size of this run. */ say say pad 'required' pad 'group' pad '% with required' say pad ' common ' pad ' size' pad 'common birthdays' say pad '────────' pad '─────' pad '────────────────'
do g=2 to grps /*show all the groups simulated. */ d=!.g/samp /*average common birthday size. */ say pad center(g,8) pad right(d%1,5) pad center((50+(d-d%1)/d)'%',16) end /*g*/ /*stick a fork in it, we're done.*/</lang>
output
sample size is 100000 required group % with required common size common birthdays ──────── ───── ──────────────── 2 24 50.0274786 3 88 50.0083309 4 187 50.0016825 5 311 50.0020498