Unbias a random generator

From Rosetta Code
Revision as of 08:20, 22 March 2011 by 79.21.5.145 (talk) (Less heavy Java code)
Task
Unbias a random generator
You are encouraged to solve this task according to the task description, using any language you may know.

Given a weighted one bit generator of random numbers where the probability of a one occuring, , is not the same as , the probability of a zero occuring, the probability of the occurrence of a one followed by a zero is . This is the same as the probability of a zero followed by a one: .

Task Details

  • Use your language's random number generator to create a function/method/subroutine/... randN that returns a one or a zero, but with one occurring, on average, 1 out of N times, where N is an integer from the range 3 to 6 inclusive.
  • Create a function unbiased that uses only randN as its source of randomness to become an unbiased generator of random ones and zeroes.
  • For N over its range, generate and show counts of the outputs of randN and unbiased(randN).

The actual unbiasing should be done by generating two numbers at a time from randN and only returning a 1 or 0 if they are different. As long as you always return the first number or always return the second number, the probabilities discussed above should take over the biased probability of randN.

D

<lang d>import std.stdio, std.random;

bool biased(int n) {

   return uniform(0.0, 1.0) < (1.0 / n);

}

bool unbiased(int n) {

   bool a, b;
   do {
       a = biased(n);
       b = biased(n);
   } while (a == b);
   return a;

}

void main() {

   int M = 50_000;
   foreach (n; 3 .. 7) {
       int c1, c2;
       foreach (i; 0 .. M) {
           c1 += biased(n);
           c2 += unbiased(n);
       }
       writefln("%d: %2.2f%%  %2.2f%%", n, 100.0*c1/M, 100.0*c2/M);
   }

}</lang> Output:

3: 33.12%  50.15%
4: 25.13%  50.02%
5: 19.61%  50.01%
6: 16.70%  49.98%

J

<lang j>randN=: 0 = ? unbiased=: i.@# { ::$: 2 | 0 3 -.~ _2 #.\ 4&* randN@# ]</lang>

Example use:

<lang j> randN 10#6 1 0 0 0 1 0 0 0 0 0

  unbiased 10#6

1 0 0 1 0 0 1 0 1 1</lang>

Some example counts (these are counts of the number of 1s which appear in a test involving 100 random numbers):

<lang j> +/randN 100#3 30

  +/randN 100#4

20

  +/randN 100#5

18

  +/randN 100#6

18

  +/unbiased 100#3

49

  +/unbiased 100#4

46

  +/unbiased 100#5

49

  +/unbiased 100#6

47</lang>

Note that these results are random. For example, a re-run of +/randN 100#5 gave 25 as its result, and a re-run of +/unbiased 100#5 gave 52 as its result.

Java

<lang java>public class Bias {

   public static boolean biased(int n) {
       return Math.random() < 1.0 / n;
   }
   public static boolean unbiased(int n) {
       boolean a, b;
       do {
           a = biased(n);
           b = biased(n);
       } while (a == b);
       return a;
   }
   public static void main(String[] args) {
       final int M = 50000;
       for (int n = 3; n < 7; n++) {
           int c1 = 0, c2 = 0;
           for (int i = 0; i < M; i++) {
               c1 += biased(n) ? 1 : 0;
               c2 += unbiased(n) ? 1 : 0;
           }
           System.out.format("%d: %2.2f%%  %2.2f%%\n",
                             n, 100.0*c1/M, 100.0*c2/M);
       }
   }

}</lang> Output:

3: 33,11%  50,23%
4: 24.97%  49.78%
5: 20.05%  50.00%
6: 17.00%  49.88%

PicoLisp

<lang PicoLisp>(de randN (N)

  (if (= 1 (rand 1 N)) 1 0) )

(de unbiased (N)

  (use (A B)
     (while
        (=
           (setq A (randN N))
           (setq B (randN N)) ) )
     A ) )</lang>

Test: <lang PicoLisp>(for N (range 3 6)

  (tab (2 1 7 2 7 2)
     N ":"
     (format
        (let S 0 (do 10000 (inc 'S (randN N))))
        2 )
     "%"
     (format
        (let S 0 (do 10000 (inc 'S (unbiased N))))
        2 )
     "%" ) )</lang>

Output:

 3:  33.21 %  50.48 %
 4:  25.06 %  49.79 %
 5:  20.04 %  49.75 %
 6:  16.32 %  49.02 %

Python

<lang python>from __future__ import print_function import random

def randN(N):

   " 1,0 random generator factory with 1 appearing 1/N'th of the time"
   choose = [1] + [0] * (N-1)
   def rand():
       return random.choice( rand.choose )
   rand.choose = choose
   return rand

def unbiased(biased):

   'uses a biased() generator of 1 or 0, to create an unbiased one'
   this, that = biased(), biased()
   while this == that: # Loop until 10 or 01
       this, that = biased(), biased()
   return this         # return the first

if __name__ == '__main__':

   from collections import namedtuple
   Stats = namedtuple('Stats', 'count1 count0 percent')
   for N in range(3, 7):
       biased = randN(N)
       v = [biased() for x in range(1000000)]
       v1, v0 = v.count(1), v.count(0)
       print ( "Biased(%i)  = %r" % (N, Stats(v1, v0, 100. * v1/(v1 + v0))) )
       v = [unbiased(biased) for x in range(1000000)]
       v1, v0 = v.count(1), v.count(0)
       print ( "  Unbiased = %r" % (Stats(v1, v0, 100. * v1/(v1 + v0)), ) )</lang>

Sample output

Biased(3)  = Stats(count1=331800, count0=668200, percent=33.18)
  Unbiased = Stats(count1=499740, count0=500260, percent=49.973999999999997)
Biased(4)  = Stats(count1=249770, count0=750230, percent=24.977)
  Unbiased = Stats(count1=499707, count0=500293, percent=49.970700000000001)
Biased(5)  = Stats(count1=199764, count0=800236, percent=19.976400000000002)
  Unbiased = Stats(count1=499456, count0=500544, percent=49.945599999999999)
Biased(6)  = Stats(count1=167561, count0=832439, percent=16.7561)
  Unbiased = Stats(count1=499963, count0=500037, percent=49.996299999999998)

Tcl

<lang tcl># 1,0 random generator factory with 1 appearing 1/N'th of the time proc randN n {expr {rand()*$n < 1}}

  1. uses a biased generator of 1 or 0, to create an unbiased one

proc unbiased {biased} {

   while 1 {

if {[set a [eval $biased]] != [eval $biased]} {return $a}

   }

}

for {set n 3} {$n <= 6} {incr n} {

   set biased [list randN $n]
   for {set i 0;array set c {0 0 1 0}} {$i < 1000000} {incr i} {

incr c([eval $biased])

   }
   puts [format "  biased %d => #0=%d #1=%d ratio=%.2f%%" $n $c(0) $c(1) \

[expr {100.*$c(1)/$i}]]

   for {set i 0;array set c {0 0 1 0}} {$i < 1000000} {incr i} {

incr c([unbiased $biased])

   }
   puts [format "unbiased %d => #0=%d #1=%d ratio=%.2f%%" $n $c(0) $c(1) \

[expr {100.*$c(1)/$i}]] }</lang> Sample output:

  biased 3 => #0=667076 #1=332924 ratio=33.29%
unbiased 3 => #0=500263 #1=499737 ratio=49.97%
  biased 4 => #0=750470 #1=249530 ratio=24.95%
unbiased 4 => #0=500644 #1=499356 ratio=49.94%
  biased 5 => #0=800243 #1=199757 ratio=19.98%
unbiased 5 => #0=500878 #1=499122 ratio=49.91%
  biased 6 => #0=833623 #1=166377 ratio=16.64%
unbiased 6 => #0=500518 #1=499482 ratio=49.95%