One of n lines in a file: Difference between revisions

From Rosetta Code
Content added Content deleted
(New draft task and Python solution.)
 
Line 20: Line 20:
Note: You may choose a smaller number of repetitions if necessary, but mention this up-front.
Note: You may choose a smaller number of repetitions if necessary, but mention this up-front.


=={{header|Perl 6}}==
{{trans|Python}}
<lang perl6>sub one_of_n($n) {
my $choice;
$choice = $_ if rand * $_ < 1 for 1 .. $n;
$choice - 1;
}

sub one_of_n_test($n = 10, $trials = 1000000) {
my @bins;
@bins[one_of_n($n)]++ for ^$trials;
@bins;
}

say one_of_n_test();</lang>
Output:
<pre>100288 100047 99660 99773 100256 99633 100161 100483 99789 99910</pre>
=={{header|Python}}==
=={{header|Python}}==
<lang python>from random import random as rnd
<lang python>from random import random as rnd

Revision as of 23:22, 7 September 2011

One of n lines in a file is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

A method of choosing a line randomly from a file:

  • Without reading the file more than once
  • When substantial parts of the file cannot be held in memory
  • Without knowing how many lines are in the file

Is to:

  • keep the first line of the file as a possible choice, then
  • Read the second line of the file if possible and make it the possible choice if a uniform random value between zero and one is less than 1/2.
  • Read the third line of the file if possible and make it the possible choice if a uniform random value between zero and one is less than 1/3.
...
  • Read the Nth line of the file if possible and make it the possible choice if a uniform random value between zero and one is less than 1/N
  • Return the computed possible choice when no further lines exist in the file.
Task
  1. Create a function/method/routine called one_of_n that given n, the number of actual lines in a file, follows the algotrithm above to return an integer - the line number of the line chosen from the file.
    The number returned can vary, randomly, in each run.
  2. Use one_of_n in a simulation to find what woud be the chosen line of a 10 line file simulated 1,000,000 times.
  3. Print and show how many times each of the 10 lines is chosen as a rough measure of how well the algorithm works.

Note: You may choose a smaller number of repetitions if necessary, but mention this up-front.

Perl 6

Translation of: Python

<lang perl6>sub one_of_n($n) {

   my $choice;
   $choice = $_ if rand * $_ < 1 for 1 .. $n;
   $choice - 1;

}

sub one_of_n_test($n = 10, $trials = 1000000) {

   my @bins;
   @bins[one_of_n($n)]++ for ^$trials;
   @bins;

}

say one_of_n_test();</lang> Output:

100288 100047 99660 99773 100256 99633 100161 100483 99789 99910

Python

<lang python>from random import random as rnd

def one_of_n(n):

   # Zero based line numbers
   choice = 0
   for i in range(1, n):
       if rnd() < 1. / (i + 1.):
           choice = i
   return choice
           

def one_of_n_test(n=10, trials=1000000):

   bins = [0] * n
   if n:
       for i in range(trials):
           bins[one_of_n(n)] += 1
   return bins

print(one_of_n_test())</lang>

Sample output
[99833, 100303, 99902, 100132, 99608, 100117, 99531, 100017, 99795, 100762]