Talk:Constrained random points on a circle: Difference between revisions

From Rosetta Code

Revision as of 15:43, 3 September 2010

Not 100 points

There are only 89 points in the circle shown in the Verilog example output. This is no surprise, because AFAICS the algorithm doesn't make sure that the same point isn't chosen twice. Now given that it's the first example, I guess it's what was meant by the task description, but then the task description probably should be changed to reflect the fact that fewer points are OK. --Ce 10:55, 3 September 2010 (UTC)

I've fixed the description: it would be a poor RNG that doesn't produce duplicates --Dave
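
For what it's worth, a count in the high 80s is about what duplicates alone would predict. The following is a rough Python sketch, assuming the annulus 10 <= sqrt(x^2 + y^2) <= 15 on integer coordinates and 100 independent uniform draws with replacement; the function names are made up for illustration and are not taken from any of the task's solutions.

```python
def annulus_points(r_min=10, r_max=15):
    """All integer (x, y) with r_min <= sqrt(x^2 + y^2) <= r_max."""
    return [(x, y)
            for x in range(-r_max, r_max + 1)
            for y in range(-r_max, r_max + 1)
            if r_min ** 2 <= x * x + y * y <= r_max ** 2]


def expected_distinct(n_cells, n_draws):
    """Expected number of distinct cells hit by n_draws uniform draws with replacement."""
    return n_cells * (1 - (1 - 1 / n_cells) ** n_draws)


cells = annulus_points()
print(len(cells))                                     # number of possible points
print(round(expected_distinct(len(cells), 100), 1))   # expected distinct points in 100 draws
```

Under those assumptions there are 404 possible points, and 100 draws are expected to land on roughly 88.7 distinct ones, so a plot showing 89 marks is unremarkable.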

How to check the code

If you increase the number of points produced to 10k, you should get output rather like this (generated with Tcl version; your version may differ). This lets you check that the spread of points produces the expected annulus. –Donal Fellows 11:00, 3 September 2010 (UTC)

               X               
          XXXXXXXXXXX          
        XXXXXXXXXXXXXXX        
      XXXXXXXXXXXXXXXXXXX      
     XXXXXXXXXXXXXXXXXXXXX     
    XXXXXXXXXXXXXXXXXXXXXXX    
   XXXXXXXX         XXXXXXXX   
   XXXXXXX           XXXXXXX   
  XXXXXX               XXXXXX  
  XXXXXX               XXXXXX  
 XXXXXX                 XXXXXX 
 XXXXX                   XXXXX 
 XXXXX                   XXXXX 
 XXXXX                   XXXXX 
 XXXXX                   XXXXX 
XXXXXX                   XXXXXX
 XXXXX                   XXXXX 
 XXXXX                   XXXXX 
 XXXXX                   XXXXX 
 XXXXX                   XXXXX 
 XXXXXX                 XXXXXX 
  XXXXXX               XXXXXX  
  XXXXXX               XXXXXX  
   XXXXXXX           XXXXXXX   
   XXXXXXXX         XXXXXXXX   
    XXXXXXXXXXXXXXXXXXXXXXX    
     XXXXXXXXXXXXXXXXXXXXX     
      XXXXXXXXXXXXXXXXXXX      
        XXXXXXXXXXXXXXX        
          XXXXXXXXXXX          
               X               
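
The check above can be sketched in Python as follows. This is only an illustration, assuming the same integer annulus (radii 10 to 15) and a sampler that picks uniformly from the enumerated cells; `annulus_points`, `render` and `sample_uniform` are hypothetical helper names, and the seed is arbitrary.

```python
import random


def annulus_points(r_min=10, r_max=15):
    """All integer (x, y) with r_min <= sqrt(x^2 + y^2) <= r_max."""
    return [(x, y)
            for x in range(-r_max, r_max + 1)
            for y in range(-r_max, r_max + 1)
            if r_min ** 2 <= x * x + y * y <= r_max ** 2]


def render(points, r_max=15):
    """Return text rows (top row is y = +r_max) with X for each occupied cell."""
    occupied = set(points)
    return ["".join("X" if (x, y) in occupied else " "
                    for x in range(-r_max, r_max + 1))
            for y in range(r_max, -r_max - 1, -1)]


def sample_uniform(n, rng):
    """Pick n points uniformly, with replacement, from the enumerated cells."""
    cells = annulus_points()
    return [rng.choice(cells) for _ in range(n)]


rng = random.Random(2010)  # arbitrary seed, only for repeatability
plot = render(sample_uniform(10_000, rng))
print("\n".join(plot))
# True when the 10k-point sample has filled every cell of the annulus
print(plot == render(annulus_points()))
```

To check a different solution, replace `sample_uniform` with that solution's generator and compare the rendered rows against `render(annulus_points())`; stray marks outside the ring or persistent holes inside it point to a range error.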

Uniform distribution

I'm not very good with stats, but I've seen discussions pop up before about different distributions of points. Is uniform vs normal vs (something?) a significant component of the task? How may it be verified with at most 100 points? --Michael Mol 12:27, 3 September 2010 (UTC)

I guess that if you count the number of points that lie on a line that passes through the center, then the count should not depend on the angle of that line. So it would be wrong to simply pick a uniform-random value of x and then a uniform-random value of y at that location, because that would tend to lead to a higher density of points on the left and right sides (look at the maximally dense version above: there are 12 possible values of y at x==0; but only one at x == +/- 15) -- and thus a greater number of points away from the x-axis. A common mistake when generating a set of linked random variables is that the distribution depends on the order in which they are generated.

So, take this Perl code:

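# Deliberately biased: pick a column (x) first, then a y within that column.
my @bitmap = map { " " x 32 } 0 .. 33;
for (1 .. 100) {
    my $x = int rand(31) - 15;

    # Largest and smallest |y| allowed by 100 <= x*x + y*y <= 225
    my $max = sqrt( 225-($x*$x) );
    my $min = 100-($x*$x);
    $min = $min > 0 ? sqrt $min : 0;

    # y between $min and $max, then a random sign
    my $y = int rand( 1+$max-$min ) + $min;
    $y = -$y if rand() < .5;

    # Shift into bitmap coordinates and mark the cell
    $x += 16;
    $y += 16;
    #print "$x $y\n";
    substr( $bitmap[$y], $x, 1, "#" );
}

print "$_\n" for @bitmap;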


               # #
          #    ##    #
       #   ## ##  ##   #
      #        ##  #   #
         # # # #   #  # #
      #           # #
        ##          #   #
        #
  #                      #
     #                       #
 ## #                    #
  #
 #
                           #
 ## #                         #
  #                        #
    ##                      #

                             #
 #                      #

       #                #  #
   #     #
        ## #      # #
        #     #   #
     #  # #       ##  #
        # #  #      ##
            #       #
             # #

Here you can see that the number of points near top/bottom is greater than that for left/right sides of the plot. For this particular case, the simplest way to check for the bias is to count the number of points where abs(y) > abs(x) (this effectively partitions the plot using 45 degree lines) -- for my "bad" code I see a ratio (over multiple runs) of 54:46 in favor of the top/bottom quadrants over left/right. Counted another way, 79% of random seeds result in a plot with more top/bottom points than left/right. -Dave
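
The abs(y) > abs(x) check can also be worked out exactly rather than by repeated runs. The Python sketch below is an idealised model, not a port of the Perl code above: it assumes x is drawn uniformly from the 31 columns and y uniformly from the valid cells in that column, and compares that with picking uniformly among all cells. The function names are hypothetical.

```python
from fractions import Fraction


def annulus_points(r_min=10, r_max=15):
    """All integer (x, y) with r_min <= sqrt(x^2 + y^2) <= r_max."""
    return [(x, y)
            for x in range(-r_max, r_max + 1)
            for y in range(-r_max, r_max + 1)
            if r_min ** 2 <= x * x + y * y <= r_max ** 2]


def split_uniform(cells):
    """(top/bottom, left/right) probabilities when every cell is equally likely."""
    n = len(cells)
    top = sum(1 for x, y in cells if abs(y) > abs(x))
    side = sum(1 for x, y in cells if abs(x) > abs(y))
    return Fraction(top, n), Fraction(side, n)


def split_x_then_y(cells):
    """Same split when x is drawn uniformly first, then y uniformly in that column."""
    columns = {}
    for x, y in cells:
        columns.setdefault(x, []).append(y)
    top = side = Fraction(0)
    for x, ys in columns.items():
        weight = Fraction(1, len(columns) * len(ys))  # P(x) * P(y | x)
        top += weight * sum(1 for y in ys if abs(y) > abs(x))
        side += weight * sum(1 for y in ys if abs(x) > abs(y))
    return top, side


cells = annulus_points()
print([float(p) for p in split_uniform(cells)])    # the two values are equal
print([float(p) for p in split_x_then_y(cells)])   # top/bottom exceeds left/right
```

In this model the uniform pick splits evenly by symmetry, while the x-then-y pick puts roughly 0.55 of the probability in the top/bottom quadrants against roughly 0.43 in left/right (the remainder sits on the diagonals). That is the same direction as the ratio reported above, though the exact figures need not match the Perl code's.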

An example of a more subtle error is to pick the random point using a polar coordinate system (i.e., using a random distance over the given range and a random angle). The problem is that the distribution of random points is not even w.r.t. area when picked that way; points that are closer in will be more tightly packed. It becomes much more noticeable with a wider annulus. –Donal Fellows 15:07, 3 September 2010 (UTC)
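
The polar-coordinate effect can be illustrated with a short Python sketch. It works on continuous points rather than the task's integer grid, and the function names and the 12.5 split radius are assumptions chosen for the demonstration. The usual correction, shown alongside the naive version, is to draw r^2 uniformly and take the square root.

```python
import math
import random


def polar_naive(rng, r_min=10.0, r_max=15.0):
    """Radius drawn uniformly: over-samples the inner part of the annulus."""
    r = rng.uniform(r_min, r_max)
    theta = rng.uniform(0.0, 2.0 * math.pi)
    return r * math.cos(theta), r * math.sin(theta)


def polar_area_uniform(rng, r_min=10.0, r_max=15.0):
    """Radius drawn as the square root of a uniform r^2: even with respect to area."""
    r = math.sqrt(rng.uniform(r_min ** 2, r_max ** 2))
    theta = rng.uniform(0.0, 2.0 * math.pi)
    return r * math.cos(theta), r * math.sin(theta)


def inner_fraction(sampler, n, rng, r_split=12.5):
    """Fraction of n samples whose radius is below r_split."""
    hits = sum(1 for _ in range(n) if math.hypot(*sampler(rng)) < r_split)
    return hits / n


rng = random.Random(2010)  # arbitrary seed, only for repeatability
print(inner_fraction(polar_naive, 100_000, rng))
print(inner_fraction(polar_area_uniform, 100_000, rng))
```

The inner band (radius 10 to 12.5) holds (12.5^2 - 10^2) / (15^2 - 10^2) = 0.45 of the annulus area, so an even spread should put about 45% of the points there. The naive version puts about 50% there instead, and the gap grows as the annulus gets wider.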