Talk:One of n lines in a file: Difference between revisions

:: You raise a very good point. It seems I was incorrect in adding that particular assumption. Taking that misstep into account, the distribution of line choices does follow a normal distribution when the process is repeated for each line of the file. Thanks for helping me see the light ;). --[[User:Demivec|Demivec]] 18:04, 11 December 2011 (UTC)
 
:::Hi Demivec, there shouldn't be a [[wp:Normal distribution|normal distribution]]. The counts of how many times a particular line is chosen should be roughly the same (as shown by the other examples). --[[User:Paddy3118|Paddy3118]] 18:16, 11 December 2011 (UTC)
 
::: It's supposed to give a uniform distribution ("normal distribution" means something entirely different) among all lines. Think of it this way: when you read in the n-th line, it has a 1/n chance of becoming the selected one, overriding the previous selection. Meanwhile, lines 1 to n-1 each had a 1/(n-1) chance of being selected so far and keep that selection with probability (n-1)/n, so each of them also ends up selected with probability 1/n. If you are familiar with mathematical induction, it's pretty easy to prove. If you are not, try the simple cases of n=2 and n=3 by hand, and you'll see. --[[User:Ledrug|Ledrug]] 18:11, 11 December 2011 (UTC)
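
For anyone who wants to check the uniformity claim empirically rather than by induction, here is a minimal Python sketch of the selection rule described above. The names <code>one_of_n</code> and <code>tally</code>, the fixed seed, and the trial count are illustrative assumptions, not taken from any of the task's posted solutions.

```python
import random
from collections import Counter


def one_of_n(lines, rng=random):
    """Return one item of `lines`, chosen in a single pass.

    Hypothetical helper: the i-th item replaces the current choice
    with probability 1/i, as described in the discussion above.
    """
    chosen = None
    for i, line in enumerate(lines, start=1):
        # randrange(i) is 0 with probability 1/i.
        if rng.randrange(i) == 0:
            chosen = line
    return chosen


def tally(n, trials, seed=0):
    """Count how often each of the line numbers 1..n is chosen.

    The seed is an arbitrary assumption, used only for repeatability.
    """
    rng = random.Random(seed)
    return Counter(one_of_n(range(1, n + 1), rng) for _ in range(trials))


if __name__ == "__main__":
    counts = tally(10, 100_000)
    for line_number in sorted(counts):
        print(line_number, counts[line_number])
```

If the reasoning above holds, each of the 10 counts should come out close to 10,000, i.e. roughly flat rather than bell-shaped.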