Talk:One of n lines in a file

 
The task description ''says that after each line is read a random float value is checked against the fraction 1/n, where n is the number of the line just read, to see if the process should stop or repeat''. This would mean the probability for each subsequent line is the combination of the probabilities of the lines that preceded it. The simplest case, the first line in the file, should have a 1/2 chance of being used (kept). All of the solutions (except PureBasic's) show a more or less equal distribution among the 10 lines of the simulated file. That would imply all those solutions are incorrect. --[[User:Demivec|Demivec]] 17:27, 11 December 2011 (UTC)
 
: Where did you get the "the process should stop" part? --[[User:Ledrug|Ledrug]] 17:44, 11 December 2011 (UTC)
:: You raise a very good point. It seems I was incorrect in adding that particular assumption. Taking that misstep into account, the distribution of line choices does follow a normal distribution when the process is repeated for each line of the file. Thanks for helping me see the light ;). --[[User:Demivec|Demivec]] 18:04, 11 December 2011 (UTC)
 
:::Hi Demivec, there shouldn't be a [[wp:Normal distribution|normal distribution]]. The counts of how many times a particular line is chosen should be roughly the same (as shown by the other examples). --[[User:Paddy3118|Paddy3118]] 18:16, 11 December 2011 (UTC)
 
::: It's supposed to give a uniform distribution ("normal distribution" means something entirely different) among all lines. Think of it this way: when you read in the n-th line, it has a 1/n chance of becoming the selected one, overriding the previous selection. Meanwhile, lines 1 to n-1 each have an equal chance of remaining selected, that is, 1/n. If you are familiar with mathematical induction, it's pretty easy to prove. If you are not, try the simple cases of n=2 and n=3 by hand, and you'll see. --[[User:Ledrug|Ledrug]] 18:11, 11 December 2011 (UTC)
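:::: For reference, the induction step can be written out as below. The notation <math>P_n(k)</math> is introduced here only for illustration: it stands for the probability that line k is the current selection once n lines have been read. The base case is <math>P_1(1) = 1</math>, and the step assumes <math>P_{n-1}(k) = \tfrac{1}{n-1}</math> for every <math>k \le n-1</math>.
:::: <math>P_n(n) = \frac{1}{n}, \qquad P_n(k) = P_{n-1}(k)\left(1 - \frac{1}{n}\right) = \frac{1}{n-1}\cdot\frac{n-1}{n} = \frac{1}{n} \quad (1 \le k \le n-1)</math>
:::: For n=2 this gives 1/2 for each line; for n=3 each of the first two lines keeps its 1/2 with probability 2/3, giving 1/3 each.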
 
:::: Yes, "uniform distribution" is what I should have said. It was easy enough to prove it to myself that all was in order once you pointed out my inadvertent blunder. Thanks again.--[[User:Demivec|Demivec]] 18:20, 11 December 2011 (UTC)
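
The conclusion of this thread can also be checked empirically. The following is a minimal Python sketch, not taken from any of the task's solutions; the names <code>one_of_n</code> and <code>simulate</code>, the trial count, and the seed are assumptions chosen for illustration. It reads every line, with no early stop, and lets the n-th line replace the current selection with probability 1/n.

```python
import random
from collections import Counter


def one_of_n(lines, rng=random):
    """Pick one item from an iterable in a single pass.

    The n-th item replaces the current choice with probability 1/n.
    Every item is read; there is no early exit.
    """
    chosen = None
    for n, line in enumerate(lines, start=1):
        # For n == 1 the test always succeeds, because random() < 1.0.
        if rng.random() < 1 / n:
            chosen = line
    return chosen


def simulate(n_lines=10, trials=100_000, seed=0):
    """Count how often each line number is chosen over many trials."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    counts = Counter(
        one_of_n(range(1, n_lines + 1), rng) for _ in range(trials)
    )
    return [counts[i] for i in range(1, n_lines + 1)]


if __name__ == "__main__":
    # Each of the 10 counts should be close to trials / n_lines.
    print(simulate())
```

With 10 lines and 100,000 trials, each count should land near 10,000, which matches the roughly equal counts reported by the solutions discussed above.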