Talk:Permutation test

From Rosetta Code

Difference in results?

I see that the Tcl and Ursala code seem to be calculating different results. How can this be the case? I've been careful to check that the number of cases generated in the Tcl code is the correct one (92378 for selecting 10 from 19) so I think that's correct... –Donal Fellows 14:48, 1 February 2011 (UTC)

I am getting different results from everyone else, too. I have 80551 cases where the difference in means is less than or equal to the result's difference in means (note that this includes the result itself), and 11827 cases where the difference in means is greater than the result's difference in means. My smallest difference in means is -0.518111 and my largest is 0.501556. My difference in means for the original result is 0.153222. Since other people are getting different numbers, I am curious about what these statistics look like for them (or whether I have made a mistake -- which should show up in these stats). --Rdm 20:42, 3 February 2011 (UTC)
Ah, I think I see: I have 313 cases which are equal to my result, which is 0.33% of the total. I am using an epsilon of approximately 2e-16 (which is much tighter than experimental accuracy). Differing epsilons, or differences in floating point implementations on systems without an epsilon, are enough to account for the differences I currently see on the task page. --Rdm 21:23, 3 February 2011 (UTC)
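To make the three-way tally above easier to compare across implementations, here is a minimal Python sketch of the counting. It is not any contributor's actual code: the function name `permutation_counts`, the use of an absolute tolerance, and the default `eps=2e-16` (borrowed from the figure quoted above) are all assumptions, and the sample data in the usage note is hypothetical.

```python
from itertools import combinations


def permutation_counts(treatment, control, eps=2e-16):
    """Tally regroupings by how their difference in means compares
    with the observed one: (below, equal within eps, above).

    eps is an ABSOLUTE tolerance here, which is an assumption; other
    systems may use a relative tolerance or none at all.
    """
    pooled = list(treatment) + list(control)
    n_t = len(treatment)
    n_c = len(pooled) - n_t
    total = sum(pooled)

    def diff(t_sum):
        # Mean of the chosen group minus mean of the remainder.
        return t_sum / n_t - (total - t_sum) / n_c

    observed = diff(sum(treatment))
    below = equal = above = 0
    # Choose by index so repeated sample values count as distinct points.
    for idx in combinations(range(len(pooled)), n_t):
        d = diff(sum(pooled[i] for i in idx))
        if abs(d - observed) <= eps:
            equal += 1
        elif d > observed:
            above += 1
        else:
            below += 1
    return below, equal, above
```

For example, `permutation_counts([1, 2], [3, 4])` walks all 6 ways of choosing 2 of the 4 points. Whether the "equal" bucket is reported with the "less than" or the "greater than" cases is exactly the kind of choice that shifts the final percentages, so comparing the three numbers separately may help locate the discrepancy.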

Name of task

Shouldn't this be all combinations instead of all permutations? What difference does the order of members within the control or treatment group make (other than extra computation time)? --Rdm 16:08, 1 February 2011 (UTC)

I'm currently using "all combinations", which is reasonably fast (though I'm careful to treat each sample point independently, just in case there are repeated samples). All permutations is much slower (or it is if you generate them directly), and I'm not sure how much difference it would make to the result, since you'd effectively just be multiplying each count by 10! × 9! (== 1316818944000 in this example; we'd be waiting a while for the result) and then dropping that factor anyway when you average out. –Donal Fellows 21:46, 1 February 2011 (UTC)
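The counts in this thread can be checked with a short Python sketch. The variable names are illustrative only; the group sizes (10 chosen from 19) come from the discussion above, and the factor 1316818944000 is read here as 10! × 9!, which is what the arithmetic gives.

```python
from math import comb, factorial

n_total = 19  # pooled sample size mentioned in the thread
n_group = 10  # size of the group being selected

# Number of distinct splits into a group of 10 and a group of 9.
n_combinations = comb(n_total, n_group)

# Orderings within the two groups; these do not change either mean.
orderings_per_split = factorial(n_group) * factorial(n_total - n_group)

# Every permutation of the 19 points is one split plus one internal ordering.
n_permutations = factorial(n_total)

print(n_combinations)       # 92378
print(orderings_per_split)  # 1316818944000
print(n_permutations == n_combinations * orderings_per_split)  # True
```

Because every split is repeated the same number of times, that factor cancels when the counts are turned into proportions, so enumerating combinations should give the same percentages as enumerating permutations.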