Verify distribution uniformity/Naive: Difference between revisions

m (Adjust categorization)
(Remove chi-squared stuff to another task, add cross-link)
Line 14: Line 14:


Show the distribution checker working when the produced distribution is flat enough and when it is not. (Use a generator from [[Seven-dice from Five-dice]]).

'''See also:'''
*[[Verify Distribution Uniformity with Chi-Squared Test]]


=={{header|Python}}==
Line 64: Line 67:
0 10003 1 9851 2 10058 3 10193 4 10126 5 10002 6 9852 7 9964 8 9957 9 9994
<span style="color:red">distribution potentially skewed for 0: expected around 50000, got 94873</span>

An alternative is to use the [[wp:Pearson's chi-square test|<math>\chi^2</math> test]] to check whether the observed counts are consistent with the hypothesis that the data is uniformly distributed.

{{works with|Tcl|8.5}}
{{libheader|tcllib}}
<lang tcl>package require math::statistics

# Pearson's chi-squared test for uniformity.
# distribution: a dict mapping each outcome to its observed count.
# Returns true when uniformity is not rejected at the given significance level.
proc isUniform {distribution {significance 0.05}} {
    set count [tcl::mathop::+ {*}[dict values $distribution]]
    set expected [expr {double($count) / [dict size $distribution]}]
    # Test statistic: sum over outcomes of (observed - expected)^2 / expected
    set X2 0.0
    foreach value [dict values $distribution] {
        set X2 [expr {$X2 + ($value - $expected)**2 / $expected}]
    }
    set freedom [expr {[dict size $distribution] - 1}]
    # p-value: upper-tail probability of the chi-squared distribution at X2
    set p [expr {1.0 - [::math::statistics::cdf-chisquare $freedom $X2]}]
    expr {$p > $significance}
}</lang>
Computing the distribution to check is trivial (it is part of <code>distcheck</code>), so it is omitted here for clarity.
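
For reference, the quantity that <code>isUniform</code> accumulates in <code>X2</code> can be written out as below. The symbols <math>O_i</math> (observed count of the <math>i</math>-th outcome), <math>k</math> (number of distinct outcomes) and <math>N</math> (total number of samples) are notation assumed here for illustration; they do not appear in the code, where <math>E</math> corresponds to <code>expected</code>.

:<math>X^2 = \sum_{i=1}^{k} \frac{(O_i - E)^2}{E}, \qquad E = \frac{N}{k}</math>

If the data really is uniformly distributed, <math>X^2</math> approximately follows a <math>\chi^2</math> distribution with <math>k - 1</math> degrees of freedom (the value stored in <code>freedom</code>), so an unusually large <math>X^2</math> is evidence against uniformity.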