Statistics/Chi-squared distribution: Difference between revisions

Python example
m (change function names to reflect text at top)
(Python example)
Line 163:
 
For the airport data, diff total is 4.512820512820512, χ2 is 0.08875392598443503, p value 0.7888504263193064
</pre>
 
 
=={{header|Python}}==
<syntaxhighlight lang="python">''' rosettacode.org/wiki/Statistics/Chi-squared_distribution#Python '''


from math import gamma, exp
from scipy.special import gammainc


def χ2(x, k):
    ''' Chi-squared function, the probability density function (pdf) for chi-squared '''
    return x ** (k/2 - 1) * exp(-x/2) / (2 ** (k/2) * gamma(k / 2)) if x > 0 else 0.0


def cdf_χ2(x, k):
    ''' Cumulative probability function (cdf) for chi-squared '''
    if x <= 0 or k <= 0:
        return 0
    # scipy's gammainc is the regularized lower incomplete gamma function
    return gammainc(k / 2, x / 2)


print('x χ2 k = 1 k = 2 k = 3 k = 4 k = 5')
print('-' * 93)
for x in range(11):
    print(f'{x:2}', end='')
    for k in range(1, 6):
        print(f'{χ2(x, k):16.8}', end='\n' if k % 5 == 0 else '')


print('\nχ2 x P value (df=3)\n----------------------')
for p in [1, 2, 4, 8, 16, 32]:
    # upper-tail probability: 1 - cdf
    print(f'{p:2}', ' ', 1.0 - cdf_χ2(p, 3))


AIRPORT_DATA = [[77, 23], [88, 12], [79, 21], [81, 19]]

EXPECTED = [[81.25, 18.75],
            [81.25, 18.75],
            [81.25, 18.75],
            [81.25, 18.75]]

# sum of (observed - expected)^2 / expected over every cell
DTOTAL = sum((d[pos] - EXPECTED[i][pos])**2 / EXPECTED[i][pos]
             for i, d in enumerate(AIRPORT_DATA) for pos in [0, 1])

# Note: the figure labelled 'p value' below is cdf_χ2(DTOTAL, 3), the cumulative
# probability itself, whereas the df=3 table above prints 1 - cdf.
print(
    f'\nFor the airport data, diff total is {DTOTAL}, χ2 is {χ2(DTOTAL, 3)}, p value {cdf_χ2(DTOTAL, 3)}')
</syntaxhighlight>{{out}}
<pre>
x χ2 k = 1 k = 2 k = 3 k = 4 k = 5
---------------------------------------------------------------------------------------------
 0             0.0             0.0             0.0             0.0             0.0
 1      0.24197072      0.30326533      0.24197072      0.15163266     0.080656908
 2      0.10377687      0.18393972      0.20755375      0.18393972      0.13836917
 3     0.051393443      0.11156508      0.15418033      0.16734762      0.15418033
 4     0.026995483     0.067667642      0.10798193      0.13533528      0.14397591
 5     0.014644983     0.041042499     0.073224913      0.10260625      0.12204152
 6    0.0081086956     0.024893534     0.048652173     0.074680603     0.097304347
 7    0.0045533429     0.015098692       0.0318734     0.052845421     0.074371268
 8    0.0025833732    0.0091578194     0.020666985     0.036631278     0.055111961
 9    0.0014772828    0.0055544983     0.013295545     0.024995242     0.039886636
10   0.00085003666    0.0033689735    0.0085003666     0.016844867     0.028334555
 
χ2 x P value (df=3)
----------------------
1 0.8012519569012009
2 0.5724067044708798
4 0.26146412994911117
8 0.04601170568923141
16 0.0011339842897852837
32 5.233466447984725e-07
 
For the airport data, diff total is 4.512820512820513, χ2 is 0.088753925984435, p value 0.7888504263193064
</pre>
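
The hard-coded expected counts (81.25 and 18.75) and the reported cumulative probability can be cross-checked without SciPy. The following is a minimal sketch, not part of the submitted solution: the helper names <code>expected_counts</code>, <code>chi2_statistic</code> and <code>cdf_chi2_df3</code> are hypothetical, and the cdf helper assumes exactly 3 degrees of freedom, i.e. (4 − 1) × (2 − 1) for the 4 × 2 airport table, where the chi-squared cdf has a closed form in terms of <code>erf</code>.

```python
from math import erf, exp, pi, sqrt


def expected_counts(table):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi2_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over every cell."""
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(observed, expected)
               for o, e in zip(obs_row, exp_row))


def cdf_chi2_df3(x):
    """Closed-form chi-squared cdf, valid only for 3 degrees of freedom."""
    if x <= 0:
        return 0.0
    return erf(sqrt(x / 2)) - sqrt(2 * x / pi) * exp(-x / 2)


airport_data = [[77, 23], [88, 12], [79, 21], [81, 19]]
expected = expected_counts(airport_data)
statistic = chi2_statistic(airport_data, expected)
cdf = cdf_chi2_df3(statistic)

print('expected counts per row:', expected[0])
print('statistic:', statistic)
print('cdf:', cdf, ' upper tail (1 - cdf):', 1.0 - cdf)
```

Deriving the expected counts from the row and column totals, rather than typing them in, makes it easier to reuse the calculation on a different contingency table; for any other number of degrees of freedom the general <code>gammainc</code>-based cdf is still needed.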
 