Statistics/Chi-squared distribution

From Rosetta Code
Statistics/Chi-squared distribution is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.


The probability density function (pdf) of the chi-squared (or χ2) distribution as used in statistics is

, where

Here, denotes the Gamma_function.

The use of the gamma function in the equation below reflects the chi-squared distribution's origin as a special case of the gamma distribution.

In probability theory and statistics, the gamma distribution is a two-parameter family of continuous probability distributions. The exponential distribution, Erlang distribution, and chi-square distribution are special cases of the gamma distribution.

The probability density function (pdf) of the gamma distribution is given by the formula

where Γ(k) is the Gamma_function, with shape parameter k and a scale parameter θ.

The cumulative probability distribution of the gamma distribution is the area under the curve of the distribution, which indicates the increasing probability of the x value of a single random point within the gamma distribution being less than or equal to the x value of the cumulative probability distribution. The gamma cumulative probability distribution function can be calculated as

where is the lower incomplete gamma function.

The lower incomplete gamma function can be calculated as

and so, for the chi-squared cumulative probability distribution and substituting chi-square k into s as k/2 and chi-squared x into x as x / 2,

Because series formulas may be subject to accumulated errors from rounding in the frequently used region where x and k are under 10 and near one another, you may instead use a mathematics function library, if available for your programming task, to calculate gamma and incomplete gamma.

Task
  • Calculate and show the values of the χ2(x; k) for k = 1 through 5 inclusive and x integer from 0 and through 10 inclusive.


  • Create a function to calculate the cumulative probability function for the χ2 distribution. This will need to be reasonably accurate (at least 6 digit accuracy) for k = 3.


  • Calculate and show the p values of statistical samples which result in a χ2(k = 3) value of 1, 2, 4, 8, 16, and 32. (Statistical p values can be calculated for the purpose of this task as approximately 1 - P(x), with P(x) the cumulative probability function at x for χ2.)


The following is a chart for 4 airports:


Flight Delays
Airport On Time Delayed Totals
Dallas/Fort Worth 77 23 100
Honolulu 88 12 100
LaGuardia 79 21 100
Orlando 81 19 100
All Totals 325 75 400
Expected per 100 81.25 18.75 100


χ2 on a 2D table is calculated as the sum of the squares of the differences from expected divided by the expected numbers for each entry on the table. The k for the chi-squared distribution is to be calculated as df, the degrees of freedom for the table, which is a 2D parameter, (count of airports - 1) * (count of measures per airport - 1), which here is (4 - 1 )(2 - 1) = 3.


  • Calculate the Chi-squared statistic for the above table and find its p value using your function for the cumulative probability for χ2 with k = 3 from the previous task.



Stretch task
  • Show how you could make a plot of the curves for the probability distribution function χ2(x; k) for k = 0, 1, 2, and 3.


Related Tasks
See also

[Chi Squared Test] [NIST page on the Chi-Square Distribution]





jq[edit]

Works with: jq

Also works with gojq, the Go implementation of jq.

The implementation of Chi_cdf here uses the recusive relation for the gamma function for efficiency and to simplify the avoidance of numerical overflow issues.

Formatting

def lpad($len): tostring | ($len - length) as $l | (" " * $l)[:$l] + .;

# For formatting numbers
# ... but leave non-numbers, 0, and numbers using E alone
def round($dec):
  def rpad($len): tostring | ($len - length) as $l | . + ("0" * $l);
   if type == "string" then .
   elif . == 0 then "0"
   else pow(10;$dec) as $m
   | . * $m | round / $m
   | tostring
   | (capture("(?<n>^[^.]*[.])(?<f>[0-9]*$)")
      | .n + (.f|rpad($dec)))
     // if test("^[0-9]+$") then . + "." + ("" | rpad($dec)) else null end
     // .
   end;

Chi-squared pdf and cdf

def Chi2_pdf($x; $k):
  if $x <= 0 then 0
  else # ((-$x/2)|exp) * pow($x; $k/2 -1) / (pow(2;$k/2) * (($k/2)|gamma))
         ((-$x/2) + ($k/2 -1) * ($x|log) - ( ($k/2)*(2|log) + (($k/2)|lgamma))) | exp
  end;

# $k is the degrees of freedom
# Use recursive relation for gamma: G(x+1) = x * G(x)
def Chi2_cdf($x; $k):
  if $x == 0 then 0
  elif $x > (1e3 * $k) then 1
  else 1e-15 as $tol  # for example
  | { s: 0, m: 0, term: (1 / ((($k/2)+1)|gamma)) }
  | until (.term|length < $tol;
      .s += .term
      | .m += 1
      | .term *= (($x/2) / (($k/2) + .m )) )
  | .s * ( ((-$x/2) + ($k/2)*(($x/2)|log)) | exp) 
  end ;

The Tasks

def tables:
  def r: round(6) | lpad(10);
  "    Values of the χ2 probability distribution function",
  " x/k    1         2         3         4         5",
  (range(0;11) as $x
   | "\($x|lpad(2)) \(reduce range(1; 6) as $k (""; . + (Chi2_pdf($x; $k) | r) ))" ),

  "\n    Values for χ2 with 3 degrees of freedom",
  "χ2     cum pdf   p-value",
  ( (1, 2, 4, 8, 16, 32) as $x
   | Chi2_cdf($x; 3) as $cdf
   | "\($x|lpad(2)) \($cdf|r)\(1-$cdf|r)"
  );
  
def airport:  [[77, 23], [88, 12], [79, 21], [81, 19]];
def expected: [81.25, 18.75];

def airport($airport; $expected):
  def dof: ($airport|length - 1) / ($airport[0]|length - 1);

  reduce range(0; $airport|length) as $i (0;
    reduce range(0; $airport[0]|length) as $j (.;
      . + (($airport[$i][$j] - $expected[$j]) | .*.) / $expected[$j] ) )
  | ("\nFor airport data table:",
     "  diff sum : \(.)",
     "  d.o.f.   : \(dof)",
     "  χ2 value : \(Chi2_pdf(.; dof))",
     "  p-value  : \(Chi2_cdf(.; dof)|round(4))" ) ;

tables,
 airport(airport; expected)
Output:
    Values of the χ2 probability distribution function
 x/k    1         2         3         4         5
 0          0         0         0         0         0
 1   0.241971  0.303265  0.241971  0.151633  0.080657
 2   0.103777  0.183940  0.207554  0.183940  0.138369
 3   0.051393  0.111565  0.154180  0.167348  0.154180
 4   0.026995  0.067668  0.107982  0.135335  0.143976
 5   0.014645  0.041042  0.073225  0.102606  0.122042
 6   0.008109  0.024894  0.048652  0.074681  0.097304
 7   0.004553  0.015099  0.031873  0.052845  0.074371
 8   0.002583  0.009158  0.020667  0.036631  0.055112
 9   0.001477  0.005554  0.013296  0.024995  0.039887
10   0.000850  0.003369  0.008500  0.016845  0.028335

    Values for χ2 with 3 degrees of freedom
χ2     cum pdf   p-value
 1   0.198748  0.801252
 2   0.427593  0.572407
 4   0.738536  0.261464
 8   0.953988  0.046012
16   0.998866  0.001134
32   0.999999     1e-06

For airport data table:
  diff sum : 4.512820512820513
  d.o.f.   : 3
  χ2 value : 0.088753925984435
  p-value  : 0.7889

Julia[edit]

""" Rosetta Code task rosettacode.org/wiki/Statistics/Chi-squared_distribution """


""" gamma function to 12 decimal places """
function gamma(x)
    p = [ 0.99999999999980993, 676.5203681218851, -1259.1392167224028,
          771.32342877765313, -176.61502916214059, 12.507343278686905,
          -0.13857109526572012, 9.9843695780195716e-6, 1.5056327351493116e-7 ]
    if x < 0.5
        return π / (sinpi(x) * gamma(1.0 - x))
    else
        x -= 1.0
        t = p[1]
        for i in 1:8
            t += p[i+1] / (x + i)
        end
    end
    w = x + 7.5
    return sqrt(2.0 * π) * w^(x+0.5) * exp(-w) * t
end

""" Chi-squared function, the probability distribution function (pdf) for chi-squared """
function χ2(x, k)
    return x > 0 ? x^(k/2 - 1) * exp(-x/2) / (2^(k/2) * gamma(k / 2)) : 0
end

""" lower incomplete gamma by series formula with gamma """
function gamma_cdf(k, x)
    return x^k * exp(-x) * sum(x^m / gamma(k + m + 1) for m in 0:100)
end

""" Cumulative probability function (cdf) for chi-squared """
function cdf_χ2(x, k)
    return x <= 0 || k <= 0 ? 0.0 : gamma_cdf(k / 2, x / 2)
end

println("x           χ2 k = 1             k = 2             k = 3             k = 4             k = 5")
println("-"^93)
for x in 0:10
      print(lpad(x, 2))
      for k in 1:5
        s = string(χ2(x, k))
        print(lpad(s[1:min(end, 13)], 18), k % 5 == 0 ? "\n" : "")
      end
end

println("\nχ2 x     cdf for χ2   P value (df=3)\n", "-"^36)
for p in [1, 2, 4, 8, 16, 32]
    cdf = round(cdf_χ2(p, 3), digits=10)
    println(lpad(p, 2), "     ", cdf, "   ",  round(1.0 - cdf, digits=10))
end

airportdata = [ 77 23 ;
                88 12;
                79 21;
                81 19 ]

expected_data = [ 81.25 18.75 ;
                  81.25 18.75 ;
                  81.25 18.75 ;
                  81.25 18.75 ; ]

dtotal = sum((airportdata[i] - expected_data[i])^2/ expected_data[i] for i in 1:length(airportdata))

println("\nFor the airport data, diff total is $dtotal, χ2 is ", χ2(dtotal, 3), ", p value ", cdf_χ2(dtotal, 3))

using Plots
x = 0.0:0.01:10
y = [map(p -> χ2(p, k), x) for k in 0:3]

plot(x, y, yaxis=[-0.1, 0.5], labels=[0 1 2 3])
Output:
  • Graph of Chi-Squared for k values 0 through 3
x           χ2 k = 1             k = 2             k = 3             k = 4             k = 5
---------------------------------------------------------------------------------------------
 0                 0                 0                 0                 0                 0
 1     0.24197072451     0.30326532985     0.24197072451     0.15163266492     0.08065690817
 2     0.10377687435     0.18393972058     0.20755374871     0.18393972058     0.13836916580
 3     0.05139344326     0.11156508007     0.15418032980     0.16734762011     0.15418032980
 4     0.02699548325     0.06766764161     0.10798193302     0.13533528323     0.14397591070
 5     0.01464498256     0.04104249931     0.07322491280     0.10260624827     0.12204152134
 6     0.00810869555     0.02489353418     0.04865217332     0.07468060255     0.09730434665
 7     0.00455334292     0.01509869171     0.03187340045     0.05284542098     0.07437126772
 8     0.00258337316     0.00915781944     0.02066698535     0.03663127777     0.05511196094
 9     0.00147728280     0.00555449826     0.01329554523     0.02499524221     0.03988663570
10     0.00085003666     0.00336897349     0.00850036660     0.01684486749     0.02833455534

χ2 x     cdf for χ2   P value (df=3)
------------------------------------
 1     0.1987480431   0.8012519569
 2     0.4275932955   0.5724067045
 4     0.7385358701   0.2614641299
 8     0.9539882943   0.0460117057
16     0.9988660157   0.0011339843
32     0.9999994767   5.233e-7

For the airport data, diff total is 4.512820512820512, χ2 is 0.08875392598443503, p value 0.7888504263193064

Perl[edit]

Translation of: Raku
use v5.36;
use Math::MPFR;
use List::Util 'sum';

sub Gamma ($z) {
    my $g = Math::MPFR->new();
    Math::MPFR::Rmpfr_gamma($g, Math::MPFR->new($z), 0);
    $g;
}

sub chi2($x,$k)      { $x>0 && $k>0 ? ($x**($k/2 - 1) * exp(-$x/2)/(2**($k/2)*Gamma($k / 2))) : 0 }
sub gamma_cdf($k,$x) { $x**$k * exp(-$x) * sum map { $x** $_ / Gamma($k+$_+1) } 0..100 }
sub cdf_chi2($x,$k)  { ($x <= 0 or $k <= 0) ? 0.0 : gamma_cdf($k / 2, $x / 2) }

print 'x           χ² ';
print "k = $_" . ' 'x13 for 1..5;
say "\n" . '-' x (my $width = 93);

for my $x (0..10) {
    printf '%2d', $x;
    printf '  %.' . (int(($width-2)/5)-4) . 'f', chi2($x, $_) for 1..5;
    say '';
}

say "\nχ² x     cdf for χ²   P value (df=3)\n" . '-' x 36;
for my $p (map { 2**$_ } 0..5)  {
    my $cdf = cdf_chi2($p, 3);
    printf "%2d     %-.10f   %-.10f\n", $p, $cdf, 1-$cdf;
}

my @airport  = <77 23 88 12 79 21 81 19>;
my @expected = split ' ', '81.25 18.75 ' x 4;
my $dtotal;
$dtotal += ($airport[$_] - $expected[$_])**2 / $expected[$_] for 0..$#airport;
printf "\nFor the airport data, diff total is %.5f, χ² is %.5f, p value %.5f\n", $dtotal, chi2($dtotal, 3), cdf_chi2($dtotal, 3);
Output:
x           χ² k = 1             k = 2             k = 3             k = 4             k = 5
---------------------------------------------------------------------------------------------
 0  0.00000000000000  0.00000000000000  0.00000000000000  0.00000000000000  0.00000000000000
 1  0.24197072451914  0.30326532985632  0.24197072451914  0.15163266492816  0.08065690817305
 2  0.10377687435515  0.18393972058572  0.20755374871030  0.18393972058572  0.13836916580686
 3  0.05139344326792  0.11156508007421  0.15418032980377  0.16734762011132  0.15418032980377
 4  0.02699548325659  0.06766764161831  0.10798193302638  0.13533528323661  0.14397591070183
 5  0.01464498256193  0.04104249931195  0.07322491280963  0.10260624827987  0.12204152134939
 6  0.00810869555494  0.02489353418393  0.04865217332964  0.07468060255180  0.09730434665928
 7  0.00455334292164  0.01509869171116  0.03187340045148  0.05284542098906  0.07437126772012
 8  0.00258337316926  0.00915781944437  0.02066698535409  0.03663127777747  0.05511196094425
 9  0.00147728280398  0.00555449826912  0.01329554523581  0.02499524221105  0.03988663570744
10  0.00085003666025  0.00336897349954  0.00850036660252  0.01684486749771  0.02833455534173

χ² x     cdf for χ²   P value (df=3)
------------------------------------
 1     0.1987480431   0.8012519569
 2     0.4275932955   0.5724067045
 4     0.7385358701   0.2614641299
 8     0.9539882943   0.0460117057
16     0.9988660157   0.0011339843
32     0.9999994767   0.0000005233

For the airport data, diff total is 4.51282, χ² is 0.08875, p value 0.78885

Phix[edit]

Translation of: Julia
Library: Phix/online

You can run this online here.

--
-- demo\rosetta\Chi-squared_distribution.exw
--
with javascript_semantics
function gamma(atom z)
    constant p = {   0.99999999999980993, 
                   676.5203681218851,   
                 -1259.1392167224028, 
                   771.32342877765313,  
                  -176.61502916214059,  
                    12.507343278686905, 
                    -0.13857109526572012,
                     9.9843695780195716e-6,
                     1.5056327351493116e-7 }
    if z<0.5 then
        return PI / (sin(PI*z)*gamma(1-z))
    end if
    z -= 1;
    atom x := p[1];
    for i=1 to length(p)-1 do x += p[i+1]/(z+i) end for
    atom t = z + length(p) - 1.5;
    return sqrt(2*PI) * power(t,z+0.5) * exp(-t) * x
end function

function chi_squared(atom x, k)
-- Chi-squared function, the probability distribution function (pdf) for chi-squared
    return iff(x > 0 ? power(x,k/2-1) * exp(-x/2) / (power(2,k/2) * gamma(k / 2)) : 0)
end function

function gamma_cdf(atom k, x)
-- lower incomplete gamma by series formula with gamma
    atom tot = 0
    for m=0 to 100 do
        tot += power(x,m) / gamma(k + m + 1)
    end for
    return power(x,k) * exp(-x) * tot
end function

function cdf_chi_squared(atom x, k)
-- Cumulative probability function (cdf) for chi-squared
    return iff(x<=0 or k<=0 ? 0.0 : gamma_cdf(k/2, x/2))
end function

printf(1,"       ------------------------------------ Chi-squared ------------------------------------\n")
printf(1," x             k = 1             k = 2             k = 3             k = 4             k = 5\n")
printf(1,repeat('-',92)&"\n")
for x=0 to 10 do
    printf(1,"%2d",x)
    for k=1 to 5 do
        printf(1,"%18.11f%n",{chi_squared(x, k),k=5})
    end for
end for

printf(1,"\nChi_squared x     P value (df=3)\n------------------------------------\n")
for p in {1, 2, 4, 8, 16, 32} do
    printf(1,"      %2d          %.16g\n",{p, 1-cdf_chi_squared(p, 3)})
end for

constant airportdata = { 77, 23,
                         88, 12,
                         79, 21,
                         81, 19 },

       expected_data = { 81.25, 18.75,
                         81.25, 18.75,
                         81.25, 18.75,
                         81.25, 18.75 },

fmt = "\n"&"""
For the airport data, diff total is %.15f,
              degrees of freedom is %d,
                      ch-squared is %.15f, 
                         p value is %.15f
"""
integer df = length(airportdata)/2-1
atom dtotal = sum(sq_div(sq_power(sq_sub(airportdata,expected_data),2),expected_data))
printf(1,fmt,{dtotal, df, chi_squared(dtotal,df), cdf_chi_squared(dtotal, df)})

include IupGraph.e

function get_data(Ihandle /*graph*/)
    constant colours = {CD_BLUE,CD_ORANGE,CD_GREEN,CD_RED,CD_PURPLE}
    sequence x = sq_div(tagset(999,0),100),
             xy = {{"NAMES",{"0","1","2","3","4"}}}
    for k=0 to 4 do
        xy = append(xy,{x,apply(true,chi_squared,{x,k}),colours[k+1]})
    end for
    return xy
end function

IupOpen()
Ihandle graph = IupGraph(get_data,`RASTERSIZE=340x180,GRID=NO`)
IupSetAttributes(graph,`XMAX=10,XTICK=2,XMARGIN=10,YMAX=0.5,YTICK=0.1`)
IupShow(IupDialog(graph,`TITLE="Chi-squared distribution",MINSIZE=260x200`))
if platform()!=JS then
    IupMainLoop()
    IupClose()
end if
Output:
       ------------------------------------ Chi-squared ------------------------------------
 x             k = 1             k = 2             k = 3             k = 4             k = 5
--------------------------------------------------------------------------------------------
 0     0.00000000000     0.00000000000     0.00000000000     0.00000000000     0.00000000000
 1     0.24197072452     0.30326532986     0.24197072452     0.15163266493     0.08065690817
 2     0.10377687436     0.18393972059     0.20755374871     0.18393972059     0.13836916581
 3     0.05139344327     0.11156508007     0.15418032980     0.16734762011     0.15418032980
 4     0.02699548326     0.06766764162     0.10798193303     0.13533528324     0.14397591070
 5     0.01464498256     0.04104249931     0.07322491281     0.10260624828     0.12204152135
 6     0.00810869555     0.02489353418     0.04865217333     0.07468060255     0.09730434666
 7     0.00455334292     0.01509869171     0.03187340045     0.05284542099     0.07437126772
 8     0.00258337317     0.00915781944     0.02066698535     0.03663127778     0.05511196094
 9     0.00147728280     0.00555449827     0.01329554524     0.02499524221     0.03988663571
10     0.00085003666     0.00336897350     0.00850036660     0.01684486750     0.02833455534

Chi_squared x     P value (df=3)
------------------------------------
       1          0.8012519569012007
       2          0.5724067044708797
       4          0.2614641299491101
       8          0.0460117056892316
      16          0.0011339842897863
      32          5.233466446874501e-7

For the airport data, diff total is 4.512820512820513,
              degrees of freedom is 3,
                      ch-squared is 0.088753925984435,
                         p value is 0.788850426319307
Phixplotchisquared.png

Python[edit]

''' rosettacode.org/wiki/Statistics/Chi-squared_distribution#Python '''


from math import exp, pi, sin, sqrt
from matplotlib.pyplot import plot, legend, ylim


def gamma(x):
    ''' gamma function, accurate to about 12 decimal places '''
    p = [0.99999999999980993, 676.5203681218851, -1259.1392167224028,
         771.32342877765313, -176.61502916214059, 12.507343278686905,
         -0.13857109526572012, 9.9843695780195716e-6, 1.5056327351493116e-7]
    if x < 0.5:
        return pi / (sin(pi * x) * gamma(1.0 - x))
    x -= 1.0
    t = p[0]
    for i in range(1, 9):
        t += p[i] / (x + i)

    w = x + 7.5
    return sqrt(2.0 * pi) * w**(x+0.5) * exp(-w) * t


def χ2(x, k):
    ''' Chi-squared function, the probability distribution function (pdf) for chi-squared '''
    return x**(k/2 - 1) * exp(-x/2) / (2**(k/2) * gamma(k / 2)) if x > 0 and k > 0 else 0.0


def gamma_cdf(k, x):
    ''' lower incomplete gamma by series formula with gamma '''
    return x**k * exp(-x) * sum(x**m / gamma(k + m + 1) for m in range(100))


def cdf_χ2(x, k):
    ''' Cumulative probability function (cdf) for chi-squared '''
    return gamma_cdf(k / 2, x / 2) if x > 0 and k > 0 else 0.0


print('x         χ2 k = 1           k = 2           k = 3           k = 4           k = 5')
print('-' * 93)
for x in range(11):
    print(f'{x:2}', end='')
    for k in range(1, 6):
        print(f'{χ2(x, k):16.8}', end='\n' if k % 5 == 0 else '')


print('\nχ2 x     P value (df=3)\n----------------------')
for p in [1, 2, 4, 8, 16, 32]:
    print(f'{p:2}', '    ', 1.0 - cdf_χ2(p, 3))


AIRPORT_DATA = [[77, 23], [88, 12], [79, 21], [81, 19]]

EXPECTED = [[81.25, 18.75],
            [81.25, 18.75],
            [81.25, 18.75],
            [81.25, 18.75]]

DTOTAL = sum((d[pos] - EXPECTED[i][pos])**2 / EXPECTED[i][pos]
             for i, d in enumerate(AIRPORT_DATA) for pos in [0, 1])

print(
    f'\nFor the airport data, diff total is {DTOTAL}, χ2 is {χ2(DTOTAL, 3)}, p value {cdf_χ2(DTOTAL, 3)}')
X = [x * 0.001 for x in range(10000)]
for k in range(5):
    plot(X, [χ2(p, k) for p in X])
legend([0, 1, 2, 3, 4])
ylim(-0.02, 0.5)
Output:
Pyplotofchisquared.png
x         χ2 k = 1           k = 2           k = 3           k = 4           k = 5
---------------------------------------------------------------------------------------------
 0             0.0             0.0             0.0             0.0             0.0
 1      0.24197072      0.30326533      0.24197072      0.15163266     0.080656908
 2      0.10377687      0.18393972      0.20755375      0.18393972      0.13836917
 3     0.051393443      0.11156508      0.15418033      0.16734762      0.15418033
 4     0.026995483     0.067667642      0.10798193      0.13533528      0.14397591
 5     0.014644983     0.041042499     0.073224913      0.10260625      0.12204152
 6    0.0081086956     0.024893534     0.048652173     0.074680603     0.097304347
 7    0.0045533429     0.015098692       0.0318734     0.052845421     0.074371268
 8    0.0025833732    0.0091578194     0.020666985     0.036631278     0.055111961
 9    0.0014772828    0.0055544983     0.013295545     0.024995242     0.039886636
10   0.00085003666    0.0033689735    0.0085003666     0.016844867     0.028334555

χ2 x     P value (df=3)
----------------------
 1      0.8012519569012009
 2      0.5724067044708798
 4      0.26146412994911117
 8      0.04601170568923141
16      0.0011339842897852837
32      5.233466447984725e-07

For the airport data, diff total is 4.512820512820513, χ2 is 0.088753925984435, p value 0.7888504263193064

Raku[edit]

Translation of: Julia
# 20221101 Raku programming solution

use Graphics::PLplot;

sub Γ(\z) { # https://rosettacode.org/wiki/Gamma_function#Raku
    constant g = 9;
    z < .5 ?? π/ sin(π * z) / Γ(1 - z) 
           !! sqrt(2*π) * (z + g - 1/2)**(z - 1/2) * exp(-(z + g - 1/2)) *
              [+] < 1.000000000000000174663        5716.400188274341379136
                    -14815.30426768413909044       14291.49277657478554025
                    -6348.160217641458813289       1301.608286058321874105
                    -108.1767053514369634679       2.605696505611755827729
                    -0.7423452510201416151527e-2   0.5384136432509564062961e-7
                    -0.4023533141268236372067e-8 > Z* 1, |map 1/(z + *), 0..*
}

sub χ2(\x,\k) {x>0 && k>0 ?? (x**(k/2 - 1)*exp(-x/2)/(2**(k/2)*Γ(k / 2))) !! 0}
 
sub Γ_cdf(\k,\x) { x**k * exp(-x) * sum( ^101 .map: { x** $_ / Γ(k+$_+1) } ) }

sub cdf_χ2(\x,\k) { (x <= 0 or k <= 0) ?? 0.0 !!  Γ_cdf(k / 2, x / 2) }

say ' 𝒙          χ² ', [~] (1..5)».&{ "𝒌 = $_" ~ ' ' x 13 };
say '-' x my \width = 93;
for 0..10 -> \x {
   say x.fmt('%2d'), [~] (15)».&{χ2(x, $_).fmt: "  %-.{((width-2) div 5)-4}f"}
}

say "\nχ² 𝒙     cdf for χ²   P value (df=3)\n", '-' x 36;
for 2 «**« ^6 -> \p {
   my $cdf = cdf_χ2(p, 3).fmt: '%-.10f';
   say p.fmt('%2d'), "     $cdf   ", (1-$cdf).fmt: '%-.10f'
}

my \airport   = [ <77 23>, <88 12>, <79 21>, <81 19> ];
my \expected  = [ <81.25 18.75> xx 4 ];
my \dtotal    = ( (airport »-« expected² »/» expected )».List.flat.sum;

say "\nFor the airport data, diff total is ",dtotal,", χ² is ", χ2(dtotal, 3), ", p value ", cdf_χ2(dtotal, 3);

given Graphics::PLplot.new( device => 'png', file-name => 'output.png' ) {
   .begin;
   .pen-width: 3 ;
   .environment: x-range => [-1.0, 10.0], y-range => [-0.1, 0.5], just => 0 ;
   .label: x-axis => '', y-axis => '', title => 'Chi-squared distribution' ;
   for 0..3 -> \𝒌 {  
      .color-index0: 1+2*𝒌;
      .line: (0, .110).map: -> \𝒙 { ( 𝒙, χ2( 𝒙, 𝒌 ) )».Num };
      .text-viewport: side=>'t', disp=>-𝒌-2, pos=>.5, just=>.5, text=>'k = '~𝒌 
   } # plplot.sourceforge.net/docbook-manual/plplot-html-5.15.0/plmtex.html
   .end 
}
Output:
 𝒙          χ² 𝒌 = 1             𝒌 = 2             𝒌 = 3             𝒌 = 4             𝒌 = 5             
---------------------------------------------------------------------------------------------
 0  0.00000000000000  0.00000000000000  0.00000000000000  0.00000000000000  0.00000000000000
 1  0.24197072451914  0.30326532985632  0.24197072451914  0.15163266492816  0.08065690817305
 2  0.10377687435515  0.18393972058572  0.20755374871030  0.18393972058572  0.13836916580686
 3  0.05139344326792  0.11156508007421  0.15418032980377  0.16734762011132  0.15418032980377
 4  0.02699548325659  0.06766764161831  0.10798193302638  0.13533528323661  0.14397591070183
 5  0.01464498256193  0.04104249931195  0.07322491280963  0.10260624827987  0.12204152134939
 6  0.00810869555494  0.02489353418393  0.04865217332964  0.07468060255180  0.09730434665928
 7  0.00455334292164  0.01509869171116  0.03187340045148  0.05284542098906  0.07437126772012
 8  0.00258337316926  0.00915781944437  0.02066698535409  0.03663127777747  0.05511196094425
 9  0.00147728280398  0.00555449826912  0.01329554523581  0.02499524221105  0.03988663570744
10  0.00085003666025  0.00336897349954  0.00850036660252  0.01684486749771  0.02833455534173

χ² 𝒙     cdf for χ²   P value (df=3)
------------------------------------
 1     0.1987480431   0.8012519569
 2     0.4275932955   0.5724067045
 4     0.7385358701   0.2614641299
 8     0.9539882943   0.0460117057
16     0.9988660157   0.0011339843
32     0.9999994767   0.0000005233

For the airport data, diff total is 4.512821, χ² is 0.08875392598443506, p value 0.7888504263193072
RakuPlotChiSquared.png

Wren[edit]

Library: DOME
Library: Wren-math
Library: Wren-trait
Library: Wren-fmt
Library: Wren-plot
import "dome" for Window
import "graphics" for Canvas, Color
import "./math2" for Math
import "./trait" for Stepped
import "./fmt" for Fmt
import "./plot" for Axes

class Chi2 {
    static pdf(x, k) {
        if (x <= 0) return 0
        return (-x/2).exp * x.pow(k/2-1) / (2.pow(k/2) * Math.gamma(k/2))
    }

    static cpdf(x, k) {
        var t = (-x/2).exp * (x/2).pow(k/2)
        var s = 0
        var m = 0
        var tol = 1e-15 // say
        while (true) {
            var term = (x/2).pow(m) / Math.gamma(k/2 + m + 1)
            s = s + term
            if (term.abs < tol) break
            m = m + 1
        }
        return t * s
    }
}

System.print("    Values of the χ2 probability distribution function")
System.print(" x/k    1         2         3         4         5")
for (x in 0..10) {
    Fmt.write("$2d  ", x)
    for (k in 1..5) {
        Fmt.write("$f  ", Chi2.pdf(x, k))
    }
    System.print()
}

System.print("\n    Values for χ2 with 3 degrees of freedom")
System.print("χ2  cum pdf   p-value")
for (x in [1, 2, 4, 8, 16, 32]) {
    var cpdf = Chi2.cpdf(x, 3)
    Fmt.print("$2d  $f  $f", x,  cpdf, 1-cpdf)
}

var airport = [[77, 23], [88, 12], [79, 21], [81, 19]]
var expected = [81.25, 18.75]
var dsum = 0
for (i in 0...airport.count) {
    for (j in 0...airport[0].count) {
        dsum = dsum + (airport[i][j] - expected[j]).pow(2) / expected[j]
    }
}
var dof = (airport.count - 1) / (airport[0].count - 1)
System.print("\nFor airport data table: ")
Fmt.print("  diff sum : $f", dsum)
Fmt.print("  d.o.f.   : $d", dof)
Fmt.print("  χ2 value : $f", Chi2.pdf(dsum, dof))
Fmt.print("  p-value  : $f", Chi2.cpdf(dsum, dof))

// generate points for plot
var Pts = List.filled(5, null)
for (k in 0..4) {
    Pts[k] = []
    var x = 0
    while (x < 10) {
        Pts[k].add([x, 10 * Chi2.pdf(x, k)])
        x = x + 0.01
    }
}

class Main {
    construct new() {
        Window.title = "Chi-squared distribution for k in [0, 4]"
        Canvas.resize(1000, 600)
        Window.resize(1000, 600)
        Canvas.cls(Color.white)
        var axes = Axes.new(100, 500, 800, 400, -0.5..10, -0.5..5)
        axes.draw(Color.black, 2)
        var xMarks = 0..10
        var yMarks = 0..5
        axes.mark(xMarks, yMarks, Color.black, 2)
        var xMarks2 = Stepped.new(0..10, 2)
        var yMarks2 = 0..5
        axes.label(xMarks2, yMarks2, Color.black, 2, Color.black, 1, 10)
        var colors = [Color.blue, Color.yellow, Color.green, Color.red, Color.purple]
        for (k in 0..4) {
            axes.lineGraph(Pts[k], colors[k], 2)
        }
        axes.rect(8, 5, 120, 110, Color.black)
        for (k in 0..4) {
            var y = 4.75 - k * 0.25
            axes.line(8.2, y, 9, y, colors[k], 2)
            y = 385 - k * 18
            axes.print(750, y, k.toString, Color.black)
        }
    }

    init() {}

    update() {}

    draw(alpha) {}
}

var Game = Main.new()
Output:

Terminal output:

    Values of the χ2 probability distribution function
 x/k    1         2         3         4         5
 0  0.000000  0.000000  0.000000  0.000000  0.000000  
 1  0.241971  0.303265  0.241971  0.151633  0.080657  
 2  0.103777  0.183940  0.207554  0.183940  0.138369  
 3  0.051393  0.111565  0.154180  0.167348  0.154180  
 4  0.026995  0.067668  0.107982  0.135335  0.143976  
 5  0.014645  0.041042  0.073225  0.102606  0.122042  
 6  0.008109  0.024894  0.048652  0.074681  0.097304  
 7  0.004553  0.015099  0.031873  0.052845  0.074371  
 8  0.002583  0.009158  0.020667  0.036631  0.055112  
 9  0.001477  0.005554  0.013296  0.024995  0.039887  
10  0.000850  0.003369  0.008500  0.016845  0.028335  

    Values for χ2 with 3 degrees of freedom
χ2  cum pdf   p-value
 1  0.198748  0.801252
 2  0.427593  0.572407
 4  0.738536  0.261464
 8  0.953988  0.046012
16  0.998866  0.001134
32  0.999999  0.000001

For airport data table: 
  diff sum : 4.512821
  d.o.f.   : 3
  χ2 value : 0.088754
  p-value  : 0.788850
Wren chisquared.png