Welch's t-test: Difference between revisions

Added FreeBASIC
m (→‎Using GSL: Rm blank lines)
(Added FreeBASIC)
 
(4 intermediate revisions by 4 users not shown)
Line 69:
{{trans|Python}}
 
<langsyntaxhighlight lang="11l">F betain(x, p, q)
I p <= 0 | q <= 0 | x < 0 | x > 1
X.throw ValueError(0)
 
I x == 0 | x == 1
Line 123:
V n2 = a2.len
I n1 <= 1 | n2 <= 1
X.throw ValueError(0)
 
V mean1 = sum(a1) / n1
Line 139:
V a1 = [Float(3), 4, 1, 2.1]
V a2 = [Float(490.2), 340, 433.9]
print(welch_ttest(a1, a2))</langsyntaxhighlight>
 
{{out}}
Line 161:
This shows how pvalue can be calculated from any two arrays, using Welch's 2-sided t-test, which doesn't assume equal variance.
This is the equivalent of R's<code>t.test(vector1,vector2, alternative="two.sided", var.equal=FALSE)</code> and as such, it is compared against R's pvalues with the same vectors/arrays to show that the differences are very small (here 10^-14).
<langsyntaxhighlight Clang="c">#include <stdio.h>
#include <math.h>
#include <stdlib.h>
Line 428:
return 0;
}
</syntaxhighlight>
</lang>
 
{{out}}
Line 443:
'''If''' your computer does not have <code>lgammal</code>, add the following function before <code>main</code> and replace <code>lgammal</code> with <code>lngammal</code> in the <code>calculate_Pvalue</code> function:
 
<langsyntaxhighlight Clang="c">#include <stdio.h>
#include <math.h>
 
Line 463:
}
 
</syntaxhighlight>
</lang>
 
=={{header|Fortran}}==
Line 470:
Alternatively, the program shows the p-value computed using the IMSL '''BETAI''' function.
 
<langsyntaxhighlight lang="fortran">subroutine welch_ttest(n1, x1, n2, x2, t, df, p)
use tdf_int
implicit none
Line 495:
print *, t, df, p
print *, betai(df / (t**2 + df), 0.5d0 * df, 0.5d0)
end program</langsyntaxhighlight>
 
'''Output'''
Line 505:
With Absoft Pro Fortran, compile with <code>af90 -m64 pvalue.f90 %SLATEC_LINK%</code>.
 
<langsyntaxhighlight lang="fortran">subroutine welch_ttest(n1, x1, n2, x2, t, df, p)
implicit none
integer :: n1, n2
Line 530:
call welch_ttest(4, x, 3, y, t, df, p)
print *, t, df, p
end program</langsyntaxhighlight>
 
'''Output'''
Line 542:
 
Instead of implementing the t-distribution by ourselves, we bind to GNU Scientific Library:
<langsyntaxhighlight lang="fortran">module t_test_m
 
use, intrinsic :: iso_c_binding
Line 596:
print *, t, df, p
 
end program main</langsyntaxhighlight>
 
'''Output'''
 
<pre> -9.5594977219326580 2.0008523488562844 1.0751561149784494E-002</pre>
 
=={{header|FreeBASIC}}==
===Using Betain===
{{trans|11l}}
<syntaxhighlight lang="vbnet">#include "crt\math.bi"
 
Function betain(x As Double, p As Double, q As Double) As Double
If p <= 0 Or q <= 0 Or x < 0 Or x > 1 Then
Print "ValueError"
End
End If
If x = 0 Or x = 1 Then Return x
Dim As Double acu = 1e-15
'Dim As Double lnbeta = LogGamma(p) + LogGamma(q) - LogGamma(p + q)
Dim As Double lnbeta = lGamma(p) + lGamma(q) - lGamma(p + q)
Dim As Double xx = x
Dim As Double cx = 1 - x
Dim As Double pp = p
Dim As Double qq = q
Dim As Integer indx = 0
Dim As Double psq = p + q
If p < psq * x Then
xx = 1 - x
cx = x
pp = q
qq = p
indx = 1
End If
Dim As Double term = 1.0
Dim As Double ai = 1.0
Dim As Double value = 1.0
Dim As Integer ns = Int(qq + cx * psq)
Dim As Double rx = xx / cx
Dim As Double temp = qq - ai
If ns = 0 Then rx = xx
Do
term *= temp * rx / (pp + ai)
value += term
temp = Abs(term)
If temp <= acu And temp <= acu * value Then
value *= Exp(pp * Log(xx) + (qq - 1) * Log(cx) - lnbeta) / pp
Return Iif(indx, 1 - value, value)
End If
ai += 1
If ns > 0 Then
ns -= 1
temp = qq - ai
If ns = 0 Then
rx = xx
Else
temp = psq
psq += 1
End If
End If
Loop
End Function
 
Sub welch_ttest(a1() As Double, a2() As Double, Byref t As Double, Byref df As Double, Byref p As Double)
Dim As Integer n1 = Ubound(a1) + 1
Dim As Integer n2 = Ubound(a2) + 1
If n1 <= 1 Or n2 <= 1 Then
Print "ValueError"
End
End If
 
Dim As Double mean1 = 0
For i As Integer = 0 To n1 - 1
mean1 += a1(i)
Next i
mean1 /= n1
 
Dim As Double mean2 = 0
For i As Integer = 0 To n2 - 1
mean2 += a2(i)
Next i
mean2 /= n2
 
Dim As Double var1 = 0
For i As Integer = 0 To n1 - 1
var1 += (a1(i) - mean1) ^ 2
Next i
var1 /= (n1 - 1)
 
Dim As Double var2 = 0
For i As Integer = 0 To n2 - 1
var2 += (a2(i) - mean2) ^ 2
Next i
var2 /= (n2 - 1)
 
t = (mean1 - mean2) / Sqr(var1 / n1 + var2 / n2)
df = (var1 / n1 + var2 / n2) ^ 2 / (var1 ^ 2 / (n1 ^ 2 * (n1 - 1)) + var2 ^ 2 / (n2 ^ 2 * (n2 - 1)))
p = betain(df / (t ^ 2 + df), df / 2, 1 / 2)
End Sub
 
Dim As Double a1(3) = {3, 4, 1, 2.1}
Dim As Double a2(2) = {490.2, 340, 433.9}
Dim As Double t, df, p
welch_ttest(a1(), a2(), t, df, p)
Print " t: "; t
Print "df: "; df
Print " p: "; p
 
Sleep</syntaxhighlight>
{{out}}
<pre> t: -9.559497721932658
df: 2.000852348856284
p: 0.01075155600241868</pre>
 
=={{header|Go}}==
<langsyntaxhighlight lang="go">package main
 
import (
Line 694 ⟶ 808:
func(r float64) float64 { return math.Pow(r, ν/2-1) / math.Sqrt(1-r) }) /
math.Exp(g1+g2-g3)
}</langsyntaxhighlight>
{{out}}
<pre>
Line 708 ⟶ 822:
Implementation:
 
<langsyntaxhighlight Jlang="j">integrate=: adverb define
'a b steps'=. 3{.y,128
size=. (b - a)%steps
Line 731 ⟶ 845:
hi=. v%(t^2)+v
(F f. simpson integrate lo,hi) % 0.5 B v%2
)</langsyntaxhighlight>
 
<code>integrate</code> and <code>simpson</code> are from the [[Numerical_integration#J|Numerical integration]] task.
Line 748 ⟶ 862:
 
Data for task examples:
<langsyntaxhighlight Jlang="j">d1=: 27.5 21 19 23.6 17 17.9 16.9 20.1 21.9 22.6 23.1 19.6 19 21.7 21.4
d2=: 27.1 22 20.8 23.4 23.4 23.5 25.8 22 24.8 20.2 21.9 22.1 22.9 20.5 24.4
d3=: 17.2 20.9 22.6 18.1 21.7 21.4 23.5 24.2 14.7 21.8
Line 757 ⟶ 871:
d8=: 29.89 29.93 29.72 29.98 30.02 29.98
d9=: 3 4 1 2.1
da=: 490.2 340 433.9</langsyntaxhighlight>
 
Task examples:
<langsyntaxhighlight Jlang="j"> d1 p2_tail d2
0.021378
d3 p2_tail d4
Line 769 ⟶ 883:
0.0907733
d9 p2_tail da
0.0107377</langsyntaxhighlight>
 
=={{header|Java}}==
Using the '''[http://commons.apache.org/proper/commons-math/ Apache Commons Mathematics Library]'''.
<langsyntaxhighlight lang="java">import org.apache.commons.math3.distribution.TDistribution;
 
public class WelchTTest {
Line 822 ⟶ 936:
System.out.println("p = " + res[2]);
}
}</langsyntaxhighlight>
 
'''Result'''
Line 840 ⟶ 954:
 
jq's `lgamma` returns the natural logarithm of the absolute value of the gamma function of x.
<langsyntaxhighlight lang="jq">def mean: add / length;
 
# Sample variance using division by (length-1)
Line 900 ⟶ 1,014:
pValue(d5; d6),
pValue(d7; d8),
pValue(x; y)</langsyntaxhighlight>
{{out}}
<pre>
Line 914 ⟶ 1,028:
{{works with|Julia|0.6}}
 
<langsyntaxhighlight lang="julia">using HypothesisTests
 
d1 = [27.5, 21.0, 19.0, 23.6, 17.0, 17.9, 16.9, 20.1, 21.9, 22.6, 23.1, 19.6, 19.0, 21.7, 21.4]
Line 934 ⟶ 1,048:
ttest = UnequalVarianceTTest(y1, y2)
println("\nData:\n y1 = $y1\n y2 = $y2\nP-value for unequal variance TTest: ", round(pvalue(ttest), 4))
end</langsyntaxhighlight>
 
{{out}}
Line 966 ⟶ 1,080:
=={{header|Kotlin}}==
This program brings in code from other tasks for gamma functions and integration by Simpson's rule as Kotlin doesn't have these built-in:
<langsyntaxhighlight lang="scala">// version 1.1.4-3
 
typealias Func = (Double) -> Double
Line 1,071 ⟶ 1,185:
println(f.format(p2Tail(da7, da8)))
println(f.format(p2Tail(x, y)))
}</langsyntaxhighlight>
 
{{out}}
Line 1,084 ⟶ 1,198:
=={{header|Nim}}==
{{trans|Kotlin}}
<langsyntaxhighlight Nimlang="nim">import math, stats, strutils, sugar
 
func sqr(f: float): float = f * f
Line 1,141 ⟶ 1,255:
echo p2Tail(Da5, Da6).formatFloat(ffDecimal, 6)
echo p2Tail(Da7, Da8).formatFloat(ffDecimal, 6)
echo p2Tail(X, Y).formatFloat(ffDecimal, 6)</langsyntaxhighlight>
 
{{out}}
Line 1,152 ⟶ 1,266:
=={{header|Maple}}==
 
<langsyntaxhighlight lang="maple">WelschTTest:=proc(x::list(numeric),y::list(numeric))
uses Statistics;
local n1:=nops(x),n2:=nops(y),
Line 1,167 ⟶ 1,281:
y:=[490.2,340,433.9]:
WelschTTest(x,y);
# -9.55949772193266, 2.00085234885628, 0.0107515611497845</langsyntaxhighlight>
 
=={{header|Octave}}==
{{trans|Stata}}
<langsyntaxhighlight lang="octave">x = [3.0,4.0,1.0,2.1];
y = [490.2,340.0,433.9];
n1 = length(x);
Line 1,184 ⟶ 1,298:
ans =
 
-9.559498 2.000852 0.010752</langsyntaxhighlight>
 
=={{header|PARI/GP}}==
<langsyntaxhighlight lang="parigp">B2(x,y)=exp(lngamma(x)+lngamma(y)-lngamma(x+y))
B3(x,a,b)=a--;b--;intnum(r=0,x,r^a*(1-r)^b)
Welch2(u,v)=my(m1=vecsum(u)/#u, m2=vecsum(v)/#v, v1=var(u,m1), v2=var(v,m2), s=v1/#u+v2/#v, t=(m1-m2)/sqrt(s), nu=s^2/(v1^2/#u^2/(#u-1)+v2^2/#v^2/(#v-1))); B3(nu/(t^2+nu),nu/2,1/2)/B2(nu/2,1/2);
Welch2([3,4,1,2.1], [490.2,340,433.9])</langsyntaxhighlight>
{{out}}
<pre>%1 = 0.010751561149784496723954539777213062928</pre>
Line 1,198 ⟶ 1,312:
Uses Math::AnyNum for gamma and pi. It is possible to use some other modules (e.g. Math::Cephes) if Math::AnyNum has problematic dependencies.
{{trans|Sidef}}
<langsyntaxhighlight lang="perl">use utf8;
use List::Util qw(sum);
use Math::AnyNum qw(gamma pi);
 
sub p_value :prototype($$) {
my ($A, $B) = @_;
 
Line 1,258 ⟶ 1,372:
my ($left, $right) = splice(@tests, 0, 2);
print p_value($left, $right), "\n";
}</langsyntaxhighlight>
{{out}}
<pre>
Line 1,271 ⟶ 1,385:
We use a slightly more accurate lgamma than the C code. Note that Perl can be compiled with different underlying floating point representations -- double, long double, or quad double.
{{trans|C}}
<langsyntaxhighlight lang="perl">use strict;
use warnings;
use List::Util 'sum';
Line 1,398 ⟶ 1,512:
printf("p-value = %.14g\n",$pvalue);
}
printf("cumulative error is %g\n", $error);</langsyntaxhighlight>
{{out}}
<pre>p-value = 0.021378001462867
Line 1,413 ⟶ 1,527:
{{trans|Go}}
{{trans|Kotlin}}
<!--<langsyntaxhighlight Phixlang="phix">(phixonline)-->
<span style="color: #008080;">with</span> <span style="color: #008080;">javascript_semantics</span>
<span style="color: #008080;">function</span> <span style="color: #000000;">mean</span><span style="color: #0000FF;">(</span><span style="color: #004080;">sequence</span> <span style="color: #000000;">a</span><span style="color: #0000FF;">)</span>
Line 1,521 ⟶ 1,635:
<span style="color: #0000FF;">?</span><span style="color: #000000;">pValue</span><span style="color: #0000FF;">(</span><span style="color: #000000;">tests</span><span style="color: #0000FF;">[</span><span style="color: #000000;">i</span><span style="color: #0000FF;">])</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">for</span>
<!--</langsyntaxhighlight>-->
{{out}}
<pre>
Line 1,533 ⟶ 1,647:
The above was a bit off on the fifth test, so I also tried this.<br>
using gamma() from [[Gamma_function#Phix]] (the one from above is probably also fine, but I didn't test that)
<!--<langsyntaxhighlight Phixlang="phix">(phixonline)-->
<span style="color: #008080;">with</span> <span style="color: #008080;">javascript_semantics</span>
<span style="color: #000080;font-style:italic;">--&lt;copy of gamma from Gamma_function#Phix&gt;</span>
Line 1,650 ⟶ 1,764:
<span style="color: #008080;">end</span> <span style="color: #008080;">for</span>
<span style="color: #0000FF;">?{</span><span style="color: #008000;">"cumulative error"</span><span style="color: #0000FF;">,</span><span style="color: #000000;">cerr</span><span style="color: #0000FF;">}</span>
<!--</langsyntaxhighlight>-->
{{out}}
<pre>
Line 1,668 ⟶ 1,782:
 
=== Using NumPy & SciPy ===
<langsyntaxhighlight lang="python">import numpy as np
import scipy as sp
import scipy.stats
Line 1,685 ⟶ 1,799:
 
welch_ttest(np.array([3.0, 4.0, 1.0, 2.1]), np.array([490.2, 340.0, 433.9]))
(-9.559497721932658, 2.0008523488562844, 0.01075156114978449)</langsyntaxhighlight>
=== Using betain from AS 63 ===
First, the implementation of betain (translated from the Stata program in the discussion page). The original Fortran code is under copyrighted by the Royal Statistical Society. The C translation is under GPL, written by John Burkardt. The exact statement of the RSS license is unclear.
 
<langsyntaxhighlight lang="python">import math
 
def betain(x, p, q):
Line 1,739 ⟶ 1,853:
else:
temp = psq
psq += 1</langsyntaxhighlight>
 
The Python code is then straightforward:
 
<langsyntaxhighlight lang="python">import math
 
def welch_ttest(a1, a2):
Line 1,761 ⟶ 1,875:
p = betain(df / (t**2 + df), df / 2, 1 / 2)
return t, df, p</langsyntaxhighlight>
 
'''Example'''
 
<langsyntaxhighlight lang="python">a1 = [3, 4, 1, 2.1]
a2 = [490.2, 340, 433.9]
print(welch_ttest(a1, a2))</langsyntaxhighlight>
 
'''Output'''
Line 1,773 ⟶ 1,887:
 
=={{header|R}}==
<langsyntaxhighlight Rlang="r">#!/usr/bin/R
 
printf <- function(...) cat(sprintf(...))
Line 1,810 ⟶ 1,924:
results <- t.test(z1,z2, alternative="two.sided", var.equal=FALSE)
printf("%.15g\n", results$p.value);
</syntaxhighlight>
</lang>
 
The output here is used to compare against C's output above.
Line 1,827 ⟶ 1,941:
{{trans|C}}
 
<langsyntaxhighlight lang="racket">#lang racket
(require math/statistics math/special-functions)
 
Line 1,876 ⟶ 1,990:
(p-value (list 3.0 4.0 1.0 2.1)
(list 490.2 340.0 433.9))))</langsyntaxhighlight>
 
{{out}}
Line 1,890 ⟶ 2,004:
Perhaps "inspired by C example" may be more accurate. Gamma subroutine from [[Gamma_function#Raku|Gamma function task]].
 
<syntaxhighlight lang="raku" perl6line>sub Γ(\z) {
constant g = 9;
z < .5 ?? π / sin(π × z) / Γ(1 - z) !!
Line 1,955 ⟶ 2,069:
[<3.0 4.0 1.0 2.1>],
[<490.2 340.0 433.9>]
) -> @left, @right { say p-value @left, @right }</langsyntaxhighlight>
{{out}}
<pre>0.0213780014628669
Line 1,971 ⟶ 2,085:
This uses the Soper reduction formula to evaluate the integral, which converges much more quickly than Simpson's formula.
 
<syntaxhighlight lang="raku" perl6line>sub lgamma ( Num(Real) \n --> Num ){
use NativeCall;
sub lgamma (num64 --> num64) is native {}
Line 2,075 ⟶ 2,189:
$error += abs($p-value - shift @answers);
}
printf("cumulative error is %g\n", $error);</langsyntaxhighlight>
{{out}}
<pre>p-value = 0.021378001462867
Line 2,089 ⟶ 2,203:
=={{header|Ruby}}==
{{trans|Perl}}
<langsyntaxhighlight Rubylang="ruby">def calculate_p_value(array1, array2)
return 1.0 if array1.size <= 1
return 1.0 if array2.size <= 1
Line 2,214 ⟶ 2,328:
 
printf("the cumulative error is %g\n", error)
</syntaxhighlight>
</lang>
{{out}}
<pre>
Line 2,230 ⟶ 2,344:
=={{header|SAS}}==
{{trans|Stata}}
<syntaxhighlight lang="text">data tbl;
input value group @@;
cards;
Line 2,240 ⟶ 2,354:
class group;
var value;
run;</langsyntaxhighlight>
 
'''Output'''
Line 2,401 ⟶ 2,515:
Implementation in IML:
 
<langsyntaxhighlight lang="sas">proc iml;
use tbl;
read all var {value} into x where(group=1);
Line 2,414 ⟶ 2,528:
p = 2*probt(-abs(t), df);
print t df p;
quit;</langsyntaxhighlight>
 
'''Output'''
Line 2,421 ⟶ 2,535:
 
=={{header|Scala}}==
<langsyntaxhighlight Scalalang="scala">import org.apache.commons.math3.distribution.TDistribution
 
object WelchTTest extends App {
Line 2,454 ⟶ 2,568:
println(s"\nSuccessfully completed without errors. [total ${scala.compat.Platform.currentTime - executionStart} ms]")
 
}</langsyntaxhighlight>
 
=={{header|Scilab}}==
Line 2,461 ⟶ 2,575:
Scilab will print a warning because the number of degrees of freedom is not an integer. However, the underlying implementation makes use of the [http://www.netlib.org/random/ dcdflib] Fortran library, which happily accepts a noninteger df.
 
<syntaxhighlight lang="text">x = [3.0,4.0,1.0,2.1];
y = [490.2,340.0,433.9];
n1 = length(x);
Line 2,470 ⟶ 2,584:
df = (v1/n1+v2/n2)^2/(v1^2/(n1^2*(n1-1))+v2^2/(n2^2*(n2-1)));
[p, q] = cdft("PQ", -abs(t), df);
[t df 2*p]</langsyntaxhighlight>
 
'''Output'''
Line 2,480 ⟶ 2,594:
=={{header|Sidef}}==
{{trans|Raku}}
<langsyntaxhighlight lang="ruby">func p_value (A, B) {
[A.len, B.len].all { _ > 1 } || return 1
 
Line 2,532 ⟶ 2,646:
tests.each_slice(2, {|left, right|
say p_value(left, right)
})</langsyntaxhighlight>
{{out}}
<pre>
Line 2,547 ⟶ 2,661:
Notice that here we use the option '''unequal''' of the '''ttest''' command, and not '''welch''', so that Stata uses the Welch-Satterthwaite approximation.
 
<langsyntaxhighlight lang="stata">mat a=(3,4,1,2.1,490.2,340,433.9\1,1,1,1,2,2,2)'
clear
svmat double a
Line 2,574 ⟶ 2,688:
 
di r(p)
.01075156</langsyntaxhighlight>
 
The computation can easily be implemented in Mata. Here is how to compute the t statistic (t), the approximate degrees of freedom (df) and the p-value (p).
 
<langsyntaxhighlight lang="stata">st_view(a=., ., .)
x = select(a[., 1], a[., 2] :== 1)
y = select(a[., 1], a[., 2] :== 2)
Line 2,592 ⟶ 2,706:
+----------------------------------------------+
1 | -9.559497722 2.000852349 .0107515611 |
+----------------------------------------------+</langsyntaxhighlight>
 
=={{header|Tcl}}==
Line 2,602 ⟶ 2,716:
This is not particularly idiomatic Tcl, but perhaps illustrates some of the language's relationship with the Lisp family.
 
<langsyntaxhighlight Tcllang="tcl">#!/usr/bin/tclsh
 
package require math::statistics
Line 2,677 ⟶ 2,791:
puts [pValue $left $right]
}
</syntaxhighlight>
</lang>
 
{{out}}
Line 2,691 ⟶ 2,805:
{{libheader|Wren-math}}
{{libheader|Wren-fmt}}
<langsyntaxhighlight ecmascriptlang="wren">import "./math" for Math, Nums
import "./fmt" for Fmt
 
var welch = Fn.new { |a, b|
Line 2,729 ⟶ 2,843:
var g3 = Math.gamma(nu/2 + 0.5).log
var f = Fn.new { |r| r.pow(nu/2-1) / (1 - r).sqrt }
return simpson0.call(2000, nu/(t*t + nu), f) / Math.exp(g1 + g2 - g3).exp
}
 
Line 2,748 ⟶ 2,862:
Fmt.print("$0.6f", pValue.call(d5, d6))
Fmt.print("$0.6f", pValue.call(d7, d8))
Fmt.print("$0.6f", pValue.call(x, y))</langsyntaxhighlight>
 
{{out}}
Line 2,761 ⟶ 2,875:
=={{header|zkl}}==
{{trans|C}}
<langsyntaxhighlight lang="zkl">fcn calculate_Pvalue(array1,array2){
if (array1.len()<=1 or array2.len()<=1) return(1.0);
 
Line 2,811 ⟶ 2,925:
foreach x in (cof){ ser+=(x/(y+=1)); }
return((2.5066282746310005 * ser / x).log() - tmp);
}</langsyntaxhighlight>
<langsyntaxhighlight lang="zkl">testSets:=T(
T(T(27.5,21.0,19.0,23.6,17.0,17.9,16.9,20.1,21.9,22.6,23.1,19.6,19.0,21.7,21.4),
T(27.1,22.0,20.8,23.4,23.4,23.5,25.8,22.0,24.8,20.2,21.9,22.1,22.9,20.5,24.4)),
Line 2,824 ⟶ 2,938:
 
foreach x,y in (testSets)
{ println("Test set 1 p-value = %f".fmt(calculate_Pvalue(x,y))); }</langsyntaxhighlight>
{{out}}
<pre>
2,122

edits