Welch's t-test: Difference between revisions

Corrected error in equations, added to task description and corrected typo
No edit summary
Line 10:
 
 
Your task is to discern whether or not the difference in means between the two sets is statistically significant and worth further investigation. P-values are significance tests to gauge the probability that the difference in means between two data sets is significant, or due to chance. A threshold level, alpha, is usually chosen, 0.01 or 0.05, where p-values below alpha are worth further investigation and p-values above alpha are considered not significant. The p-value is not considered a final test of significance, [http://www.nature.com/news/scientific-method-statistical-errors-1.14700 only whether the given variable should be given further consideration].
 
This uses [[wp:Welch's_t_test|Welch's t-test]], which does not assume that the two sets have equal variances. Welch's t-test statistic can be computed thus:
Line 39:
The 2-tail p-value, <math>p_{2-tail}</math>, can be computed from the [[wp:Student's_t-distribution#Cumulative_distribution_function|cumulative distribution function]] of Student's t-distribution:
 
<math> p_{2-tail} = I_{\frac{\nu}{t^2+\nu}}\left(\frac{\nu}{2}, \frac{1}{2}\right) </math>
 
where <math>I</math> is the regularized incomplete beta function. This is the same as:
 
<math> p_{2-tail} = \frac{B(\frac{\nu}{t^2+\nu};\frac{\nu}{2}, \frac{1}{2})}{B(\frac{\nu}{2}, \frac{1}{2})} </math>
 
Keeping in mind that
Line 55:
\!</math>
 
<math> p_{2-tail} </math> can be calculated in terms of [[wp:Gamma_function|gamma functions]] and integrals more simply:
 
<math> p_{2-tail} = \frac{\int_0^\frac{\nu}{t^2+\nu} r^{\frac{\nu}{2}-1}\,(1-r)^{-0.5}\,\mathrm{d}r}{\exp\left(\ln(\Gamma(\frac{\nu}{2})) + \ln(\Gamma(0.5)) - \ln(\Gamma(\frac{\nu}{2}+0.5))\right)} </math>
 
which simplifies to
 
<math> p_{2-tail} = \frac{\int_0^\frac{\nu}{t^2+\nu} \frac{r^{\frac{\nu}{2}-1}}{\sqrt{1-r}}\,\mathrm{d}r}{ \exp\left(\ln(\Gamma(\frac{\nu}{2})) + \ln(\Gamma(0.5)) - \ln(\Gamma(\frac{\nu}{2}+0.5))\right) }</math>
 
The definite integral can be approximated with [[wp:Simpson's_rule|Simpson's Rule]] but [http://rosettacode.org/wiki/Numerical_integration other methods] are also acceptable.
 
The <math>\ln(\Gamma(x))</math>, or <code>lgammal(x)</code>, function is necessary for the program to work with large <code>a</code> values, as [http://rosettacode.org/wiki/Gamma_function Gamma functions] can often return values larger than can be handled by <code>double</code> or <code>long double</code> data types. The <code>lgammal(x)</code> function is standard in <code>math.h</code> under the C99 and C11 standards.
 
For a 1-tail p-value in the direction of the observed difference, halve the 2-tail value: the 2-tail p-value is simply twice the 1-tail value.
=={{header|C}}==
{{works with|C99}}