Welch's t-test: Difference between revisions

(added reference, defined variables, still more to do on task description)
Line 3:
Given two lists of data, calculate the [[wp:p-value|p-value]] used for null hypothesis testing.
 
'''Task Description'''<br>
 
Given two sets of data, calculate the p-value:
x = {3.0,4.0,1.0,2.1}
y = {490.2,340.0,433.9}
 
 
Your task is to discern whether or not the difference in means between the two sets is statistically significant and worth further investigation. The p-value is the probability, assuming the two sets actually share the same mean, of observing a difference in means at least as large as the one measured; a small p-value suggests that the difference is unlikely to be due to chance alone. A threshold level, alpha, is usually chosen, typically 0.01 or 0.05, where p-values below alpha are worth further investigation and p-values above alpha are considered not significant. The p-value is not considered a final test of significance, [http://www.nature.com/news/scientific-method-statistical-errors-1.14700 only whether the given variable should be given further consideration].
 
This uses [[wp:Welch's_t_test|Welch's t-test]], which does not assume that the two sets have equal variances. Welch's t-test statistic can be computed thus:
Line 55 ⟶ 62:
 
<math> p = \frac{1}{2}\times\frac{\int_0^\frac{\nu}{t^2+\nu} \frac{r^{\frac{\nu}{2}-1}}{\sqrt{1-r}}\,\mathrm{d}r}{ \exp\left(\ln(\Gamma(\frac{\nu}{2})) + \ln(\Gamma(0.5)) - \ln(\Gamma(\frac{\nu}{2}+0.5))\right) }</math>
 
The definite integral can be approximated with [[wp:Simpson's_rule|Simpson's Rule]] but other methods are also acceptable.
 
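The lgamma function (the natural logarithm of the gamma function) is necessary for the program to work with large <code>a</code> values, where the gamma function itself would overflow a floating-point number.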
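The overflow can be seen in a short Python sketch. The value 500 is an arbitrary large argument chosen for illustration, and the variable names are assumptions; only the combination of gamma terms is taken from the denominator of the formula for p.

```python
import math

a = 500.0  # hypothetical large argument, chosen only to trigger the overflow

# Evaluating the gamma function directly overflows a double for large arguments.
try:
    math.gamma(a)
    overflowed = False
except OverflowError:
    overflowed = True

# Working with logarithms keeps every intermediate value small;
# only the final, well-scaled result is exponentiated.
log_denominator = math.lgamma(a) + math.lgamma(0.5) - math.lgamma(a + 0.5)
denominator = math.exp(log_denominator)

print(overflowed, denominator)
```

Summing the logarithms first and exponentiating once at the end is the same arrangement as the <code>exp</code> of the three <code>ln(Γ(...))</code> terms in the formula.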
 
The formula for <math>p</math> above gives the 1-tail p-value. The 2-tail p-value is simply twice the 1-tail value.
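As a minimal sketch of the whole calculation, the Python below applies the incomplete-beta ratio (the integral divided by the exponential of the lgamma terms) to the x and y data sets from the task. The t statistic and the degrees of freedom ν are computed with the standard Welch and Welch-Satterthwaite expressions, which are not spelled out in this text. The function names <code>simpson</code> and <code>welch_p_values</code>, the choice of 10000 subintervals, and the restriction to ν ≥ 2 and t ≠ 0 are assumptions made for illustration.

```python
import math
from statistics import mean, variance


def simpson(f, lo, hi, n=10000):
    """Composite Simpson's rule on [lo, hi] with n subintervals (made even)."""
    if n % 2:
        n += 1
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3


def welch_p_values(x, y):
    """Return (t, nu, one_tail_p, two_tail_p) for two lists of numbers.

    Assumes nu >= 2 and t != 0, so the integrand stays finite at both
    ends of the integration range.
    """
    # Squared standard errors of the two means (sample variance / size).
    vx = variance(x) / len(x)
    vy = variance(y) / len(y)
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    nu = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    if nu < 2:
        raise ValueError("this sketch only handles nu >= 2")

    a = nu / 2
    upper = nu / (t * t + nu)
    integral = simpson(lambda r: r ** (a - 1) / math.sqrt(1 - r), 0.0, upper)
    # lgamma keeps the denominator finite even when a is large.
    log_beta = math.lgamma(a) + math.lgamma(0.5) - math.lgamma(a + 0.5)
    one_tail = 0.5 * integral / math.exp(log_beta)
    return t, nu, one_tail, 2 * one_tail


x = [3.0, 4.0, 1.0, 2.1]
y = [490.2, 340.0, 433.9]
t, nu, p_one, p_two = welch_p_values(x, y)
print(f"t = {t:.4f}, nu = {nu:.4f}, 1-tail p = {p_one:.6f}, 2-tail p = {p_two:.6f}")
```

For this data the 2-tail p-value comes out at roughly 0.0108, which is below an alpha of 0.05 but slightly above an alpha of 0.01.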