Permutation test
You are encouraged to solve this task according to the task description, using any language you may know.
A new medical treatment was tested on a population of volunteers, with each volunteer randomly assigned either to a group of treatment subjects, or to a group of control subjects. Members of the treatment group were given the treatment, and members of the control group were given a placebo. The effect of the treatment or placebo on each volunteer was measured and reported in this table.
Treatment group | Control group |
---|---|
85 | 68 |
88 | 41 |
75 | 10 |
66 | 49 |
25 | 16 |
29 | 65 |
83 | 32 |
39 | 92 |
97 | 28 |
98 |
Write a program that performs a permutation test to judge whether the treatment had a significantly stronger effect than the placebo.
- Do this by considering every possible alternative assignment from the same pool of volunteers to a treatment group of size and a control group of size (i.e., the same group sizes used in the actual experiment but with the group members chosen differently), while assuming that each volunteer's effect remains constant regardless.
- Note that the number of alternatives will be the binomial coefficient .
- Compute the mean effect for each group and the difference in means between the groups in every case by subtracting the mean of the control group from the mean of the treatment group.
- Report the percentage of alternative groupings for which the difference in means is less or equal to the actual experimentally observed difference in means, and the percentage for which it is greater.
- Note that they should sum to 100%.
Extremely dissimilar values are evidence of an effect not entirely due to chance, but your program need not draw any conclusions.
You may assume the experimental data are known at compile time if that's easier than loading them at run time. Test your solution on the data given above.
Ada
<lang Ada>with Ada.Text_IO; with Iterate_Subsets;
procedure Permutation_Test is
type Group_Type is array(Positive range <>) of Positive;
Treat_Group: constant Group_Type := (85, 88, 75, 66, 25, 29, 83, 39, 97); Ctrl_Group: constant Group_Type := (68, 41, 10, 49, 16, 65, 32, 92, 28, 98);
package Iter is new Iterate_Subsets(Treat_Group'Length, Ctrl_Group'Length);
Full_Group: constant Group_Type(1 .. Iter.All_Elements) := Treat_Group & Ctrl_Group;
function Mean(S: Iter.Subset) return Float is Sum: Natural := 0; begin for I in S'Range loop Sum := Sum + Full_Group(S(I)); end loop; return Float(Sum)/Float(S'Length); end Mean;
package FIO is new Ada.Text_IO.Float_IO(Float);
T_Avg: Float := Mean(Iter.First); S_Avg: Float; S: Iter.Subset := Iter.First; Equal: Positive := 1; -- Mean(Iter'First) = Mean(Iter'First) Higher: Natural := 0; Lower: Natural := 0;
begin -- Permutation_Test;
-- first, count the subsets with a higher, an equal or a lower mean loop Iter.Next(S); S_Avg := Mean(S); if S_Avg = T_Avg then Equal := Equal + 1; elsif S_Avg >= T_Avg then Higher := Higher + 1; else Lower := Lower + 1; end if; exit when Iter.Last(S); end loop;
-- second, output the results declare use Ada.Text_IO; Sum: Float := Float(Higher + Equal + Lower); begin Put("Less or Equal: "); FIO.Put(100.0*Float(Lower+Equal) / Sum, Fore=>3, Aft=>1, Exp=>0); Put(Integer'Image(Lower+Equal)); New_Line; Put("More: "); FIO.Put(100.0*Float(Higher) / Sum, Fore=>3, Aft=>1, Exp=>0); Put(Integer'Image(Higher)); New_Line; end;
end Permutation_Test;</lang>
This solution uses an auxiliary package Iterate_Subsets. Here is the Spec: <lang Ada>generic
Subset_Size, More_Elements: Positive;
package Iterate_Subsets is
All_Elements: Positive := Subset_Size + More_Elements; subtype Index is Integer range 1 .. All_Elements; type Subset is array (1..Subset_Size) of Index;
-- iterate over all subsets of size Subset_Size -- from the set {1, 2, ..., All_Element}
function First return Subset; procedure Next(S: in out Subset); function Last(S: Subset) return Boolean;
end Iterate_Subsets; </lang>
And here is the implementation:
<lang Ada>package body Iterate_Subsets is
function First return Subset is S: Subset; begin for I in S'Range loop S(I) := I; end loop; return S; end First;
procedure Next(S: in out Subset) is I: Natural := S'Last; begin if S(I) < Index'Last then S(I) := S(I) + 1; else while S(I-1)+1 = S(I) loop I := I - 1; end loop; S(I-1) := S(I-1) + 1; for J in I .. S'Last loop S(J) := S(J-1) + 1; end loop; end if; return; end Next;
function Last(S: Subset) return Boolean is begin return S(S'First) = Index'Last-S'Length+1; end Last;
end Iterate_Subsets;</lang>
Finally, here is the output:
Less or Equal: 87.2 80551 More: 12.8 11827
BBC BASIC
<lang bbcbasic> ntreated% = 9
nplacebo% = 10 DIM results%(ntreated% + nplacebo% - 1) results%() = 85, 88, 75, 66, 25, 29, 83, 39, 97, \ REM treated group \ 68, 41, 10, 49, 16, 65, 32, 92, 28, 98 : REM placebo group greater% = 0 FOR comb% = 0 TO 2^(ntreated%+nplacebo%)-1 IF FNnbits(comb%) = ntreated% THEN tsum% = 0 : psum% = 0 FOR b% = 0 TO ntreated%+nplacebo%-1 IF comb% AND 2^b% THEN tsum% += results%(b%) ELSE psum% += results%(b%) ENDIF NEXT meandiff = tsum%/ntreated% - psum%/nplacebo% IF comb% = 2^ntreated% - 1 THEN actual = meandiff ELSE greater% -= meandiff > actual groups% += 1 ENDIF ENDIF NEXT percent = 100 * greater%/groups% PRINT "Percentage groupings <= actual experiment: "; 100 - percent PRINT "Percentage groupings > actual experiment: "; percent END DEF FNnbits(N%) N% -= N% >>> 1 AND &55555555 N% = (N% AND &33333333) + (N% >>> 2 AND &33333333) N% = (N% + (N% >>> 4)) AND &0F0F0F0F N% += N% >>> 8 : N% += N% >>> 16 = N% AND &7F</lang>
Output:
Percentage groupings <= actual experiment: 87.1970296 Percentage groupings > actual experiment: 12.8029704
C
<lang C>#include <stdio.h>
int data[] = { 85, 88, 75, 66, 25, 29, 83, 39, 97,
68, 41, 10, 49, 16, 65, 32, 92, 28, 98 };
int pick(int at, int remain, int accu, int treat) {
if (!remain) return (accu > treat) ? 1 : 0;
return pick(at - 1, remain - 1, accu + data[at - 1], treat) + ( at > remain ? pick(at - 1, remain, accu, treat) : 0 );
}
int main() {
int treat = 0, i; int le, gt; double total = 1; for (i = 0; i < 9; i++) treat += data[i]; for (i = 19; i > 10; i--) total *= i; for (i = 9; i > 0; i--) total /= i;
gt = pick(19, 9, 0, treat); le = total - gt;
printf("<= : %f%% %d\n > : %f%% %d\n", 100 * le / total, le, 100 * gt / total, gt); return 0;
}</lang>Output:<lang><= : 87.197168% 80551
> : 12.802832% 11827</lang>
Common Lisp
<lang lisp>(defun perm-test (s1 s2)
(let ((more 0) (leq 0)
(all-data (append s1 s2)) (thresh (apply #'+ s1)))
(labels ((recur (data sum need avail)
(cond ((zerop need) (if (>= sum thresh) (incf more) (incf leq))) ((>= avail need) (recur (cdr data) sum need (1- avail)) (recur (cdr data) (+ sum (car data)) (1- need) (1- avail))))))
(recur all-data 0 (length s1) (length all-data)) (cons more leq))))
(let* ((a (perm-test '(68 41 10 49 16 65 32 92 28 98) '(85 88 75 66 25 29 83 39 97)))
(x (car a)) (y (cdr a)) (s (+ x y))) (format t "<=: ~a ~6f%~% >: ~a ~6f%~%"
x (* 100e0 (/ x s)) y (* 100e0 (/ y s))))</lang>output<lang><=: 80551 87.197%
>: 11827 12.803%</lang>
D
<lang d>import std.stdio, std.algorithm, std.array, combinations3;
auto permutationTest(T)(in T[] a, in T[] b) pure nothrow @safe {
immutable tObs = a.sum; auto combs = combinations!false(a ~ b, a.length); immutable under = combs.count!(perm => perm.sum <= tObs); return under * 100.0 / combs.length;
}
void main() {
immutable treatmentGroup = [85, 88, 75, 66, 25, 29, 83, 39, 97]; immutable controlGroup = [68, 41, 10, 49, 16, 65, 32, 92, 28, 98]; immutable under = permutationTest(treatmentGroup, controlGroup); writefln("Under =%6.2f%%\nOver =%6.2f%%", under, 100.0 - under);
}</lang>
- Output:
Under = 87.20% Over = 12.80%
Alternative version:
<lang d>import std.stdio, std.algorithm, std.range;
void main() {
immutable treatment = [85, 88, 75, 66, 25, 29, 83, 39, 97]; immutable control = [68, 41, 10, 49, 16, 65, 32, 92, 28, 98]; immutable both = treatment ~ control; immutable sTreat = treatment.sum;
T pick(T)(in size_t at, in size_t remain, in T accu) pure nothrow { if (remain == 0) return accu > sTreat;
return pick(at - 1, remain - 1, accu + both[at - 1]) + (at > remain ? pick(at - 1, remain, accu) : 0); }
alias mul = reduce!q{a * b}; immutable t = mul(1.0, iota(both.length, treatment.length + 1, -1)) .reduce!q{a / b}(iota(treatment.length, 0, -1)); immutable gt = pick(both.length, treatment.length, 0); immutable le = cast(int)(t - gt); writefln(" > : %2.2f%% %d", 100.0 * gt / t, gt); writefln("<= : %2.2f%% %d", 100.0 * le / t, le);
}</lang>
- Output:
> : 12.80% 11827 <= : 87.20% 80551
GAP
<lang gap>a := [85, 88, 75, 66, 25, 29, 83, 39, 97]; b := [68, 41, 10, 49, 16, 65, 32, 92, 28, 98];
- Compute a decimal approximation of a rational
Approx := function(x, d) local neg, a, b, n, m, s; if x < 0 then x := -x; neg := true; else neg := false; fi; a := NumeratorRat(x); b := DenominatorRat(x); n := QuoInt(a, b); a := RemInt(a, b); m := 10^d; s := ""; if neg then Append(s, "-"); fi; Append(s, String(n)); n := Size(s) + 1; Append(s, String(m + QuoInt(a*m, b))); s[n] := '.'; return s; end;
PermTest := function(a, b) local c, d, p, q, u, v, m, n, k, diff, all; p := Size(a); q := Size(b); v := Concatenation(a, b); n := p + q; m := Binomial(n, p); diff := Sum(a)/p - Sum(b)/q; all := [1 .. n]; k := 0; for u in Combinations(all, p) do c := List(u, i -> v[i]); d := List(Difference(all, u), i -> v[i]); if Sum(c)/p - Sum(d)/q > diff then k := k + 1; fi; od; return [Approx((1 - k/m)*100, 3), Approx(k/m*100, 3)]; end;
- in order, % less or greater than original diff
PermTest(a, b); [ "87.197", "12.802" ]</lang>
Go
A version doing all math in integers until computing final percentages. <lang go>package main
import "fmt"
var tr = []int{85, 88, 75, 66, 25, 29, 83, 39, 97} var ct = []int{68, 41, 10, 49, 16, 65, 32, 92, 28, 98}
func main() {
// collect all results in a single list all := make([]int, len(tr)+len(ct)) copy(all, tr) copy(all[len(tr):], ct)
// compute sum of all data, useful as intermediate result var sumAll int for _, r := range all { sumAll += r }
// closure for computing scaled difference. // compute results scaled by len(tr)*len(ct). // this allows all math to be done in integers. sd := func(trc []int) int { var sumTr int for _, x := range trc { sumTr += all[x] } return sumTr*len(ct) - (sumAll-sumTr)*len(tr) }
// compute observed difference, as an intermediate result a := make([]int, len(tr)) for i, _ := range a { a[i] = i } sdObs := sd(a)
// iterate over all combinations. for each, compute (scaled) // difference and tally whether leq or gt observed difference. var nLe, nGt int comb(len(all), len(tr), func(c []int) { if sd(c) > sdObs { nGt++ } else { nLe++ } })
// print results as percentage pc := 100 / float64(nLe+nGt) fmt.Printf("differences <= observed: %f%%\n", float64(nLe)*pc) fmt.Printf("differences > observed: %f%%\n", float64(nGt)*pc)
}
// combination generator, copied from combination task func comb(n, m int, emit func([]int)) {
s := make([]int, m) last := m - 1 var rc func(int, int) rc = func(i, next int) { for j := next; j < n; j++ { s[i] = j if i == last { emit(s) } else { rc(i+1, j+1) } } return } rc(0, 0)
}</lang> Output:
differences <= observed: 87.197168% differences > observed: 12.802832%
Haskell
<lang haskell>binomial n m = (f !! n) `div` (f !! m) `div` (f !! (n - m)) where f = scanl (*) 1 [1..]
permtest treat ctrl = (fromIntegral less) / (fromIntegral total) * 100 where total = binomial (length avail) (length treat) less = combos (sum treat) (length treat) avail avail = ctrl ++ treat combos total n a@(x:xs) | total < 0 = binomial (length a) n | n == 0 = 0 | n > length a = 0 | n == length a = fromEnum (total < sum a) | otherwise = combos (total - x) (n - 1) xs + combos total n xs
main = let r = permtest [85, 88, 75, 66, 25, 29, 83, 39, 97] [68, 41, 10, 49, 16, 65, 32, 92, 28, 98] in do putStr "> : "; print r putStr "<=: "; print $ 100 - r</lang>
output:
> : 12.80283184307952 <=: 87.19716815692048
Somewhat faster, this goes from top down: <lang haskell>binomial n m = (f !! n) `div` (f !! m) `div` (f !! (n - m)) where f = scanl (*) 1 [1..]
perms treat ctrl = (less,total) where total = binomial (length ctrl + length treat) (length treat) less = length $ filter (<= sum treat) $ sums (treat ++ ctrl) (length treat) sums x n | l < n || n < 0 = [] | n == 0 = [0] | l == n = [sum x] | otherwise = [a + b | i <- [0..n], a <- sums left i, b <- sums right (n - i)] where (l, l1) = (length x, l `div` 2) (left, right) = splitAt l1 x
main = print $ (lt, 100 - lt) where (a, b) = perms [85, 88, 75, 66, 25, 29, 83, 39, 97] [68, 41, 10, 49, 16, 65, 32, 92, 28, 98] lt = (fromIntegral a) / (fromIntegral b) * 100</lang>
In cases where the sample data are a large number of relatively small positive integers, counting number of partial sums is a lot faster: <lang haskell>combs maxsum len x = foldl f [(0,0,1)] x where f a n = merge a (map (addNum n) $ filter (\(l,_,_) -> l < len) a) addNum n (a,s,c) -- anything larger than maxsum is as good as infinity | s + n > maxsum = (a+1, maxsum + 1, c) | otherwise = (a+1, s+n, c)
merge a [] = a merge [] a = a merge a@((a1,a2,a3):as) b@((b1,b2,b3):bs) | a1 == b1 && a2 == b2 = (a1,a2,a3+b3):merge as bs | a1 < b1 || (a1 == b1 && a2 < b2) = (a1,a2,a3):merge as b | otherwise = (b1,b2,b3):merge a bs
permtest a b = (lt, ge) where lt = sum $ map (\(a,b,c) -> if a == la && b < sa then c else 0) $ combs sa la (a++b) ge = (binomial (la + lb) la) - lt (sa, la, lb) = (sum a, length a, length b)
binomial n m = (f !! n) `div` (f !! m) `div` (f !! (n - m)) where f = scanl (*) 1 [1..]
-- how many combinations are less than current sum main = print$ permtest [85, 88, 75, 66, 25, 29, 83, 39, 97] [68, 41, 10, 49, 16, 65, 32, 92, 28, 98]</lang>
J
<lang j>require'stats' trmt=: 0.85 0.88 0.75 0.66 0.25 0.29 0.83 0.39 0.97 ctrl=: 0.68 0.41 0.1 0.49 0.16 0.65 0.32 0.92 0.28 0.98 difm=: -&mean result=: trmt difm ctrl all=: trmt(#@[ ({. difm }.) |:@([ (comb ~.@,"1 i.@])&# ,) { ,) ctrl smoutput 'under: ','%',~":100*mean all <: result smoutput 'over: ','%',~":100*mean all > result</lang>
Result:
<lang>under: 87.1972% over: 12.8028%</lang>
jq
Part 1: Combinations <lang jq># combination(r) generates a stream of combinations of r items from the input array. def combination(r):
if r > length or r < 0 then empty elif r == length then . else ( [.[0]] + (.[1:]|combination(r-1))), ( .[1:]|combination(r)) end;
</lang> Part 2: Permutation Test <lang jq># a should be the treatment group and b the control group: def permutationTest(a; b):
# avg(a) - avg(b) (assuming ab==a+b) def statistic(ab; a): (ab | add) as $sumab | (a | add) as $suma | ($suma / (a|length)) - (($sumab - $suma) / ((ab|length) - (a|length))); (a + b) as $ab # pooled observations | statistic($ab; a) as $t_observed # observed difference in means | reduce ($ab|combination(a|length)) as $perm # for each combination... ([0,0]; # state: [under,count] if statistic($ab; $perm) <= $t_observed then .[0] += 1 else . end | .[1] += 1 ) | .[0] * 100.0 / .[1] # under/count
- </lang>
Example: <lang jq>def treatmentGroup: [85, 88, 75, 66, 25, 29, 83, 39, 97]; def controlGroup: [68, 41, 10, 49, 16, 65, 32, 92, 28, 98];
permutationTest(treatmentGroup; controlGroup) as $under | "under=\($under)", "over=\(100 - $under)"</lang>
- Output:
$ jq -n -r -f permutation_test.jq under=87.19716815692048 over=12.802831843079517
Mathematica
<lang mathematica>"<=: " <> ToString[#1] <> " " <> ToString[100. #1/#2] <> "%\n >: " <>
ToString[#2 - #1] <> " " <> ToString[100. (1 - #1/#2)] <> "%" &[ Count[Total /@ Subsets[Join[#1, #2], {Length@#1}], n_ /; n <= Total@#1], Binomial[Length@#1 + Length@#2, Length@#1]] &[{85, 88, 75, 66, 25, 29, 83, 39, 97}, {68, 41, 10, 49, 16, 65, 32, 92, 28, 98}]</lang>
Output:
<=: 80551 87.1972% >: 11827 12.8028%
PicoLisp
<lang PicoLisp>(load "@lib/simul.l") # For 'subsets'
(scl 2)
(de _stat (A)
(let (LenA (length A) SumA (apply + A)) (- (*/ SumA LenA) (*/ (- SumAB SumA) (- LenAB LenA)) ) ) )
(de permutationTest (A B)
(let (AB (append A B) SumAB (apply + AB) LenAB (length AB) Tobs (_stat A) Count 0 ) (*/ (sum '((Perm) (inc 'Count) (and (>= Tobs (_stat Perm)) 1) ) (subsets (length A) AB) ) 100.0 Count ) ) )
(setq
*TreatmentGroup (0.85 0.88 0.75 0.66 0.25 0.29 0.83 0.39 0.97) *ControlGroup (0.68 0.41 0.10 0.49 0.16 0.65 0.32 0.92 0.28 0.98) )
(let N (permutationTest *TreatmentGroup *ControlGroup)
(prinl "under = " (round N) "%, over = " (round (- 100.0 N)) "%") )</lang>
Output:
under = 87.85%, over = 12.15%
Perl
<lang perl>#!/usr/bin/perl use warnings; use strict;
use List::Util qw{ sum };
sub means {
my @groups = @_; return map sum(@$_) / @$_, @groups;
}
sub following {
my $pattern = shift; my $orig_count = grep $_, @$pattern; my $count; do { my $i = $#{$pattern}; until (0 > $i) { $pattern->[$i] = $pattern->[$i] ? 0 : 1; last if $pattern->[$i]; --$i; } $count = grep $_, @$pattern; } until $count == $orig_count or not $count; undef @$pattern unless $count;
}
my @groups;
my $i = 0;
while () {
chomp; $i++, next if /^$/; push @{ $groups[$i] }, $_;
}
my @orig_means = means(@groups); my $orig_cmp = $orig_means[0] - $orig_means[1];
my $pattern = [ (0) x @{ $groups[0] },
(1) x @{ $groups[1] } ];
my @cmp = (0) x 3; while (@$pattern) {
my @perms = map { my $g = $_; [ (@{ $groups[0] }, @{ $groups[1] } ) [ grep $pattern->[$_] == $g, 0 .. $#{$pattern} ] ]; } 0, 1; my @means = means(@perms); $cmp[ ($means[0] - $means[1]) <=> $orig_cmp ]++;
} continue {
following($pattern);
} my $all = sum(@cmp); my $length = length $all; for (0, -1, 1) {
printf "%-7s %${length}d %6.3f%%\n", (qw(equal greater less))[$_], $cmp[$_], 100 * $cmp[$_] / $all;
}
__DATA__
85
88
75
66
25
29
83
39
97
68 41 10 49 16 65 32 92 28 98</lang>
- Output:
equal 313 0.339% less 80238 86.858% greater 11827 12.803%
Perl 6
<lang perl6>proto combine (Int, @) {*}
multi combine (0, @) { [] } multi combine ($, []) { () } multi combine ($n, [$head, *@tail]) {
gather {
take [$head, @$_] for combine($n-1, @tail); take [ @$_ ] for combine($n , @tail);
}
}
sub stats ( @test, @all ) {
(([+] @test) / +@test ) - ([+] @all, (@test X* -1)) / (@all - @test)
}
my @treated = <85 88 75 66 25 29 83 39 97>;
my @control = <68 41 10 49 16 65 32 92 28 98>;
my @all = @treated, @control;
my $base = stats( @treated, @all );
my @trials = 0, 0, 0;
map { @trials[ 1 + ( stats( $_, @all ) <=> $base ) ]++ }, combine( +@treated, @all );
say 'Counts: <, =, > ', @trials; say 'Less than : %', 100 * @trials[0] / [+] @trials; say 'Equal to : %', 100 * @trials[1] / [+] @trials; say 'Greater than : %', 100 * @trials[2] / [+] @trials; say 'Less or Equal: %', 100 * ( [+] @trials[0,1] ) / [+] @trials;</lang>
Output
Counts: <, =, > 80238 313 11827 Less than : %86.858343 Equal to : %0.338825 Greater than : %12.802832 Less or Equal: %87.197168
PureBasic
Given a treatment group with [n=9] and a control group with [m=10]. The numbers [x] from [1] to [1<<(n+m)] exhaust the possible states.
Any bit-String of Length [n+m] containing [n=9] "1's" is a Valid bit String, as tested by: IsValidBitString(x,n+m,n).
Then we can use these bits to Select whether a particular index For our array should be assigned to: the treatment group or the control group
<lang PureBasic>
Define.f meanTreated,meanControl,diffInMeans Define.f actualmeanTreated,actualmeanControl,actualdiffInMeans
Dim poolA(19)
poolA(1) =85 ; first 9 the treated poolA(2) =88 poolA(3) =75 poolA(4) =66 poolA(5) =25 poolA(6) =29 poolA(7) =83 poolA(8) =39 poolA(9) =97
poolA(10) =68 ; last 10 the control poolA(11) =41 poolA(12) =10 poolA(13) =49 poolA(14) =16 poolA(15) =65 poolA(16) =32 poolA(17) =92 poolA(18) =28 poolA(19) =98
Procedure.i IsValidBitString(x,pool,treated) Protected c,i For i=1 to pool mask=1<<(i-1) If mask&x:c+1:EndIf Next If c=treated :ProcedureReturn x Else :ProcedureReturn 0 EndIf EndProcedure
treated=9 control=10
pool =treated+control
- actual Experimentally observed difference in means
For i=1 to Treated sumTreated+poolA(i) Next For i=Treated+1 to Treated+Control sumControl+poolA(i) Next
actualmeanTreated=sumTreated /Treated actualmeanControl=sumControl /Control actualdiffInMeans=actualmeanTreated-actualmeanControl
- exhaust the possibilites
For x=1 to 1<<pool
- Valid? i.e. are there 9 "1's" ?
If IsValidBitString(x,pool,treated) TotalComBinations+1:sumTreated=0:sumControl=0
- separate the groups
For i=pool to 1 Step -1 mask=1<<(i-1):idx=pool-i+1 If mask&x sumTreated+poolA(idx) Else sumControl+poolA(idx) EndIf Next
meanTreated=sumTreated /Treated meanControl=sumControl /Control diffInMeans=meanTreated-meanControl
- gather the statistics
If (diffInMeans)<=(actualdiffInMeans) diffLessOrEqual+1 Else diffGreater+1 EndIf
EndIf Next
- show our results
- cw(StrF(100*diffLessOrEqual/TotalComBinations,2)+" "+Str(diffLessOrEqual))
- cw(StrF(100*diffGreater /TotalComBinations,2)+" "+Str(diffGreater))
Debug StrF(100*diffLessOrEqual/TotalComBinations,2)+" "+Str(diffLessOrEqual) Debug StrF(100*diffGreater /TotalComBinations,2)+" "+Str(diffGreater)
</lang>
Sample Output
87.20 80551 12.80 11827
Python
<lang python>from itertools import combinations as comb
def statistic(ab, a):
sumab, suma = sum(ab), sum(a) return ( suma / len(a) - (sumab -suma) / (len(ab) - len(a)) )
def permutationTest(a, b):
ab = a + b Tobs = statistic(ab, a) under = 0 for count, perm in enumerate(comb(ab, len(a)), 1): if statistic(ab, perm) <= Tobs: under += 1 return under * 100. / count
treatmentGroup = [85, 88, 75, 66, 25, 29, 83, 39, 97] controlGroup = [68, 41, 10, 49, 16, 65, 32, 92, 28, 98] under = permutationTest(treatmentGroup, controlGroup) print("under=%.2f%%, over=%.2f%%" % (under, 100. - under))</lang>
- Output:
under=89.11%, over=10.89%
The above solution does a different thing than the other solutions. I'm not really sure why. If you want to do the same thing as the other solutions, then this is the solution: <lang python>from itertools import combinations as comb
def permutationTest(a, b):
ab = a + b Tobs = sum(a) under = 0 for count, perm in enumerate(comb(ab, len(a)), 1): if sum(perm) <= Tobs: under += 1 return under * 100. / count
treatmentGroup = [85, 88, 75, 66, 25, 29, 83, 39, 97] controlGroup = [68, 41, 10, 49, 16, 65, 32, 92, 28, 98] under = permutationTest(treatmentGroup, controlGroup) print("under=%.2f%%, over=%.2f%%" % (under, 100. - under))</lang>
- Output:
under=87.20%, over=12.80%
REXX
This REXX program is modeled after the C version, with some generalizations and optimization. <lang rexx>/*REXX program does a permutation test on N + M subjects (volunteers):*/ /* ↑ ↑ */ /* │ │ */ /* │ └─────control population*/ /* └────────treatment population*/ n=9 data=85 88 75 66 25 29 83 39 97 68 41 10 49 16 65 32 92 28 98 w=words(data); m=w-n say 'volunteer population given treatment:' right(n,length(w)) say ' control population given a placebo:' right(m,length(w)) say say 'treatment population efficacy % (percentages):' subword(data,1,n) say ' control population placebo % (percentages):' subword(data,n+1) say
do v= 0 for w ; #.v=word(data,v+1) ; end
treat=0; do i= 0 to n-1 ; treat=treat+#.i ; end total=1; do j=19 to m+1 by -1 ; total=total*j ; end
do k= 9 to 1 by -1 ; total=total/k ; end
gt=pick(n+m, n, 0) le=total-gt say "<= " format(100*le/total,,3)'%' le /*show 3 decimal places.*/ say " > " format(100*gt/total,,3)'%' gt exit /*stick a fork in it, we're done.*/ /*──────────────────────────────────PICK subroutine─────────────────────*/ pick: procedure expose #. treat; parse arg it,rest,eff if rest==0 then return eff>treat if it>rest then q=pick(it-1, rest, eff)
else q=0
itP=it-1 return pick(itP, rest-1, eff+#.itP) + q</lang> output using the default input
volunteer population given treatment: 9 control population given a placebo: 10 treatment population efficacy % (percentages): 85 88 75 66 25 29 83 39 97 control population placebo % (percentages): 68 41 10 49 16 65 32 92 28 98 <= 87.197% 80551 > 12.803% 11827
R
<lang r>permutation.test <- function(treatment, control) {
perms <- combinations(length(treatment)+length(control), length(treatment), c(treatment, control), set=FALSE) p <- mean(rowMeans(perms) <= mean(treatment)) c(under=p, over=(1-p))
}</lang>
<lang r>> permutation.test(c(85, 88, 75, 66, 25, 29, 83, 39, 97), + c(68, 41, 10, 49, 16, 65, 32, 92, 28, 98))
under over
0.8719717 0.1280283 </lang>
Racket
<lang Racket>#lang racket/base
(define-syntax-rule (inc! x)
(set! x (add1 x)))
(define (permutation-test control-gr treatment-gr)
(let ([both-gr (append control-gr treatment-gr)] [threshold (apply + control-gr)] [more 0] [leq 0]) (let loop ([data both-gr] [sum 0] [needed (length control-gr)] [available (length both-gr)]) (cond [(zero? needed) (if (>= sum threshold) (inc! more) (inc! leq))] [(>= available needed) (loop (cdr data) sum needed (sub1 available)) (loop (cdr data) (+ sum (car data)) (sub1 needed) (sub1 available))] [else (void)])) (values more leq)))
(let-values ([(more leq) (permutation-test '(68 41 10 49 16 65 32 92 28 98)
'(85 88 75 66 25 29 83 39 97))]) (let ([sum (+ more leq)]) (printf "<=: ~a ~a%~n>: ~a ~a%~n" more (real->decimal-string (* 100. (/ more sum)) 2) leq (real->decimal-string (* 100. (/ leq sum)) 2))))
</lang>
Sample Output
<=: 80551 87.20% >: 11827 12.80%
Ruby
<lang ruby>def statistic(ab, a)
sumab, suma = ab.inject(:+).to_f, a.inject(:+).to_f suma / a.size - (sumab - suma) / (ab.size - a.size)
end
def permutationTest(a, b)
ab = a + b tobs = statistic(ab, a) under = count = 0 ab.combination(a.size) do |perm| under += 1 if statistic(ab, perm) <= tobs count += 1 end under * 100.0 / count
end
treatmentGroup = [85, 88, 75, 66, 25, 29, 83, 39, 97] controlGroup = [68, 41, 10, 49, 16, 65, 32, 92, 28, 98] under = permutationTest(treatmentGroup, controlGroup) puts "under=%.2f%%, over=%.2f%%" % [under, 100 - under]</lang>
- Output:
under=87.20%, over=12.80%
Seed7
<lang seed7>$ include "seed7_05.s7i";
include "float.s7i";
const array integer: treatmentGroup is [] (85, 88, 75, 66, 25, 29, 83, 39, 97); const array integer: controlGroup is [] (68, 41, 10, 49, 16, 65, 32, 92, 28, 98); const array integer: both is treatmentGroup & controlGroup;
const func integer: pick (in integer: at, in integer: remain, in integer: accu, in integer: treat) is func
result var integer: picked is 0; begin if remain = 0 then picked := ord(accu > treat); else picked := pick(at - 1, remain - 1, accu + both[at], treat); if at > remain then picked +:= pick(at - 1, remain, accu, treat); end if; end if; end func;
const proc: main is func
local var integer: experimentalResult is 0; var integer: treat is 0; var integer: total is 1; var integer: le is 0; var integer: gt is 0; var integer: i is 0; begin for experimentalResult range treatmentGroup do treat +:= experimentalResult; end for; total := 19 ! 10; # Binomial coefficient gt := pick(19, 9, 0, treat); le := total - gt; writeln("<= : " <& 100.0 * flt(le) / flt(total) digits 6 <& "% " <& le); writeln(" > : " <& 100.0 * flt(gt) / flt(total) digits 6 <& "% " <& gt); end func;</lang>
- Output:
<= : 87.197168% 80551 > : 12.802832% 11827
Tcl
<lang tcl>package require Tcl 8.5
- Difference of means; note that the first list must be the concatenation of
- the two lists (because this is cheaper to work with).
proc statistic {AB A} {
set sumAB [tcl::mathop::+ {*}$AB] set sumA [tcl::mathop::+ {*}$A] expr {
$sumA / double([llength $A]) - ($sumAB - $sumA) / double([llength $AB] - [llength $A])
}
}
- Selects all k-sized combinations from a list.
proc selectCombinationsFrom {k l} {
if {$k == 0} {return {}} elseif {$k == [llength $l]} {return [list $l]} set all {} set n [expr {[llength $l] - [incr k -1]}] for {set i 0} {$i < $n} {} { set first [lindex $l $i]
incr i
if {$k == 0} { lappend all $first
} else { foreach s [selectCombinationsFrom $k [lrange $l $i end]] { lappend all [list $first {*}$s] }
} } return $all
}
- Compute the permutation test value and its complement.
proc permutationTest {A B} {
set whole [concat $A $B] set Tobs [statistic $whole $A] set undercount 0 set overcount 0 set count 0 foreach perm [selectCombinationsFrom [llength $A] $whole] {
set t [statistic $whole $perm] incr count if {$t <= $Tobs} {incr undercount} else {incr overcount}
} set count [tcl::mathfunc::double $count] list [expr {$overcount / $count}] [expr {$undercount / $count}]
}</lang> Demonstration code: <lang tcl>set treatmentGroup {0.85 0.88 0.75 0.66 0.25 0.29 0.83 0.39 0.97} set controlGroup {0.68 0.41 0.10 0.49 0.16 0.65 0.32 0.92 0.28 0.98} lassign [permutationTest $treatmentGroup $controlGroup] over under puts [format "under=%.2f%%, over=%.2f%%" [expr {$under*100}] [expr {$over*100}]]</lang> Output:
under=86.90%, over=13.10%
Ursala
<lang Ursala>#import std
- import nat
- import flo
treatment_group = <85,88,75,66,25,29,83,39,97> control_group = <68,41,10,49,16,65,32,92,28,98>
f = # returns the fractions of alternative mean differences above and below the actual
float~*; -+
vid^~G(plus,~&)+ (not fleq@rlX)*|@htX; ~~ float+ length, minus*+ mean^~*C/~& ^DrlrjXS(~&l,choices)^/-- length@l+-
- show+
t = --* *-'%'@lrNCC printf/$'%0.2f' times/$100. f(treatment_group,control_group)</lang> output:
12.80% 87.20%
zkl
A solution that is not going to scale gracefully at all.
<lang zkl>fcn permutationTest(a,b){
ab := a.extend(b); tObs := a.sum(0); combs := Utils.Helpers.pickNFrom(a.len(),ab); // 92,378 under := combs.reduce('wrap(sum,perm){ sum+(perm.sum(0) <= tObs) },0); 100.0 * under / combs.len();
}
treatmentGroup := T(85, 88, 75, 66, 25, 29, 83, 39, 97); controlGroup := T(68, 41, 10, 49, 16, 65, 32, 92, 28, 98); under := permutationTest(treatmentGroup, controlGroup); println("Under =%6.2f%%\nOver =%6.2f%%".fmt(under, 100.0 - under));</lang>
- Output:
Under = 87.20% Over = 12.80%