Averages/Median

From Rosetta Code
Revision as of 16:53, 8 March 2010 by 84.165.241.111 (talk)
Task
Averages/Median
You are encouraged to solve this task according to the task description, using any language you may know.

Write a program to find the median value of a vector of floating-point numbers. The program need not handle the case where the vector is empty, but must handle the case where there are an even number of elements.

There are several approaches to this. One is to sort the elements, and then pick the one in the middle. Sorting would take at least O(n logn). Another would be to build a priority queue from the elements, and then extract half of the elements to get to the middle one(s). This would also take O(n logn). The best solution is to use the selection algorithm to find the median in O(n) time.

See also: Mean, Mode

AppleScript

<lang AppleScript>set alist to {1,2,3,4,5,6,7,8} set med to medi(alist)

on medi(alist)

set temp to {} set lcount to count every item of alist if lcount is equal to 2 then return (item (random number from 1 to 2) of alist) else if lcount is less than 2 then return item 1 of alist else --if lcount is greater than 2 set min to findmin(alist) set max to findmax(alist) repeat with x from 1 to lcount if x is not equal to min and x is not equal to max then set end of temp to item x of alist end repeat set med to medi(temp) end if return med

end medi

on findmin(alist)

set min to 1 set alength to count every item of alist repeat with x from 1 to alength if item x of alist is less than item min of alist then set min to x end repeat return min

end findmin

on findmax(alist)

set max to 1 set alength to count every item of alist repeat with x from 1 to alength if item x of alist is greater than item max of alist then set max to x end repeat return max

end findmax</lang>

AutoHotkey

Takes the lower of the middle two if length is even <lang AutoHotkey>seq = 4.1, 7.2, 1.7, 9.3, 4.4, 3.2, 5 MsgBox % median(seq, "`,")  ; 4.1

median(seq, delimiter) {

 Sort, seq, ND%delimiter%
 StringSplit, seq, seq, % delimiter
 median := Floor(seq0 / 2)
 Return seq%median%

}</lang>

C

<lang C>#include <stdio.h>

  1. include <stdlib.h>

typedef struct floatList {

   float *list;
   int   size;

} *FloatList;

int floatcmp( const void *a, const void *b) {

   if  (*(float *)a > *(float *)b)  return 1;
   else if (*(float *)a < *(float *)b) return -1;
   else return 0;

}

float median( FloatList fl ) {

   qsort( fl->list, fl->size, sizeof(float), floatcmp);
   return 0.5 * ( fl->list[fl->size/2] + fl->list[(fl->size-1)/2]);

}

int main() {

   static float floats1[] = { 5.1, 2.6, 6.2, 8.8, 4.6, 4.1 };
   static struct floatList flist1 = { floats1, sizeof(floats1)/sizeof(float) };
   static float floats2[] = { 5.1, 2.6, 8.8, 4.6, 4.1 };
   static struct floatList flist2 = { floats2, sizeof(floats2)/sizeof(float) };
   printf("flist1 median is %7.2f\n", median(&flist1)); /* 4.85 */
   printf("flist2 median is %7.2f\n", median(&flist2)); /* 4.60 */
   return 0;

}</lang>

C++

This function runs in linear time on average. <lang cpp>#include <algorithm>

// inputs must be random-access iterators of doubles // Note: this function modifies the input range template <typename Iterator> double median(Iterator begin, Iterator end) {

 // this is middle for odd-length, and "upper-middle" for even length
 Iterator middle = begin + (end - begin) / 2;
 // This function runs in O(n) on average, according to the standard
 std::nth_element(begin, middle, end);
 if ((end - begin) % 2 != 0) { // odd length
   return *middle;
 } else { // even length
   // the "lower middle" is the max of the lower half
   Iterator lower_middle = std::max_element(begin, middle);
   return (*middle + *lower_middle) / 2.0;
 }

}

  1. include <iostream>

int main() {

 double a[] = {4.1, 5.6, 7.2, 1.7, 9.3, 4.4, 3.2};
 double b[] = {4.1, 7.2, 1.7, 9.3, 4.4, 3.2};
 std::cout << median(a+0, a + sizeof(a)/sizeof(a[0])) << std::endl; // 4.4
 std::cout << median(b+0, b + sizeof(b)/sizeof(b[0])) << std::endl; // 4.25
 return 0;

}</lang>

C#

<lang csharp>using System; using System.Collections; using System.Collections.Generic; using System.Linq; using System.Linq.Parallel; namespace Test {

   class Program {
       static void Main(string[] args) {
           /*
            * We Use Linq To Determine The Median
            * It could be done ofcourse the Normal way
            */
           List<double> myList = new List<double>() { 1, 5, 3, 6, 4, 2 }; 
           var query =     from numbers in myList //select the numbers
                           orderby numbers ascending
                           select numbers;
           if (myList.Count % 2 == 0) { //we know its even
               int element = myList.Count / 2; ;
               double median = (double)((query.ElementAt(element - 1) + query.ElementAt(element))/2);
               Console.WriteLine(median);
           } else {
              //we know its odd
               double element = (double)myList.Count / 2;
               element = Math.Round(element, MidpointRounding.AwayFromZero);
               double median = (double)query.ElementAt((int)(element - 1));
               Console.WriteLine(median);
           }
           Console.ReadLine();
       }
   }

}</lang>

Clojure

Simple: <lang lisp>(defn median [ns]

 (let [ns (sort ns)
       cnt (count ns)
       mid (bit-shift-right cnt 1)]
   (if (odd? cnt)
     (nth ns mid)
     (/ (+ (nth ns mid) (nth ns (dec mid))) 2))))</lang>

Common Lisp

The recursive partitioning solution, without the median of medians optimization.

<lang lisp>((defun select-nth (n list predicate)

 "Select nth element in list, ordered by predicate, modifying list."
 (do ((pivot (pop list))
      (ln 0) (left '())
      (rn 0) (right '()))
     ((endp list)
      (cond
       ((< n ln) (select-nth n left predicate))
       ((eql n ln) pivot)
       ((< n (+ ln rn 1)) (select-nth (- n ln 1) right predicate))
       (t (error "n out of range."))))
   (if (funcall predicate (first list) pivot)
     (psetf list (cdr list)
            (cdr list) left
            left list
            ln (1+ ln))
     (psetf list (cdr list)
            (cdr list) right
            right list
            rn (1+ rn)))))

(defun median (list predicate)

 (select-nth (floor (length list) 2) list predicate))</lang>

D

<lang d>import std.stdio;

float getMedian(float[]nums) {

       // in-place sort
       nums.sort;
       if (nums.length&1) return nums[nums.length/2];
       return (nums[nums.length/2 - 1] + nums[nums.length/2])/2;

}

int main() {

       float[]f1 = [ 5.1, 2.6, 6.2, 8.8, 4.6, 4.1 ];
       float[]f2 = [ 5.1, 2.6, 8.8, 4.6, 4.1 ];
       writefln("Even median: %f",getMedian(f1));
       writefln("Odd median: %f",getMedian(f2));
       return 0;

}</lang>

E

TODO: Use the selection algorithm, whatever that is

<lang e>def median(list) {

   def sorted := list.sort()
   def count := sorted.size()
   def mid1 := count // 2
   def mid2 := (count - 1) // 2
   if (mid1 == mid2) {          # avoid inexact division
       return sorted[mid1]
   } else {
       return (sorted[mid1] + sorted[mid2]) / 2
   }

}</lang>

<lang e>? median([1,9,2])

  1. value: 2

? median([1,9,2,4])

  1. value: 3.0</lang>

Forth

This uses the O(n) algorithm derived from quicksort. <lang forth>-1 cells constant -cell

cell- -cell + ;

defer lessthan ( a@ b@ -- ? ) ' < is lessthan

mid ( l r -- mid ) over - 2/ -cell and + ;
exch ( addr1 addr2 -- ) dup @ >r over @ swap ! r> swap ! ;
part ( l r -- l r r2 l2 )
 2dup mid @ >r ( r: pivot )
 2dup begin
   swap begin dup @  r@ lessthan while cell+ repeat
   swap begin r@ over @ lessthan while cell- repeat
   2dup <= if 2dup exch >r cell+ r> cell- then
 2dup > until  r> drop ;

0 value midpoint

select ( l r -- )
 begin 2dup < while
   part
   dup  midpoint >= if nip nip ( l l2 ) else
   over midpoint <= if drop rot drop swap ( r2 r ) else
   2drop 2drop exit then then
 repeat 2drop ;
 
median ( array len -- m )
 1- cells over +  2dup mid to midpoint
 select           midpoint @ ;</lang>

<lang forth>create test 4 , 2 , 1 , 3 , 5 ,

test 4 median . \ 2 test 5 median . \ 3</lang>

Fortran

Works with: Fortran version 90 and later

<lang fortran>program Median_Test

 real            :: a(7) = (/ 4.1, 5.6, 7.2, 1.7, 9.3, 4.4, 3.2 /), &
                    b(6) = (/ 4.1, 7.2, 1.7, 9.3, 4.4, 3.2 /)
 print *, median(a)
 print *, median(b)

contains

 function median(a, found)
   real, dimension(:), intent(in) :: a
     ! the optional found argument can be used to check
     ! if the function returned a valid value; we need this
     ! just if we suspect our "vector" can be "empty"
   logical, optional, intent(out) :: found
   real :: median
   integer :: l
   real, dimension(size(a,1)) :: ac
   if ( size(a,1) < 1 ) then
      if ( present(found) ) found = .false.
   else
      ac = a
      ! this is not an intrinsic: peek a sort algo from
      ! Category:Sorting, fixing it to work with real if
      ! it uses integer instead.
      call sort(ac)
      l = size(a,1)
      if ( mod(l, 2) == 0 ) then
         median = (ac(l/2+1) + ac(l/2))/2.0
      else
         median = ac(l/2+1)
      end if
      if ( present(found) ) found = .true.
   end if
 end function median

end program Median_Test</lang>

Haskell

Library: [[:Category:hstats|hstats]][[Category:hstats]]

<lang haskell>> Math.Statistics.median [1,9,2,4] 3.0</lang>

And an implementation:

<lang haskell>median xs | even len = (mean . take 2 . drop (mid - 1)) ordered

         | otherwise = ordered !! mid
 where len = length xs
       mid = len `div` 2
       ordered = sort xs</lang>

TODO: use a better algorithm than sorting

J

The verb median is available from the stats/base addon and returns the mean of the two middle values for an even number of elements: <lang j> require 'stats/base'

 median 1 9 2 4

3</lang> The definition given in the addon script is: <lang j>midpt=: -:@<:@# median=: -:@(+/)@((<. , >.)@midpt { /:~)</lang>

If, for an even number of elements, both values were desired when those two values are distinct, then the following implementation would suffice: <lang j> median=: ~.@(<. , >.)@midpt { /:~

  median 1 9 2 4

2 4</lang>

Java

Works with: Java version 1.5+

Sorting: <lang java5>// Note: this function modifies the input list public static double median(List<Double> list){

  Collections.sort(list);
  return (list.get(list.size() / 2) + list.get((list.size() - 1) / 2)) / 2;

}</lang>

Works with: Java version 1.5+

Using priority queue (which sorts under the hood): <lang java5>public static double median2(List<Double> list){

  PriorityQueue<Double> pq = new PriorityQueue<Double>(list);
  int n = list.size();
  for (int i = 0; i < (n-1)/2; i++)
     pq.poll(); // discard first half
  if (n % 2 != 0) // odd length
     return pq.poll();
  else
     return (pq.poll() + pq.poll()) / 2.0;

}</lang>

JavaScript

<lang javascript>function median(ary) {

   if (ary.length == 0)
       return null;
   ary.sort(function (a,b){return a - b})
   var mid = Math.floor(ary.length / 2);
   if ((ary.length % 2) == 1)  // length is odd
       return ary[mid];
   else 
       return (ary[mid - 1] + ary[mid]) / 2;

}

median([]); // null median([5,3,4]); // 4 median([5,4,2,3]); // 3.5 median([3,4,1,-8.4,7.2,4,1,1.2]); // 2.1</lang>

Lua

<lang lua>function median (numlist)

   if type(numlist) ~= 'table' then return numlist end
   table.sort(numlist)
   if #numlist %2 == 0 then return (numlist[#numlist/2] + numlist[#numlist/2+1]) / 2 end
   return numlist[math.ceil(#numlist/2)]

end

print(median({4.1, 5.6, 7.2, 1.7, 9.3, 4.4, 3.2})) print(median({4.1, 7.2, 1.7, 9.3, 4.4, 3.2}))</lang>

Mathematica

Built-in function: <lang Mathematica>Median[{1, 5, 3, 2, 4}] Median[{1, 5, 3, 6, 4, 2}]</lang> gives back: <lang Mathematica>3 7/2</lang> Custom function: <lang Mathematica>mymedian[x_List]:=Module[{t=Sort[x],L=Length[x]},

If[Mod[L,2]==0,
 (tL/2+tL/2+1)/2
,
 t(L+1)/2
]

]</lang> Example of custom function: <lang Mathematica>mymedian[{1, 5, 3, 2, 4}] mymedian[{1, 5, 3, 6, 4, 2}]</lang> gives back: <lang Mathematica>3 7/2</lang>

MATLAB

If the input has an even number of elements, function returns the mean of the middle two values: <lang Matlab>function [medianValue] = findmedian(setOfValues)

  medianValue = median(setOfValues);</lang>

OCaml

<lang ocaml>(* note: this modifies the input array *) let median array =

 let len = Array.length array in
   Array.sort compare array;
   (array.((len-1)/2) +. array.(len/2)) /. 2.0;;

let a = [|4.1; 5.6; 7.2; 1.7; 9.3; 4.4; 3.2|];; median a;; let a = [|4.1; 7.2; 1.7; 9.3; 4.4; 3.2|];; median a;;</lang>

Octave

Of course Octave has its own median function we can use to check our implementation. The Octave's median function, however, does not handle the case you pass in a void vector. <lang octave>function y = median2(v)

 if (numel(v) < 1)
   y = NA;
 else
   sv = sort(v);
   l = numel(v);
   if ( mod(l, 2) == 0 )
     y = (sv(floor(l/2)+1) + sv(floor(l/2)))/2;
   else
     y = sv(floor(l/2)+1);
   endif
 endif

endfunction

a = [4.1, 5.6, 7.2, 1.7, 9.3, 4.4, 3.2]; b = [4.1, 7.2, 1.7, 9.3, 4.4, 3.2];

disp(median2(a))  % 4.4 disp(median(a)) disp(median2(b))  % 4.25 disp(median(b))</lang>

Oz

<lang oz>declare

 fun {Median Xs}
    Len = {Length Xs}
    Mid = Len div 2 + 1 %% 1-based index
    Sorted = {Sort Xs Value.'<'}
 in
    if {IsOdd Len} then {Nth Sorted Mid}
    else ({Nth Sorted Mid} + {Nth Sorted Mid-1}) / 2.0
    end
 end

in

 {Show {Median [4.1 5.6 7.2 1.7 9.3 4.4 3.2]}}
 {Show {Median [4.1 7.2 1.7 9.3 4.4 3.2]}}</lang>

Perl

Translation of: Python

<lang perl>sub median

{my @a = sort @_;
 return ($a[$#a/2] + $a[@a/2]) / 2;}</lang>

Perl 6

Works with: Rakudo version #22 "Thousand Oaks"

<lang perl6>sub median {

 my @a = sort @_;
 return (@a[@a.end / 2] + @a[@a / 2]) / 2;

}</lang>

PL/I

<lang PL/I> call sort(A); n = dimension(A,1); if iand(n,1) = 1 then /* an odd number of elements */

  median = A(n/2);

else /* an even number of elements */

  median = (a(n/2) + a(trunc(n/2)+1) )/2;

</lang>

PureBasic

<lang PureBasic> Procedure.d median(Array values.d(1), length.i)

 If length = 0 : ProcedureReturn 0.0 : EndIf
 SortArray(values(), #PB_Sort_Ascending)
 If length % 2
   ProcedureReturn values(length / 2)
 EndIf
 ProcedureReturn 0.5 * (values(length / 2 - 1) + values(length / 2))

EndProcedure

Procedure.i readArray(Array values.d(1))

 Protected length.i, i.i
 Read.i length
 ReDim values(length - 1)
 For i = 0 To length - 1
   Read.d values(i)
 Next
 ProcedureReturn i

EndProcedure

Dim floats.d(0) Restore array1 length.i = readArray(floats()) Debug median(floats(), length) Restore array2 length.i = readArray(floats()) Debug median(floats(), length)

DataSection

 array1:
   Data.i 7
   Data.d 4.1, 5.6, 7.2, 1.7, 9.3, 4.4, 3.2
 array2:
   Data.i 6
   Data.d 4.1, 7.2, 1.7, 9.3, 4.4, 3.2

EndDataSection </lang>

Python

<lang python>def median(aray):

   srtd = sorted(aray)
   alen = len(srtd)
   return 0.5*( srtd[(alen-1)//2] + srtd[alen//2])

a = (4.1, 5.6, 7.2, 1.7, 9.3, 4.4, 3.2) print a, median(a) a = (4.1, 7.2, 1.7, 9.3, 4.4, 3.2) print a, median(a)</lang>

R

Translation of: Octave

<lang R>omedian <- function(v) {

 if ( length(v) < 1 )
   NA
 else {
   sv <- sort(v)
   l <- length(sv)
   if ( l %% 2 == 0 )
     (sv[floor(l/2)+1] + sv[floor(l/2)])/2
   else
     sv[floor(l/2)+1]
 }

}

a <- c(4.1, 5.6, 7.2, 1.7, 9.3, 4.4, 3.2) b <- c(4.1, 7.2, 1.7, 9.3, 4.4, 3.2)

print(median(a)) # 4.4 print(omedian(a)) print(median(b)) # 4.25 print(omedian(b))</lang>


REBOL

<lang rebol> median: func [

   "Returns the midpoint value in a series of numbers; half the values are above, half are below."
   block [any-block!]
   /local len mid

][

   if empty? block [return none]
   block: sort copy block
   len: length? block
   mid: to integer! len / 2
   either odd? len [
       pick block add 1 mid
   ][
       (block/:mid) + (pick block add 1 mid) / 2
   ]

] </lang>

Ruby

<lang ruby>def median(ary)

 return nil if ary.empty?
 mid, rem = ary.length.divmod(2)
 if rem == 0
   ary.sort[mid-1,2].inject(:+) / 2.0
 else
   ary.sort[mid]
 end

end

p median([]) # => nil p median([5,3,4]) # => 4 p median([5,4,2,3]) # => 3.5 p median([3,4,1,-8.4,7.2,4,1,1.2]) # => 2.1</lang>

Alternately: <lang ruby>def median(aray)

   srtd = aray.sort
   alen = srtd.length
   (srtd[(alen-1)/2] + srtd[alen/2]) / 2.0

end</lang>

Scala

Works with: Scala version 2.8

(See the Scala discussion on Mean for more information.)

<lang scala>def median[T](s: Seq[T])(implicit n: Fractional[T]) = {

 import n._
 val (lower, upper) = s.sortWith(_<_).splitAt(s.size / 2)
 if (s.size % 2 == 0) (lower.last + upper.head) / fromInt(2) else upper.head

}</lang>

This isn't really optimal. The methods splitAt and last are O(n/2) on many sequences, and then there's the lower bound imposed by the sort. Finally, we call size two times, and it can be O(n).

Scheme

Translation of: Python

Using Rosetta Code's bubble-sort function <lang Scheme>(define (median l)

 (* (+ (list-ref (bubble-sort l >) (round (/ (- (length l) 1) 2)))
       (list-ref (bubble-sort l >) (round (/ (length l) 2)))) 0.5))</lang>

Using SRFI-95: <lang Scheme>(define (median l)

 (* (+ (list-ref (sort l less?) (round (/ (- (length l) 1) 2)))
       (list-ref (sort l less?) (round (/ (length l) 2)))) 0.5))</lang>

Slate

<lang slate>s@(Sequence traits) median [

 s isEmpty
   ifTrue: [Nil]
   ifFalse:
     [| sorted |
      sorted: s sort.
      sorted length `cache isEven
        ifTrue: [(sorted middle + (sorted at: sorted indexMiddle - 1)) / 2]
        ifFalse: [sorted middle]]

].</lang>

<lang slate>inform: { 4.1 . 5.6 . 7.2 . 1.7 . 9.3 . 4.4 . 3.2 } median. inform: { 4.1 . 7.2 . 1.7 . 9.3 . 4.4 . 3.2 } median.</lang>

Smalltalk

Works with: GNU Smalltalk

<lang smalltalk>OrderedCollection extend [

   median [
     self size = 0
       ifFalse: [ |s l|
         l := self size.
         s := self asSortedCollection.

(l rem: 2) = 0 ifTrue: [ ^ ((s at: (l//2 + 1)) + (s at: (l//2))) / 2 ] ifFalse: [ ^ s at: (l//2 + 1) ] ] ifTrue: [ ^nil ]

   ]

].</lang>

<lang smalltalk>{ 4.1 . 5.6 . 7.2 . 1.7 . 9.3 . 4.4 . 3.2 } asOrderedCollection

  median displayNl.

{ 4.1 . 7.2 . 1.7 . 9.3 . 4.4 . 3.2 } asOrderedCollection

  median displayNl.</lang>

Tcl

<lang tcl>proc median args {

   set list [lsort -real $args]
   set len [llength $list]
   # Odd number of elements
   if {$len & 1} {
       return [lindex $list [expr {($len-1)/2}]]
   }
   # Even number of elements
   set idx2 [expr {$len/2}]
   set idx1 [expr {$idx2-1}]
   return [expr {
       ([lindex $list $idx1] + [lindex $list $idx2])/2.0
   }]

}

puts [median 3.0 4.0 1.0 -8.4 7.2 4.0 1.0 1.2]; # --> 2.1</lang>

TI-89 BASIC

<lang ti89b>median({3, 4, 1, -8.4, 7.2, 4, 1, 1})</lang>

Ursala

the simple way (sort first and then look in the middle) <lang Ursala>#import std

  1. import flo

median = fleq-<; @K30K31X eql?\~&rh div\2.+ plus@lzPrhPX</lang> test program, once with an odd length and once with an even length vector <lang Ursala>#cast %eW

examples =

median~~ (

  <9.3,-2.0,4.0,7.3,8.1,4.1,-6.3,4.2,-1.0,-8.4>,
  <8.3,-3.6,5.7,2.3,9.3,5.4,-2.3,6.3,9.9>)</lang>

output:

(4.050000e+00,5.700000e+00)

Vedit macro language

This is a simple implementation for positive integers using sorting. The data is stored in current edit buffer in ascii representation. The values must be right justified.

The result is returned in text register @10. In case of even number of items, the lower middle value is returned.

<lang vedit>Sort(0, File_Size, NOCOLLATE+NORESTORE) EOF Goto_Line(Cur_Line/2) Reg_Copy(10, 1)</lang>