Sort stability: Difference between revisions

From Rosetta Code
Content added Content deleted
(→‎{{header|Ruby}}: documenting difficulties)
m (→‎{{header|Ruby}}: link to stable merge sort)
Line 57: Line 57:
# => [["US", "Birmingham"], ["UK", "Birmingham"], ["UK", "London"], ["US", "New York"]]</lang>
# => [["US", "Birmingham"], ["UK", "Birmingham"], ["UK", "London"], ["US", "New York"]]</lang>


So, if we want to use Ruby's <code>sort</code> ''and'' we want stability, we might be better off using a custom Schwartzian transform:
So, if we want to use Ruby's <code>sort</code> ''and'' we want stability, we might be better off using [[Merge sort]] or a Schwartzian transform:
<lang ruby>
<lang ruby>ary.each_index \
ary.each_index \
.map {|i| ary[i] + [i]} \
.map {|i| ary[i] + [i]} \
.sort {|a,b| (a[1] <=> b[1]).nonzero? or a[-1] <=> b[-1]} \
.sort {|a,b| (a[1] <=> b[1]).nonzero? or a[-1] <=> b[-1]} \

Revision as of 16:38, 9 June 2009

Task
Sort stability
You are encouraged to solve this task according to the task description, using any language you may know.

When sorting records in a table by a particular column or field, a stable sort will always retain the relative order of records that have the same key.

For example, in this table of countries and cities, a stable sort on the second column, the cities, would keep the US Birmingham above the UK Birmingham. (Although an unstable sort might, in this case, place the US Birmingham above the UK Birmingham, a stable sort routine would guarantee it).

UK  London
US  New York
US  Birmingham
UK  Birmingham

The task is to examine the documentation on any in-built sort routines supplied by a language and indicate if an in-built routine is supplied, and if supplied, whether it is stable or not. (This Wikipedia table shows the stability of some common sort routines).

C++

C++ standard library's std::sort() function is not guaranteed stable. The stable analog of it is the std::stable_sort() function. In addition, std::list's sort() method is guaranteed stable.

Haskell

Haskell's sort and sortBy functions are guaranteed stable.[1]

Java

Java's Collections.sort() and Arrays.sort() methods are guaranteed stable.

OCaml

OCaml's List.sort and Array.sort functions are not guaranteed to be stable. The stable versions are List.stable_sort and Array.stable_sort, respectively.

Perl

The stability of Perl's in-built sort function is version-dependent. If you want to guarantee a stable sort from it, you should use the following sort pragma: <lang perl>use sort 'stable';</lang>

Python

Python's in-built sorted function as well as the sort method of lists are guaranteed stable (since version 2.3). (For even more information on the underlying routine, see [this]).

Ruby

Ruby's built-in sort methods use quicksort which is not stable[1]. <lang ruby>ary = [["UK", "London"], ["US", "New York"], ["US", "Birmingham"], ["UK", "Birmingham"]] ary.sort {|a,b| a[1] <=> b[1]}

  1. => [["UK", "Birmingham"], ["US", "Birmingham"], ["UK", "London"], ["US", "New York"]]</lang>

To code a stable sort[2]: <lang ruby>class Array

 def stable_sort
   n = 0
   c = lambda { |x| n+= 1; [x, n]}
   if block_given?
     sort { |a, b|
       yield(c.call(a), c.call(b))        
     }
   else
     sort_by &c
   end
 end

end</lang>

However, even that is insufficiently general. Consider: <lang ruby>ary.stable_sort {|a,b| a[1] <=> b[1]}

  1. => [["UK", "London"], ["US", "New York"], ["US", "Birmingham"], ["UK", "Birmingham"]]</lang>

Above, we end up sorting by the generated index number because the temporary array has nested the original array elements "x" into subarrays "[x, n]". We are required to know the structure of the temporary array: <lang ruby>ary.stable_sort {|a,b| (a[0][1] <=> b[0][1]).nonzero? or a[1]<=>b[1]}

  1. => [["US", "Birmingham"], ["UK", "Birmingham"], ["UK", "London"], ["US", "New York"]]</lang>

So, if we want to use Ruby's sort and we want stability, we might be better off using Merge sort or a Schwartzian transform: <lang ruby>ary.each_index \

  .map  {|i| ary[i] + [i]} \
  .sort {|a,b| (a[1] <=> b[1]).nonzero? or a[-1] <=> b[-1]} \
  .map  {|x| x.pop; x}
  1. => [["US", "Birmingham"], ["UK", "Birmingham"], ["UK", "London"], ["US", "New York"]]</lang>

Tcl

Tcl's built-in lsort command implements a stable sort. It has been guaranteed to be stable since Tcl 8.0. Internally, it uses the mergesort algorithm.