Algebraic data types

From Rosetta Code
Revision as of 22:10, 11 August 2010 by rosettacode>NevilleDNZ (copy edit - wiki link to wikipedia)
Task
Algebraic data types
You are encouraged to solve this task according to the task description, using any language you may know.

Some languages offer direct support for algebraic data types and pattern matching on them. While this of course can always be simulated with manual tagging and conditionals, it allows for terse code which is easy to read, and can represent the algorithm directly.

As an example, implement insertion in a red-black-tree. A red-black-tree is a binary tree where each internal node has a color attribute red or black. Moreover, no red node can have a red child, and every path from the root to an empty node must contain the same number of black nodes. As a consequence, the tree is balanced, and must be re-balanced after an insertion.

Common Lisp

Common Lisp doesn't come with any pattern-matching solutions on its own, but with the help of its macro facility, it can incorporate features from other languages such as pattern matching. Macros expand into efficient code during compilation time and there isn't much difference if it's included in the core language or not. As has been said, Lisp is a ball of mud and remains one no matter what one throws at it.

This is a straighforward translation of the TCL solution. I don't know red-black-trees myself but I tried mirroring the original program as closely as possible. It uses a pattern-matching library called toadstool.

Library: toadstool

<lang lisp>(mapc #'use-package '(#:toadstool #:toadstool-system)) (defstruct (red-black-tree (:constructor tree (color left val right)))

 color left val right)

(defcomponent tree (operator macro-mixin)) (defexpand tree (color left val right)

 `(class red-black-tree red-black-tree-color ,color
                        red-black-tree-left ,left
                        red-black-tree-val ,val
                        red-black-tree-right ,right))

(pushnew 'tree *used-components*)

(defun balance (color left val right)

 (toad-ecase (color left val right)
   (('black (tree 'red (tree 'red a x b) y c) z d)
    (tree 'red (tree 'black a x b) y
          (tree 'black c z d)))
   (('black (tree 'red a x (tree 'red b y c)) z d)
    (tree 'red (tree 'black a x b) y (tree 'black c z d)))
   (('black a x (tree 'red (tree 'red b y c) z d))
    (tree 'red (tree 'black a x b) y (tree 'black c z d)))
   (('black a x (tree 'red b y (tree 'red c z d)))
    (tree 'red (tree 'black a x b) y (tree 'black c z d)))
   ((color a x b)
    (tree color a x b))))

(defun %insert (x s)

 (toad-ecase1 s
   (nil (tree 'red nil x nil))
   ((tree color a y b)
    (cond ((< x y)
           (balance color (%insert x a) y b))
          ((> x y)
           (balance color a y (%insert x b)))
          (t s)))))

(defun insert (x s)

 (toad-ecase1 (%insert x s)
   ((tree t a y b) (tree 'black a y b))))</lang>

E

Translation of: Haskell

In E, a pattern can be used almost anywhere a variable name can. Additionally, there are two operators used for pattern matching idioms: =~ (returns success as a boolean, somewhat like Perl's =~), and switch (matches multiple patterns, like Haskell's case).

Both of those operators are defined in terms of the basic bind/match operation: def pattern exit failure_handler := specimen

def balance(tree) {
  return if (
    tree =~ term`tree(black, tree(red, tree(red, @a, @x, @b), @y, @c), @z, @d)` ||
    tree =~ term`tree(black, tree(red, @a, @x, tree(red, @b, @y, @c)), @z, @d)` ||
    tree =~ term`tree(black, @a, @x, tree(red, tree(red, @b, @y, @c), @z, @d))` ||
    tree =~ term`tree(black, @a, @x, tree(red, @b, @y, tree(red, @c, @z, @d)))`
  ) {
    term`tree(red, tree(black, $a, $x, $b), $y, tree(black, $c, $z, $d))`
  } else { tree }
}
def insert(elem, tree) {
  def ins(tree) {
    return switch (tree) {
      match term`empty` { term`tree(red, empty, $elem, empty)` }
      match term`tree(@color, @a, @y, @b)` {
        if (elem < y) {
          balance(term`tree($color, ${ins(a)}, $y, $b)`)
        } else if (elem > y) {
          balance(term`tree($color, $a, $y, ${ins(b)})`)
        } else {
          tree
        }
      }
    }
  }
  def term`tree(@_, @a, @y, @b)` := ins(tree)
  return term`tree(black, $a, $y, $b)`
}

This code was tested by filling a tree with random values; you can try this at the E REPL:

? var tree := term`empty`
> for _ in 1..20 {
>   tree := insert(entropy.nextInt(100), tree)
> }
> tree

Haskell

<lang haskell>data Color = R | B data Tree a = E | T Color (Tree a) a (Tree a)

balance :: Color -> Tree a -> a -> Tree a -> Tree a balance B (T R (T R a x b) y c ) z d = T R (T B a x b) y (T B c z d) balance B (T R a x (T R b y c)) z d = T R (T B a x b) y (T B c z d) balance B a x (T R (T R b y c) z d ) = T R (T B a x b) y (T B c z d) balance B a x (T R b y (T R c z d)) = T R (T B a x b) y (T B c z d) balance col a x b = T col a x b

insert :: Ord a => a -> Tree a -> Tree a insert x s = T B a y b where

 ins E          =  T R E x E
 ins s@(T col a y b) 
   | x < y      =  balance col (ins a) y b
   | x > y      =  balance col a y (ins b)
   | otherwise  =  s
 T _ a y b = ins s</lang>

OCaml

<lang ocaml> type color = R | B type 'a tree = E | T of color * 'a tree * 'a * 'a tree

(** val balance : color * 'a tree * 'a * 'a tree -> 'a tree *) let balance = function

 | B, T (R, T (R,a,x,b), y, c), z, d
 | B, T (R, a, x, T (R,b,y,c)), z, d
 | B, a, x, T (R, T (R,b,y,c), z, d)
 | B, a, x, T (R, b, y, T (R,c,z,d)) -> T (R, T (B,a,x,b), y, T (B,c,z,d))
 | col, a, x, b                      -> T (col, a, x, b) 

(** val insert : 'a -> 'a tree -> 'a tree *) let insert x s =

 let rec ins = function
   | E                  -> T (R,E,x,E)
   | T (col,a,y,b) as s ->

if x < y then balance (col, ins a, y, b) else if x > y then balance (col, a, y, ins b) else s

 in let T (_,a,y,b) = ins s 
 in T (B,a,y,b)

</lang>


Oz

Translation of: Haskell

Unlike Haskell, Oz does not support multiple equations per function. So we use an explicit case-statement. To match multiple variables at once, we create temporary tuples with "#".

<lang oz>fun {Balance Col A X B}

  case Col#A#X#B
  of b#t(r t(r A X B) Y C         )#Z#D                            then t(r t(b A X B) Y t(b C Z D))
  [] b#t(r A          X t(r B Y C))#Z#D                            then t(r t(b A X B) Y t(b C Z D))
  [] b#A                           #X#t(r t(r B Y C) Z D)          then t(r t(b A X B) Y t(b C Z D))
  [] b#A                           #X#t(r B          Y t(r C Z D)) then t(r t(b A X B) Y t(b C Z D))
  else t(Col A X B)
  end

end

fun {Insert X S}

  fun {Ins S}
     case S of e then t(r e X e)
     [] t(Col A Y B) then

if X < Y then {Balance Col {Ins A} Y B} elseif X > Y then {Balance Col A Y {Ins B}} else S end

     end
  end
  t(_ A Y B) = {Ins S}

in

  t(b A Y B)

end</lang>

Prolog

color(r).
color(b).

tree(_,e).
tree(P,t(C,L,X,R)) :- color(C), tree(P,L), call(P,X), tree(P,R).

bal(b, t(r,t(r,A,X,B),Y,C), Z, D, t(r,t(b,A,X,B),Y,t(b,C,Z,D))).
bal(b, t(r,A,X,t(r,B,Y,C)), Z, D, t(r,t(b,A,X,B),Y,t(b,C,Z,D))).
bal(b, A, X, t(r,t(r,B,Y,C),Z,D), t(r,t(b,A,X,B),Y,t(b,C,Z,D))).
bal(b, A, X, t(r,B,Y,t(r,C,Z,D)), t(r,t(b,A,X,B),Y,t(b,C,Z,D))).

balance(C,A,X,B,S) :- ( bal(C,A,X,B,T) -> S = T ; S = t(C,A,X,B) ).

ins(X,e,t(r,e,X,e)).
ins(X,t(C,A,Y,B),R) :- ( X < Y -> ins(X,A,Ao), balance(C,Ao,Y,B,R)
                       ; X > Y -> ins(X,B,Bo), balance(C,A,Y,Bo,R)
                       ; X = Y -> R = t(C,A,Y,B)
                       ).

insert(X,S,t(b,A,Y,B)) :- ins(X,S,t(_,A,Y,B)).

Scala

Translation of: Haskell

Algebraic data types are implemented in Scala through the combination of a number of different features, to ensure principles of Object Oriented Programming.

The main type is usually defined as a sealed abstract class, which ensures it can't be instantiated, and guarantees that it can't be expanded outside the file it was defined at. This last feature is used so the compiler can verify that the pattern matching is complete, or warn when there are missing cases. It can be ommitted if preferred.

Each subtype is defined either as a case object, for non-paremeterized types, or case class, for parameterized types, all extending the main type. The case keyword is not required, but, when used, it provides a number of default methods which ensure they can be used without any further definitions.

The most important of those default methods for the purpose of algebraic data types is the extractor, a method called either unapply or unapplySeq, and which returns an Option containing the deconstructed parameters, or None if the passed object can't be deconstructed by this method. Scala uses the extractors to implement pattern matching without exposing the internal representation of the data.

This specific task is made much harder than necessary because Scala doesn't have a variant ordering class. Given that limitation, one has to either give up on a singleton object representing the empty tree, or give up on parameterizing the tree itself.

The solution below, uses the latter approach. The algebraic data types are members of a RedBlackTree class, which, itself, receives a type parameter for the keys of the tree, and an implicit parameter for an Ordering for that type. To use the tree it is thus necessary to instantiate an object of type RedBlackTree, and then reference the members of that object.

<lang scala>class RedBlackTree[A](implicit ord: Ordering[A]) {

 sealed abstract class Color
 case object R extends Color
 case object B extends Color
 
 sealed abstract class Tree {
   def insert(x: A): Tree = ins(x) match {
     case T(_, a, y, b) => T(B, a, y, b)
     case E             => E
   }
   def ins(x: A): Tree
 }
 
 case object E extends Tree {
   override def ins(x: A): Tree = T(R, E, x, E) 
 }
 
 case class T(c: Color, left: Tree, a: A, right: Tree) extends Tree {
   private def balance: Tree = (c, left, a, right) match {
     case (B, T(R, T(R, a, x, b), y, c),             z, d                                    ) => T(R, T(B, a, x, b), y, T(B, c, z, d))
     case (B, T(R, a,             x, T(R, b, y, c)), z, d                                    ) => T(R, T(B, a, x, b), y, T(B, c, z, d))
     case (B, a,                                     x, T(R, T(R, b, y, c), z, d            )) => T(R, T(B, a, x, b), y, T(B, c, z, d))
     case (B, a,                                     x, T(R, b,             y, T(R, c, z, d))) => T(R, T(B, a, x, b), y, T(B, c, z, d))
     case _ => this
   }
   
   override def ins(x: A): Tree = ord.compare(x, a) match {
     case -1 => T(c, left ins x, a, right      ).balance
     case  1 => T(c, left,       a, right ins x).balance
     case  0 => this
   }
 }

}</lang>

Usage example:

scala> val rbt = new RedBlackTree[Int]
rbt: RedBlackTree[Int] = RedBlackTree@17dfcf1

scala> import rbt._
import rbt._

scala> List.range(1, 17).foldLeft(E: Tree)(_ insert _)
res5: rbt.Tree = T(B,T(B,T(B,T(B,E,1,E),2,T(B,E,3,E)),4,T(B,T(B,E,5,E),6,T(B,E,7,E))),8,T(B,T(B,T(B,E,9,E),10,T(B,E,11,E
)),12,T(B,T(B,E,13,E),14,T(B,E,15,T(R,E,16,E)))))

Standard ML

<lang sml> datatype color = R | B datatype 'a tree = E | T of color * 'a tree * 'a * 'a tree

(** val balance = fn : color * 'a tree * 'a * 'a tree -> 'a tree *) fun balance (B, T (R, T (R,a,x,b), y, c), z, d) = T (R, T (B,a,x,b), y, T (B,c,z,d))

 | balance (B, T (R, a, x, T (R,b,y,c)), z, d) = T (R, T (B,a,x,b), y, T (B,c,z,d))
 | balance (B, a, x, T (R, T (R,b,y,c), z, d)) = T (R, T (B,a,x,b), y, T (B,c,z,d))
 | balance (B, a, x, T (R, b, y, T (R,c,z,d))) = T (R, T (B,a,x,b), y, T (B,c,z,d))
 | balance (col, a, x, b)                      = T (col, a, x, b) 

(** val insert = fn : int -> int tree -> int tree *) fun insert x s = let

 fun ins E                    = T (R,E,x,E)
   | ins (s as T (col,a,y,b)) =

if x < y then balance (col, ins a, y, b) else if x > y then balance (col, a, y, ins b) else s

 val T (_,a,y,b) = ins s 

in

 T (B,a,y,b)

end </lang>

Tcl

Translation of: Haskell

Tcl doesn't have algebraic types built-in, but they can be simulated using tagged lists, and a custom pattern matching control structure can be built: <lang tcl># From http://wiki.tcl.tk/9547 package require Tcl 8.5 package provide datatype 0.1

namespace eval ::datatype {

   namespace export define match matches
   namespace ensemble create
   # Datatype definitions
   proc define {type = args} { 
       set ns [uplevel 1 { namespace current }]
       foreach cons [split [join $args] |] {
           set name [lindex $cons 0]
           set args [lrange $cons 1 end]
           proc $ns\::$name $args [format {
               lreplace [info level 0] 0 0 %s
           } [list $name]]
       }
       return $type
   }
   # Pattern matching
   # matches pattern value envVar --
   #   Returns 1 if value matches pattern, else 0
   #   Binds match variables in envVar
   proc matches {pattern value envVar} {
       upvar 1 $envVar env
       if {[var? $pattern]} { return [bind env $pattern $value] }
       if {[llength $pattern] != [llength $value]} { return 0 }
       if {[lindex $pattern 0] ne [lindex $value 0]} { return 0 }
       foreach pat [lrange $pattern 1 end] val [lrange $value 1 end] {
           if {![matches $pat $val env]} { return 0 }
       }
       return 1
   }
   # A variable starts with lower-case letter or _. _ is a wildcard.
   proc var? term { string match {[a-z_]*} $term }
   proc bind {envVar var value} {
       upvar 1 $envVar env
       if {![info exists env]} { set env [dict create] }
       if {$var eq "_"} { return 1 }
       dict set env $var $value
       return 1
   }
   proc match args {
       #puts "MATCH: $args"
       set values [lrange $args 0 end-1]
       set choices [lindex $args end]
       append choices \n [list return -code error -level 2 "no match for $values"]
       set f [list values $choices [namespace current]]
       lassign [apply $f $values] env body
       #puts "RESULT: $env -> $body"
       dict for {k v} $env { upvar 1 $k var; set var $v }
       catch { uplevel 1 $body } msg opts
       dict incr opts -level
       return -options $opts $msg
   }
   proc case args {
       upvar 1 values values
       set patterns [lrange $args 0 end-2]
       set body [lindex $args end]
       set env [dict create]
       if {[llength $patterns] != [llength $values]} { return }
       foreach pattern $patterns value $values {
           if {![matches $pattern $value env]} { return }
       }
       return -code return [list $env $body]
   }
   proc default body { return -code return [list {} $body] }

} </lang> We can then code our solution similar to Haskell:

<lang tcl>datatype define Color = R | B datatype define Tree = E | T color left val right

  1. balance :: Color -> Tree a -> a -> Tree a -> Tree a

proc balance {color left val right} {

   datatype match $color $left $val $right {
       case B [T R [T R a x b] y c] z d -> { T R [T B $a $x $b] $y [T B $c $z $d] }
       case B [T R a x [T R b y c]] z d -> { T R [T B $a $x $b] $y [T B $c $z $d] }
       case B a x [T R [T R b y c] z d] -> { T R [T B $a $x $b] $y [T B $c $z $d] }
       case B a x [T R b y [T R c z d]] -> { T R [T B $a $x $b] $y [T B $c $z $d] }
       case col a x b                   -> { T $col $a $x $b }
   }

}

  1. insert :: Ord a => a -> Tree a -> Tree a

proc insert {x s} {

   datatype match [ins $x $s] {
       case [T _ a y b]  -> { T B $a $y $b }
   }

}

  1. ins :: Ord a => a -> Tree a -> Tree a

proc ins {x s} {

   datatype match $s {
       case E               -> { T R E $x E }
       case [T col a y b]   -> {
           if {$x < $y} { return [balance $col [ins $x $a] $y $b] }
           if {$x > $y} { return [balance $col $a $y [ins $x $b]] }
           return $s
       }
   }

}</lang>