Huffman coding: Difference between revisions

m
(stripping out dead code)
m (→‎{{header|Wren}}: Minor tidy)
(114 intermediate revisions by 55 users not shown)
Line 1:
{{task|Compression}}
 
Huffman encoding is a way to assign binary codes to symbols that reduces the overall number of bits used to encode a typical string of those symbols.
 
Line 19 ⟶ 20:
# The remaining node is the root node and the tree is complete.
 
<br>
Traverse the constructed binary tree from root to leaves assigning and accumulating a '0' for one branch and a '1' for the other at each node. The accumulated zeros and ones at each leaf constitute a Huffman encoding for those symbols and weights:
 
 
'''Using the characters and their frequency from the string ''"this is an example for huffman encoding"'', create a program to generate a Huffman encoding for each character as a table.'''
;Task:
Using the characters and their frequency from the string:
::::: &nbsp; ''' '' this is an example for huffman encoding '' '''
create a program to generate a Huffman encoding for each character as a table.
<br><br>
 
=={{header|11l}}==
{{trans|Python}}
 
<syntaxhighlight lang="11l">T Element
Int weight
[(Char, String)] symbols
 
F (weight, symbols)
.weight = weight
.symbols = symbols
 
F <(other)
R (.weight, .symbols) < (other.weight, other.symbols)
 
F encode(symb2freq)
V heap = symb2freq.map((sym, wt) -> Element(wt, [(sym, ‘’)]))
minheap:heapify(&heap)
 
L heap.len > 1
V lo = minheap:pop(&heap)
V hi = minheap:pop(&heap)
 
L(&sym) lo.symbols
sym[1] = ‘0’sym[1]
 
L(&sym) hi.symbols
sym[1] = ‘1’sym[1]
 
minheap:push(&heap, Element(lo.weight + hi.weight, lo.symbols [+] hi.symbols))
 
R sorted(minheap:pop(&heap).symbols, key' p -> (p[1].len, p))
 
V txt = ‘this is an example for huffman encoding’
V symb2freq = DefaultDict[Char, Int]()
L(ch) txt
symb2freq[ch]++
 
V huff = encode(symb2freq)
print("Symbol\tWeight\tHuffman Code")
L(p) huff
print("#.\t#.\t#.".format(p[0], symb2freq[p[0]], p[1]))</syntaxhighlight>
 
{{out}}
<pre>
Symbol Weight Huffman Code
6 101
n 4 010
a 3 1001
e 3 1100
f 3 1101
h 2 0001
i 3 1110
m 2 0010
o 2 0011
s 2 0111
g 1 00000
l 1 00001
p 1 01100
r 1 01101
t 1 10000
u 1 10001
x 1 11110
c 1 111110
d 1 111111
</pre>
 
=={{header|Ada}}==
Line 27 ⟶ 100:
 
huffman.ads:
<langsyntaxhighlight Adalang="ada">with Ada.Containers.Indefinite_Ordered_Maps;
with Ada.Containers.Ordered_Maps;
with Ada.Finalization;
Line 109 ⟶ 182:
-- free memory after finalization
overriding procedure Finalize (Object : in out Huffman_Tree);
end Huffman;</langsyntaxhighlight>
 
huffman.adb:
<langsyntaxhighlight Adalang="ada">with Ada.Text_IO;
with Ada.Unchecked_Deallocation;
with Ada.Containers.Vectors;
Line 355 ⟶ 428:
end loop;
end Dump_Encoding;
end Huffman;</langsyntaxhighlight>
 
example main.adb:
<langsyntaxhighlight Adalang="ada">with Ada.Text_IO;
with Huffman;
procedure Main is
Line 406 ⟶ 479:
(Char_Natural_Huffman_Tree.Decode (Tree => Tree, Code => Code));
end;
end Main;</langsyntaxhighlight>
 
{{out}}
Output:
<pre> = 101
a = 1001
Line 432 ⟶ 505:
 
=={{header|BBC BASIC}}==
{{incorrect|BBC BASIC|Huffman code can not contain another code as a prefix}}
{{works with|BBC BASIC for Windows}}
<langsyntaxhighlight lang="bbcbasic"> INSTALL @lib$+"SORTSALIB"
SortUp% = FN_sortSAinit(0,0) : REM Ascending
SortDn% = FN_sortSAinit(1,0) : REM Descending
Line 485 ⟶ 559:
ENDIF
NEXT
END</langsyntaxhighlight>
{{out}}
'''Output:'''
<pre>
101
Line 507 ⟶ 581:
g 11011
t 11010
</pre>
 
=={{header|Bracmat}}==
<syntaxhighlight lang="bracmat">( "this is an example for huffman encoding":?S
& 0:?chars
& 0:?p
& ( @( !S
: ?
( [!p %?char [?p ?
& !char+!chars:?chars
& ~
)
)
|
)
& 0:?prioritized
& whl
' ( !chars:?n*%@?w+?chars
& (!n.!w)+!prioritized:?prioritized
)
& whl
' ( !prioritized:(?p.?x)+(?q.?y)+?nprioritized
& (!p+!q.(!p.0,!x)+(!q.1,!y))+!nprioritized:?prioritized
)
& 0:?L
& ( walk
= bits tree bit subtree
. !arg:(?bits.?tree)
& whl
' ( !tree:(?p.?bit,?subtree)+?tree
& ( !subtree:@
& (!subtree.str$(!bits !bit))+!L:?L
| walk$(!bits !bit.!subtree)
)
)
)
& !prioritized:(?.?prioritized)
& walk$(.!prioritized)
& lst$L
& :?encoded
& 0:?p
& ( @( !S
: ?
( [!p %?char [?p ?
& !L:?+(!char.?code)+?
& !encoded !code:?encoded
& ~
)
)
| out$(str$!encoded)
)
& ( decode
= char bits
. !L
: ?+(?char.?bits&@(!arg:!bits ?arg))+?
& !char decode$!arg
| !arg
)
& out$("decoded:" str$(decode$(str$!encoded)));</syntaxhighlight>
{{out}}
<pre>(L=
(" ".101)
+ (a.1001)
+ (c.01010)
+ (d.01011)
+ (e.1100)
+ (f.1101)
+ (g.01100)
+ (h.11111)
+ (i.1110)
+ (l.01101)
+ (m.0010)
+ (n.000)
+ (o.0011)
+ (p.01110)
+ (r.01111)
+ (s.0100)
+ (t.10000)
+ (u.10001)
+ (x.11110));
1000011111111001001011110010010110010001011100111101001001001110011011100101110100110111110111111100011101110100101001000101110000001010001101011111000001100
decoded: this is an example for huffman encoding
</pre>
 
Line 513 ⟶ 669:
This code lacks a lot of needed checkings, especially for memory allocation.
 
<langsyntaxhighlight lang="c">#include <stdio.h>
#include <stdlib.h>
#include <string.h>
Line 686 ⟶ 842:
 
return 0;
}</langsyntaxhighlight>
===Alternative===
Using a simple heap-based priority queue. Heap is an array, while ndoe tree is done by binary links.
<langsyntaxhighlight lang="c">#include <stdio.h>
#include <string.h>
 
Line 797 ⟶ 953:
{
int i;
const char *str = "this is an example for huffman encoding", buf[1024];
char buf[1024];
 
init(str);
Line 810 ⟶ 967:
 
return 0;
}</syntaxhighlight>
}</lang>output<lang>' ': 000
{{out}}
<pre>' ': 000
'a': 1000
'c': 01101
Line 830 ⟶ 989:
'x': 01001
encoded: 0111111010011110000000111100000100010100001010100110001111100110100010101000001011101001000011010111000100010111110001010000101101011011110011000011101010000
decoded: this is an example for huffman encoding</langpre>
 
=={{header|C sharp}}==
<langsyntaxhighlight lang="csharp">using System;
using System.Collections.Generic;
 
Line 1,145 ⟶ 1,305:
}
}
}</langsyntaxhighlight>
[[File:CSharpHuffman.jpg]]
 
=={{header|C++}}==
Line 1,152 ⟶ 1,312:
This code builds a tree to generate huffman codes, then prints the codes.
 
<langsyntaxhighlight lang="cpp">#include <iostream>
#include <queue>
#include <map>
Line 1,266 ⟶ 1,426:
}
return 0;
}</langsyntaxhighlight>
 
Sample output:
 
{{out}}
<pre> 110
a 1001
Line 1,291 ⟶ 1,450:
 
=={{header|Clojure}}==
(Updated to 1.6 & includes pretty-printing). Uses Java PriorityQueue
<lang clojure>
<syntaxhighlight lang="clojure">(require '[clojure.pprint :refer :all])
(use 'clojure.contrib.seq-utils)
 
(defn probs [itemss]
(let [freqs (frequencies itemss) sum (reduceapply + (vals freqs))]
(into {} (map (fn [[k v]] [k (/ v sum)]) freqs))))
 
Line 1,307 ⟶ 1,466:
(defn huffman-tree [pq]
(while (> (.size pq) 1)
(let [a (.poll pq) b (.poll pq) new-node { :priority (+ (:priority a) (:priority b)), :left a, :right b }]
new-node {:priority (+ (:priority a) (:priority b)) :left a :right b}]
(.add pq new-node)))
(.poll pq))
 
(defn symbol-map
([t] (into {} (symbol-map t [])""))
([{:keys [symbol, priority left, right] :as t} code]
(if symbol [[{:symbol symbol :weight priority :code code]}]
(concat (symbol-map left (conjstr code \0))
(symbol-map right (conjstr code \1))))))
 
(defn huffman-encode [items]
Line 1,322 ⟶ 1,482:
 
(defn display-huffman-encode [s]
(->> s huffman-encode (sort-by :weight >) print-table))
(println "SYMBOL\tWEIGHT\tHUFFMAN CODE")
(let [probs (probs (seq s))]
(doseq [[char code] (huffman-encode (seq s))]
(printf "%s:\t\t%s\t\t%s\n" char (probs char) (apply str code)))))
 
(display-huffman-encode "this is an example for huffman encoding")</syntaxhighlight>
</lang>
 
{{out}}
Program Output:
<pre>| :symbol | :weight | :code |
<pre>
|---------+---------+--------|
SYMBOL WEIGHT HUFFMAN CODE
: | | 2/13 | 111 |
| n | 4/39 | 011 |
a: 1/13 1001
| a | 1/13 | 1001 |
c: 1/39 00110
| e | 1/13 | 1011 |
d: 1/39 00000
| i | 1/13 | 1100 |
e: 1/13 1011
| f: | 1/13 | 1101 |
| h | 2/39 | 0001 |
g: 1/39 101010
| s | 2/39 | 0010 |
h: 2/39 0001
| m | 2/39 | 0100 |
i: 1/13 1100
| o | 2/39 | 0101 |
l: 1/39 10001
| d | 1/39 | 00000 |
m: 2/39 0100
| t | 1/39 | 00001 |
n: 4/39 011
| c | 1/39 | 00110 |
o: 2/39 0101
| x | 1/39 | 00111 |
p: 1/39 101011
| u | 1/39 | 10000 |
r: 1/39 10100
| l | 1/39 | 10001 |
s: 2/39 0010
| r | 1/39 | 10100 |
t: 1/39 00001
| g | 1/39 | 101010 |
u: 1/39 10000
| p | 1/39 | 101011 |</pre>
x: 1/39 00111
===Alternate Version===
</pre>
Uses c.d.priority-map. Creates a more shallow tree but appears to meet the requirements.
<syntaxhighlight lang="clojure">(require '[clojure.data.priority-map :refer [priority-map-keyfn-by]])
(require '[clojure.pprint :refer [print-table]])
 
(defn init-pq [s]
(let [c (count s)]
(->> s frequencies
(map (fn [[k v]] [k {:sym k :weight (/ v c)}]))
(into (priority-map-keyfn-by :weight <)))))
 
(defn huffman-tree [pq]
(letfn [(build-step
[pq]
(let [a (second (peek pq)) b (second (peek (pop pq)))
nn {:sym (str (:sym a) (:sym b))
:weight (+ (:weight a) (:weight b))
:left a :right b}]
(assoc (pop (pop pq)) (:sym nn) nn)))]
(->> (iterate build-step pq)
(drop-while #(> (count %) 1))
first vals first)))
 
(defn symbol-map [m]
(letfn [(sym-step
[{:keys [sym weight left right] :as m} code]
(cond (and left right) #(vector (trampoline sym-step left (str code \0))
(trampoline sym-step right (str code \1)))
left #(sym-step left (str code \0))
right #(sym-step right (str code \1))
:else {:sym sym :weight weight :code code}))]
(trampoline sym-step m "")))
 
(defn huffman-encode [s]
(->> s init-pq huffman-tree symbol-map flatten))
 
(defn display-huffman-encode [s]
(->> s huffman-encode (sort-by :weight >) print-table))
 
(display-huffman-encode "this is an example for huffman encoding")</syntaxhighlight>
{{out}}
<pre>| :sym | :weight | :code |
|------+---------+-------|
| | 2/13 | 101 |
| n | 4/39 | 010 |
| a | 1/13 | 1001 |
| i | 1/13 | 1101 |
| e | 1/13 | 1110 |
| f | 1/13 | 1111 |
| m | 2/39 | 0000 |
| o | 2/39 | 0001 |
| s | 2/39 | 0010 |
| h | 2/39 | 11001 |
| g | 1/39 | 00110 |
| l | 1/39 | 00111 |
| t | 1/39 | 01100 |
| u | 1/39 | 01101 |
| c | 1/39 | 01110 |
| d | 1/39 | 01111 |
| p | 1/39 | 10000 |
| r | 1/39 | 10001 |
| x | 1/39 | 11000 |</pre>
 
=={{header|CoffeeScript}}==
<langsyntaxhighlight lang="coffeescript">
huffman_encoding_table = (counts) ->
# counts is a hash where keys are characters and
Line 1,441 ⟶ 1,658:
console.log "#{rpad(code, 5)}: #{c} (#{counts[c]})"
console.log()
</langsyntaxhighlight>
 
{{out}}
output
<pre>
 
<lang>
> coffee huffman.coffee
---- this is an example for huffman encoding
Line 1,480 ⟶ 1,696:
101 : c (4)
11 : d (8)
</pre>
 
</lang>
 
 
=={{header|Common Lisp}}==
 
This implementation uses a tree built of <code>huffman-node</code>s, and a hash table mapping from elements of the input sequence to <code>huffman-node</code>s. The priority queue is implemented as a sorted list. (For a more efficient implementation of a priority queue, see the [[Heapsort]] task.)
and a hash table mapping from elements of the input sequence to <code>huffman-node</code>s.
The priority queue is implemented as a sorted list.
(For a more efficient implementation of a priority queue, see the [[Heapsort]] task.)
 
<langsyntaxhighlight lang="lisp">(defstruct huffman-node
(weight 0 :type number)
(element nil :type t)
Line 1,547 ⟶ 1,763:
(huffman-node-element node)
(huffman-node-weight node)
(huffman-node-encoding node))))</langsyntaxhighlight>
 
Example:
Line 1,575 ⟶ 1,791:
 
=={{header|D}}==
<langsyntaxhighlight lang="d">import std.stdio, std.algorithm, std.typecons, std.container, std.array;
 
auto encode(Talias eq, R)(Group!("a == b"eq, T[]R) sf) /*pure nothrow @safe*/ {
auto heap = sf.map!(s => tuple(s[1], [tuple(s[0], "")]))
.array.heapify!q{b < a};
Line 1,584 ⟶ 1,800:
auto lo = heap.front; heap.removeFront;
auto hi = heap.front; heap.removeFront;
foreach (ref pair; lo[1].each!((ref pair) => pair[1] = '0' ~ pair[1]);
foreach (ref pair; hi[1].each!((ref pair) => pair[1] = '1' ~ pair[1]);
heap.insert(tuple(lo[0] + hi[0], lo[1] ~ hi[1]));
}
return heap.front[1].schwartzSort!(p =>q{ tuple(pa[1].length, pa[0])) };
}
 
void main() /*@safe*/ {
autoimmutable s = "this is an example for huffman encoding"d;
foreach (const p; s.dup.sort().release.group.encode)
writefln("'%s' %s", p.tupleof[]);
}</langsyntaxhighlight>
{{out}}
<pre>' ' 101
Line 1,616 ⟶ 1,832:
'c' 111110
'd' 111111</pre>
 
=={{header|Eiffel}}==
Adapted C# solution.
<syntaxhighlight lang="eiffel">
class HUFFMAN_NODE[T -> COMPARABLE]
inherit
COMPARABLE
redefine
three_way_comparison
end
create
leaf_node, inner_node
feature {NONE}
leaf_node (a_probability: REAL_64; a_value: T)
do
probability := a_probability
value := a_value
is_leaf := true
 
left := void
right := void
parent := void
end
 
inner_node (a_left, a_right: HUFFMAN_NODE[T])
do
left := a_left
right := a_right
 
a_left.parent := Current
a_right.parent := Current
a_left.is_zero := true
a_right.is_zero := false
 
probability := a_left.probability + a_right.probability
is_leaf := false
end
 
feature
probability: REAL_64
value: detachable T
 
 
is_leaf: BOOLEAN
is_zero: BOOLEAN assign set_is_zero
 
set_is_zero (a_value: BOOLEAN)
do
is_zero := a_value
end
 
left: detachable HUFFMAN_NODE[T]
right: detachable HUFFMAN_NODE[T]
parent: detachable HUFFMAN_NODE[T] assign set_parent
 
set_parent (a_parent: detachable HUFFMAN_NODE[T])
do
parent := a_parent
end
 
is_root: BOOLEAN
do
Result := parent = void
end
 
bit_value: INTEGER
do
if is_zero then
Result := 0
else
Result := 1
end
end
feature -- comparable implementation
is_less alias "<" (other: like Current): BOOLEAN
do
Result := three_way_comparison (other) = -1
end
 
three_way_comparison (other: like Current): INTEGER
do
Result := -probability.three_way_comparison (other.probability)
end
end
 
class HUFFMAN
create
make
feature {NONE}
make(a_string: STRING)
require
non_empty_string: a_string.count > 0
local
l_queue: HEAP_PRIORITY_QUEUE[HUFFMAN_NODE[CHARACTER]]
l_counts: HASH_TABLE[INTEGER, CHARACTER]
l_node: HUFFMAN_NODE[CHARACTER]
l_left, l_right: HUFFMAN_NODE[CHARACTER]
do
create l_queue.make (a_string.count)
create l_counts.make (10)
 
across a_string as char
loop
if not l_counts.has (char.item) then
l_counts.put (0, char.item)
end
l_counts.replace (l_counts.at (char.item) + 1, char.item)
end
 
create leaf_dictionary.make(l_counts.count)
 
across l_counts as kv
loop
create l_node.leaf_node ((kv.item * 1.0) / a_string.count, kv.key)
l_queue.put (l_node)
leaf_dictionary.put (l_node, kv.key)
end
 
from
until
l_queue.count <= 1
loop
l_left := l_queue.item
l_queue.remove
l_right := l_queue.item
l_queue.remove
 
create l_node.inner_node (l_left, l_right)
l_queue.put (l_node)
end
 
root := l_queue.item
root.is_zero := false
end
feature
root: HUFFMAN_NODE[CHARACTER]
leaf_dictionary: HASH_TABLE[HUFFMAN_NODE[CHARACTER], CHARACTER]
 
encode(a_value: CHARACTER): STRING
require
encodable: leaf_dictionary.has (a_value)
local
l_node: HUFFMAN_NODE[CHARACTER]
do
Result := ""
if attached leaf_dictionary.item (a_value) as attached_node then
l_node := attached_node
from
 
until
l_node.is_root
loop
Result.append_integer (l_node.bit_value)
if attached l_node.parent as parent then
l_node := parent
end
end
 
Result.mirror
end
end
end
 
class
APPLICATION
create
make
 
feature {NONE}
make -- entry point
local
l_str: STRING
huff: HUFFMAN
chars: BINARY_SEARCH_TREE_SET[CHARACTER]
do
l_str := "this is an example for huffman encoding"
 
create huff.make (l_str)
 
create chars.make
chars.fill (l_str)
 
from
chars.start
until
chars.off
loop
print (chars.item.out + ": " + huff.encode (chars.item) + "%N")
chars.forth
end
end
end
</syntaxhighlight>
 
{{out}}
<pre> : 101
a: 1001
c: 01110
d: 01111
e: 1111
f: 1100
g: 01001
h: 11101
i: 1101
l: 10001
m: 0010
n: 000
o: 0011
p: 10000
r: 11100
s: 0110
t: 01000
u: 01011
x: 01010
</pre>
 
=={{header|Erlang}}==
The main part of the code used here is extracted from [https://gist.github.com/rmies/2828351 Michel Rijnders' GitHubGist]. See also [http://codingpraxis.com/erlang/2012/10/23/huffman-coding-in-erlang.html his blog], for a complete description of the original module.
<syntaxhighlight lang="erlang">-module(huffman).
-export([encode/1, decode/2, main/0]).
 
encode(Text) ->
Tree = tree(freq_table(Text)),
Dict = dict:from_list(codewords(Tree)),
Code = << <<(dict:fetch(Char, Dict))/bitstring>> || Char <- Text >>,
{Code, Tree, Dict}.
decode(Code, Tree) ->
decode(Code, Tree, Tree, []).
 
main() ->
{Code, Tree, Dict} = encode("this is an example for huffman encoding"),
[begin
io:format("~s: ",[[Key]]),
print_bits(Value)
end || {Key, Value} <- lists:sort(dict:to_list(Dict))],
io:format("encoded: "),
print_bits(Code),
io:format("decoded: "),
io:format("~s\n",[decode(Code, Tree)]).
decode(<<>>, _, _, Result) ->
lists:reverse(Result);
decode(<<0:1, Rest/bits>>, Tree, {L = {_, _}, _R}, Result) ->
decode(<<Rest/bits>>, Tree, L, Result);
decode(<<0:1, Rest/bits>>, Tree, {L, _R}, Result) ->
decode(<<Rest/bits>>, Tree, Tree, [L | Result]);
decode(<<1:1, Rest/bits>>, Tree, {_L, R = {_, _}}, Result) ->
decode(<<Rest/bits>>, Tree, R, Result);
decode(<<1:1, Rest/bits>>, Tree, {_L, R}, Result) ->
decode(<<Rest/bits>>, Tree, Tree, [R | Result]).
codewords({L, R}) ->
codewords(L, <<0:1>>) ++ codewords(R, <<1:1>>).
codewords({L, R}, <<Bits/bits>>) ->
codewords(L, <<Bits/bits, 0:1>>) ++ codewords(R, <<Bits/bits, 1:1>>);
codewords(Symbol, <<Bits/bitstring>>) ->
[{Symbol, Bits}].
tree([{N, _} | []]) ->
N;
tree(Ns) ->
[{N1, C1}, {N2, C2} | Rest] = lists:keysort(2, Ns),
tree([{{N1, N2}, C1 + C2} | Rest]).
freq_table(Text) ->
freq_table(lists:sort(Text), []).
freq_table([], Acc) ->
Acc;
freq_table([S | Rest], Acc) ->
{Block, MoreBlocks} = lists:splitwith(fun (X) -> X == S end, Rest),
freq_table(MoreBlocks, [{S, 1 + length(Block)} | Acc]).
 
print_bits(<<>>) ->
io:format("\n");
print_bits(<<Bit:1, Rest/bitstring>>) ->
io:format("~w", [Bit]),
print_bits(Rest).</syntaxhighlight>
{{out}}
<pre> : 111
a: 1011
c: 10010
d: 100111
e: 1010
f: 1101
g: 100110
h: 1000
i: 1100
l: 00001
m: 0101
n: 001
o: 0100
p: 00000
r: 00011
s: 0111
t: 00010
u: 01101
x: 01100
encoded: 0001010001100011111111000111111101100111110100110010110101000000000110101111101010000011111100001101110111010101101100111110100011001001001001111100001100110
decoded: this is an example for huffman encoding</pre>
 
=={{header|F_Sharp|F#}}==
{{Trans|OCaml}}
<langsyntaxhighlight lang="fsharp">type 'a HuffmanTree =
| Leaf of int * 'a
| Node of int * 'a HuffmanTree * 'a HuffmanTree
Line 1,653 ⟶ 2,172:
let tree = charFreqs |> Map.toList |> buildTree
printfn "Symbol\tWeight\tHuffman code";
printTree ([], tree)</langsyntaxhighlight>
{{out}}
Output:
<pre>Symbol Weight Huffman code
p 1 00000
Line 1,675 ⟶ 2,194:
e 3 1101
6 111</pre>
 
=={{header|Factor}}==
<syntaxhighlight lang="factor">
USING: kernel sequences combinators accessors assocs math hashtables math.order
sorting.slots classes formatting prettyprint ;
 
IN: huffman
 
! -------------------------------------
! CLASSES -----------------------------
! -------------------------------------
 
TUPLE: huffman-node
weight element encoding left right ;
 
! For nodes
: <huffman-tnode> ( left right -- huffman )
huffman-node new [ left<< ] [ swap >>right ] bi ;
 
! For leafs
: <huffman-node> ( element -- huffman )
1 swap f f f huffman-node boa ;
 
 
! --------------------------------------
! INITIAL HASHTABLE --------------------
! --------------------------------------
 
<PRIVATE
 
! Increment node if it already exists
! Else make it and add it to the hash-table
: huffman-gen ( element nodes -- )
2dup at
[ [ [ 1 + ] change-weight ] change-at ]
[ [ dup <huffman-node> swap ] dip set-at ] if ;
 
! Curry node-hash. Then each over the seq
! to get the weighted values
: (huffman) ( nodes seq -- nodes )
dup [ [ huffman-gen ] curry each ] dip ;
 
! ---------------------------------------
! TREE GENERATION -----------------------
! ---------------------------------------
 
: (huffman-weight) ( node1 node2 -- weight )
[ weight>> ] dup bi* + ;
 
! Combine two nodes into the children of a parent
! node which has a weight equal to their collective
! weight
: (huffman-combine) ( node1 node2 -- node3 )
[ (huffman-weight) ]
[ <huffman-tnode> ] 2bi
swap >>weight ;
 
! Generate a tree by combining nodes
! in the priority queue until we're
! left with the root node
: (huffman-tree) ( nodes -- tree )
dup rest empty?
[ first ] [
{ { weight>> <=> } } sort-by
[ rest rest ] [ first ]
[ second ] tri
(huffman-combine) prefix
(huffman-tree)
] if ; recursive
 
! --------------------------------------
! ENCODING -----------------------------
! --------------------------------------
 
: (huffman-leaf?) ( node -- bool )
[ left>> huffman-node instance? ]
[ right>> huffman-node instance? ] bi and not ;
 
: (huffman-leaf) ( leaf bit -- )
swap encoding<< ;
 
DEFER: (huffman-encoding)
 
! Recursively walk the nodes left and right
: (huffman-node) ( bit nodes -- )
[ 0 suffix ] [ 1 suffix ] bi
[ [ left>> ] [ right>> ] bi ] 2dip
[ swap ] dip
[ (huffman-encoding) ] 2bi@ ;
 
: (huffman-encoding) ( bit nodes -- )
over (huffman-leaf?)
[ (huffman-leaf) ]
[ (huffman-node) ] if ;
 
PRIVATE>
 
! -------------------------------
! USER WORDS --------------------
! -------------------------------
 
: huffman-print ( nodes -- )
"Element" "Weight" "Code" "\n%10s\t%10s\t%6s\n" printf
{ { weight>> >=< } } sort-by
[ [ encoding>> ] [ element>> ] [ weight>> ] tri
"%8c\t%7d\t\t" printf pprint "\n" printf ] each ;
 
: huffman ( sequence -- nodes )
H{ } clone (huffman) values
[ (huffman-tree) { } (huffman-encoding) ] keep ;
 
! ---------------------------------
! USAGE ---------------------------
! ---------------------------------
 
! { 1 2 3 4 } huffman huffman-print
! "this is an example of a huffman tree" huffman huffman-print
 
! Element Weight Code
! 7 { 0 0 0 }
! a 4 { 1 1 1 }
! e 4 { 1 1 0 }
! f 3 { 0 0 1 0 }
! h 2 { 1 0 1 0 }
! i 2 { 0 1 0 1 }
! m 2 { 0 1 0 0 }
! n 2 { 0 1 1 1 }
! s 2 { 0 1 1 0 }
! t 2 { 0 0 1 1 }
! l 1 { 1 0 1 1 1 }
! o 1 { 1 0 1 1 0 }
! p 1 { 1 0 0 0 1 }
! r 1 { 1 0 0 0 0 }
! u 1 { 1 0 0 1 1 }
! x 1 { 1 0 0 1 0 }
</syntaxhighlight>
 
=={{header|Fantom}}==
 
<langsyntaxhighlight lang="fantom">
class Node
{
Line 1,776 ⟶ 2,431:
}
}
</syntaxhighlight>
</lang>
 
{{out}}
Output:
<pre>
From "this is an example for huffman encoding"
Line 1,802 ⟶ 2,457:
x -> 01011
</pre>
 
=={{header|Fortran}}==
 
<syntaxhighlight lang="fortran">! output:
! d-> 00000, t-> 00001, h-> 0001, s-> 0010,
! c-> 00110, x-> 00111, m-> 0100, o-> 0101,
! n-> 011, u-> 10000, l-> 10001, a-> 1001,
! r-> 10100, g-> 101010, p-> 101011,
! e-> 1011, i-> 1100, f-> 1101, -> 111
!
! 00001|0001|1100|0010|111|1100|0010|111|1001|011|
! 111|1011|00111|1001|0100|101011|10001|1011|111|
! 1101|0101|10100|111|0001|10000|1101|1101|0100|
! 1001|011|111|1011|011|00110|0101|00000|1100|011|101010|
!
module huffman
implicit none
type node
character (len=1 ), allocatable :: sym(:)
character (len=10), allocatable :: code(:)
integer :: freq
contains
procedure :: show => show_node
end type
 
type queue
type(node), allocatable :: buf(:)
integer :: n = 0
contains
procedure :: extractmin
procedure :: append
procedure :: siftdown
end type
 
contains
 
subroutine siftdown(this, a)
class (queue) :: this
integer :: a, parent, child
associate (x => this%buf)
parent = a
do while(parent*2 <= this%n)
child = parent*2
if (child + 1 <= this%n) then
if (x(child+1)%freq < x(child)%freq ) then
child = child +1
end if
end if
if (x(parent)%freq > x(child)%freq) then
x([child, parent]) = x([parent, child])
parent = child
else
exit
end if
end do
end associate
end subroutine
 
function extractmin(this) result (res)
class(queue) :: this
type(node) :: res
res = this%buf(1)
this%buf(1) = this%buf(this%n)
this%n = this%n - 1
call this%siftdown(1)
end function
 
subroutine append(this, x)
class(queue), intent(inout) :: this
type(node) :: x
type(node), allocatable :: tmp(:)
integer :: i
this%n = this%n +1
if (.not.allocated(this%buf)) allocate(this%buf(1))
if (size(this%buf)<this%n) then
allocate(tmp(2*size(this%buf)))
tmp(1:this%n-1) = this%buf
call move_alloc(tmp, this%buf)
end if
this%buf(this%n) = x
i = this%n
do
i = i / 2
if (i==0) exit
call this%siftdown(i)
end do
end subroutine
 
function join(a, b) result(c)
type(node) :: a, b, c
integer :: i, n, n1
n1 = size(a%sym)
n = n1 + size(b%sym)
c%freq = a%freq + b%freq
allocate (c%sym(n), c%code(n))
do i = 1, n1
c%sym(i) = a%sym(i)
c%code(i) = "0" // trim(a%code(i))
end do
do i = 1, size(b%sym)
c%sym(i+n1) = b%sym(i)
c%code(i+n1) = "1" // trim(b%code(i))
end do
end function
 
subroutine show_node(this)
class(node) :: this
integer :: i
write(*, "(*(g0,'-> ',g0,:,', '))", advance="no") &
(this%sym(i), trim(this%code(i)), i=1,size(this%sym))
print *
end subroutine
 
function create(letter, freq) result (this)
character :: letter
integer :: freq
type(node) :: this
allocate(this%sym(1), this%code(1))
this%sym(1) = letter ; this%code(1) = ""
this%freq = freq
end function
end module
 
program main
use huffman
character (len=*), parameter :: txt = &
"this is an example for huffman encoding"
integer :: i, freq(0:255) = 0
type(queue) :: Q
type(node) :: x
do i = 1, len(txt)
freq(ichar(txt(i:i))) = freq(ichar(txt(i:i))) + 1
end do
do i = 0, 255
if (freq(i)>0) then
call Q%append(create(char(i), freq(i)))
end if
end do
do i = 1, Q%n-1
call Q%append(join(Q%extractmin(),Q%extractmin()))
end do
x = Q%extractmin()
call x%show()
do i = 1, len(txt)
do k = 1, size(x%sym)
if (x%sym(k)==txt(i:i)) exit
end do
write (*, "(a,'|')", advance="no") trim(x%code(k))
end do
print *
end program
</syntaxhighlight>
 
=={{header|FreeBASIC}}==
<syntaxhighlight lang="freebasic">type block
freq as uinteger
chars as string
end type
 
type code
char as string*1
code as string
end type
 
sub bubble( lst() as block, n_l as uinteger )
for j as integer = n_l-1 to 0 step -1
if j>0 andalso lst(j).freq > lst(j-1).freq then
swap lst(j), lst(j-1)
endif
next j
end sub
 
dim as string Sample = "this is an example for huffman encoding"
redim as block hufflist(0)
hufflist(0).freq = 1 : hufflist(0).chars = mid(Sample,1,1)
dim as boolean newchar
dim as string*1 currchar
dim as uinteger n_h = 1, n_c
 
'read characters in one-by-one and simultaneously bubblesort them
for i as uinteger = 2 to len(Sample)
currchar = mid(Sample,i,1)
newchar = true
for j as uinteger = 0 to n_h-1
if mid(Sample,i,1) = hufflist(j).chars then
hufflist(j).freq += 1
newchar = false
end if
if j>0 andalso hufflist(j).freq > hufflist(j-1).freq then
swap hufflist(j), hufflist(j-1)
endif
next j
if newchar then
redim preserve hufflist(0 to n_h)
hufflist(n_h).chars = currchar
hufflist(n_h).freq = 1
n_h+=1
end if
next i
'one final pass of bubblesort may be necessary
bubble hufflist(), n_h
 
'initialise huffman code
redim as code codelist(0 to n_h-1)
for i as uinteger = 0 to n_h-1
codelist(i).char = hufflist(i).chars
codelist(i).code = ""
next i
n_c = n_h
 
do
'characters in the least common block get "0" appended
for i as uinteger = 1 to len(hufflist(n_h-1).chars)
for j as uinteger = 0 to n_c-1
if codelist(j).char = mid(hufflist(n_h-1).chars,i,1) then
codelist(j).code = "0" + codelist(j).code
end if
next j
next i
'characters in the second-least common block get "1" appended
for i as uinteger = 1 to len(hufflist(n_h-2).chars)
for j as uinteger = 0 to n_c-1
if codelist(j).char = mid(hufflist(n_h-2).chars,i,1) then
codelist(j).code = "1" + codelist(j).code
end if
next j
next i
'combine the two least frequent blocks
hufflist(n_h-2).chars = hufflist(n_h-2).chars + hufflist(n_h-1).chars
hufflist(n_h-2).freq = hufflist(n_h-2).freq + hufflist(n_h-1).freq
redim preserve hufflist(0 to n_h-2)
n_h -= 1
'move the new combined block to its proper place in the list
bubble hufflist(), n_h
loop until n_h = 1
 
for i as uinteger = 0 to n_c - 1
print "'"+codelist(i).char+"'", codelist(i).code
next i
</syntaxhighlight>
{{out}}
<pre>' ' 111
'n' 001
'a' 1011
'e' 1010
'f' 1101
'i' 1100
's' 1000
'h' 0101
'm' 0100
'o' 0111
't' 10010
'x' 100111
'p' 100110
'l' 00001
'r' 00000
'u' 00011
'c' 00010
'd' 01101
'g' 01100</pre>
 
=={{header|Go}}==
{{trans|Java}}
<langsyntaxhighlight lang="go">package main
 
import (
Line 1,901 ⟶ 2,816:
fmt.Println("SYMBOL\tWEIGHT\tHUFFMAN CODE")
printCodes(tree, []byte{})
}</langsyntaxhighlight>
{{out}}
Example output:
<pre>
SYMBOL WEIGHT HUFFMAN CODE
Line 1,926 ⟶ 2,841:
</pre>
{{trans|Python}}
<langsyntaxhighlight lang="go">package main
 
import (
Line 1,990 ⟶ 2,905:
fmt.Printf(" %c %d %s\n", c.sym, sym2freq[c.sym], c.code)
}
}</langsyntaxhighlight>
 
=={{header|Groovy}}==
 
Implemented and tested with Groovy 2.3.
 
<syntaxhighlight lang="groovy">
import groovy.transform.*
 
@Canonical
@Sortable(includes = ['freq', 'letter'])
class Node {
String letter
int freq
Node left
Node right
boolean isLeaf() { left == null && right == null }
}
 
Map correspondance(Node n, Map corresp = [:], String prefix = '') {
if (n.isLeaf()) {
corresp[n.letter] = prefix ?: '0'
} else {
correspondance(n.left, corresp, prefix + '0')
correspondance(n.right, corresp, prefix + '1')
}
return corresp
}
 
Map huffmanCode(String message) {
def queue = message.toList().countBy { it } // char frequencies
.collect { String letter, int freq -> // transformed into tree nodes
new Node(letter, freq)
} as TreeSet // put in a queue that maintains ordering
while(queue.size() > 1) {
def (nodeLeft, nodeRight) = [queue.pollFirst(), queue.pollFirst()]
queue << new Node(
freq: nodeLeft.freq + nodeRight.freq,
letter: nodeLeft.letter + nodeRight.letter,
left: nodeLeft, right: nodeRight
)
}
return correspondance(queue.pollFirst())
}
 
String encode(CharSequence msg, Map codeTable) {
msg.collect { codeTable[it] }.join()
}
 
String decode(String codedMsg, Map codeTable, String decoded = '') {
def pair = codeTable.find { k, v -> codedMsg.startsWith(v) }
pair ? pair.key + decode(codedMsg.substring(pair.value.size()), codeTable)
: decoded
}
</syntaxhighlight>
Usage:
<syntaxhighlight lang="groovy">
def message = "this is an example for huffman encoding"
 
def codeTable = huffmanCode(message)
codeTable.each { k, v -> println "$k: $v" }
 
def encoded = encode(message, codeTable)
println encoded
 
def decoded = decode(encoded, codeTable)
println decoded
</syntaxhighlight>
{{out}}
<pre>
g: 00000
l: 00001
h: 0001
m: 0010
o: 0011
n: 010
p: 01100
r: 01101
s: 0111
t: 10000
u: 10001
a: 1001
: 101
e: 1100
f: 1101
i: 1110
x: 11110
c: 111110
d: 111111
1000000011110011110111100111101100101010111001111010010010011000000111001011101001101101101000110001110111010010100101010111000101111100011111111111001000000
this is an example for huffman encoding
 
</pre>
 
=={{header|Haskell}}==
Credits go to [http://www.haskell.org/haskellwiki/99_questions/46_to_50#Problem_50 huffman] where you'll also find a non-tree solution. Uses sorted list as a priority queue.
<langsyntaxhighlight lang="haskell">import Data.List (group, insertBy, sort, sortBy)
import Control.Arrow ((&&&), second)
import Data.Ord (comparing)
 
data HTree a = Leaf a | Branch (HTree a) (HTree a)
= Leaf a
deriving (Show, Eq, Ord)
| Branch (HTree a)
(HTree a)
deriving (Show, Eq, Ord)
 
test :: String -> IO ()
test =
test s = mapM_ (\(a,b)-> putStrLn ('\'' : a : "\' : " ++ b))
mapM_ (\(a, b) -> putStrLn ('\'' : .a serialize: .("' huffmanTree: ." freq++ $b))) s.
serialize . huffmanTree . freq
 
serialize :: HTree a -> [(a, String)]
serialize (Branch l r) = map (second('0':)) (serialize l) ++ map (second('1':)) (serialize r)
(second ('0' :) <$> serialize l) ++ (second ('1' :) <$> serialize r)
serialize (Leaf x) = [(x, "")]
serialize (Leaf x) = [(x, "")]
 
huffmanTree :: (Ord w, Num w) => [(w, a)] -> HTree a
:: (Ord w, Num w)
huffmanTree = snd . head . until (null.tail) hstep
=> [(w, a)] -> HTree a
. sortBy (comparing fst) . map (second Leaf)
huffmanTree =
snd .
head . until (null . tail) hstep . sortBy (comparing fst) . fmap (second Leaf)
 
hstep
hstep :: (Ord a, Num a) => [(a, HTree b)] -> [(a, HTree b)]
:: (Ord a, Num a)
hstep ((w1,t1):(w2,t2):wts) = insertBy (comparing fst) (w1 + w2, Branch t1 t2) wts
=> [(a, HTree b)] -> [(a, HTree b)]
hstep ((w1, t1):(w2, t2):wts) =
insertBy (comparing fst) (w1 + w2, Branch t1 t2) wts
 
freq
freq :: Ord a => [a] -> [(Int, a)]
:: Ord a
freq = map (length &&& head) . group . sort</lang>
=> [a] -> [(Int, a)]
Example output:
freq = fmap (length &&& head) . group . sort
<lang haskell>*Main> test "this is an example for huffman encoding"
 
'p' : 00000
main :: IO ()
main = test "this is an example for huffman encoding"</syntaxhighlight>
{{out}}
<syntaxhighlight lang="haskell">'p' : 00000
'r' : 00001
'g' : 00010
Line 2,038 ⟶ 3,063:
'a' : 1100
'e' : 1101
' ' : 111</langsyntaxhighlight>
 
===Using <code>Set</code> as a priority queue===
(might be worth it for bigger alphabets):
<langsyntaxhighlight lang="haskell">import qualified Data.Set as S
 
htree :: (Ord t, Num t, Ord a) => S.Set (t, HTree a) -> HTree a
Line 2,053 ⟶ 3,078:
huffmanTree :: (Ord w, Num w, Ord a) => [(w, a)] -> HTree a
huffmanTree = htree . S.fromList . map (second Leaf)</langsyntaxhighlight>
 
===A non-tree version===
This produces the output required without building the Huffman tree at all, by building all the trace strings directly while reducing the histogram:
by building all the trace strings directly while reducing the histogram:
<lang haskell>import Data.List (sortBy, insertBy, sort, group)
<syntaxhighlight lang="haskell">import Data.List (sortBy, insertBy, sort, group)
import Control.Arrow (second, (&&&))
import Data.Ord (comparing)
Line 2,073 ⟶ 3,099:
 
main = do
test "this is an example for huffman encoding"</langsyntaxhighlight>
 
=={{header|Icon}} and {{header|Unicon}}==
<langsyntaxhighlight Iconlang="icon">record huffnode(l,r,n,c) # internal and leaf nodes
record huffcode(c,n,b,i) # encoding table char, freq, bitstring, bits (int)
 
Line 2,135 ⟶ 3,161:
}
return T
end</langsyntaxhighlight>
 
{{out}}
Sample output:
<pre>Input String : "this is an example for huffman encoding"
char freq encoding
Line 2,164 ⟶ 3,190:
and implements the algorithm given in the problem description.
The program produces Huffman codes based on each line of input.
<langsyntaxhighlight Uniconlang="unicon">import Collections
 
procedure main(A)
Line 2,193 ⟶ 3,219:
else codeMap[node[1]] := prefix
return codeMap
end</langsyntaxhighlight>
 
A sample run:
Line 2,239 ⟶ 3,265:
'''Solution''' (drawn from [[J:Essays/Huffman_Coding|the J wiki]]):
 
<langsyntaxhighlight lang="j">hc=: 4 : 0
if. 1=#x do. y
else. ((i{x),+/j{x) hc (i{y),<j{y [ i=. (i.#x) -. j=. 2{./:x end.
Line 2,252 ⟶ 3,278:
t=. 0 {:: x hc w NB. minimal weight binary tree
((< S: 0 t) i. w) { <@(1&=)@; S: 1 {:: t
)</langsyntaxhighlight>
 
'''Example''':<langsyntaxhighlight lang="j"> ;"1":L:0(#/.~ (],.(<' '),.hcodes) ,&.>@~.)'this is an example for huffman encoding'
t 0 1 0 1 0
h 1 1 1 1 1
Line 2,273 ⟶ 3,299:
c 1 0 0 0 0
d 1 0 0 0 1
g 1 1 1 1 0</langsyntaxhighlight>
 
=={{header|Java}}==
This implementation creates an actual tree structure, and then traverses the tree to recover the code.
<langsyntaxhighlight lang="java">import java.util.*;
 
abstract class HuffmanTree implements Comparable<HuffmanTree> {
Line 2,371 ⟶ 3,397:
printCodes(tree, new StringBuffer());
}
}</langsyntaxhighlight>
 
{{out}}
Example output:
<pre>SYMBOL WEIGHT HUFFMAN CODE
d 1 00000
Line 2,394 ⟶ 3,420:
f 3 1101
6 111</pre>
 
 
 
 
=={{header|JavaScript}}==
Line 2,406 ⟶ 3,429:
 
The Huffman encoder
<langsyntaxhighlight lang="javascript">function HuffmanEncoding(str) {
this.str = str;
 
Line 2,468 ⟶ 3,491:
}
return decoded;
}</langsyntaxhighlight>
 
And, using the Huffman encoder
<langsyntaxhighlight lang="javascript">var s = "this is an example for huffman encoding";
print(s);
 
Line 2,483 ⟶ 3,506:
print(t);
 
print("is decoded string same as original? " + (s==t));</langsyntaxhighlight>
 
{{out}}
output
<pre style='width: full; overflow: scroll'>this is an example for huffman encoding
'n': 000
Line 2,509 ⟶ 3,532:
this is an example for huffman encoding
is decoded string same as original? true</pre>
 
{{trans|C}}
<syntaxhighlight lang = "javascript">
class node{
constructor(freq, char, left, right){
this.left = left;
this.right = right;
this.freq = freq;
this.c = char;
}
};
 
nodes = [];
code = {};
 
function new_node(left, right){
return new node(left.freq + right.freq, -1, left, right);;
};
 
function qinsert(node){
nodes.push(node);
nodes.sort(compareFunction);
};
 
function qremove(){
return nodes.pop();
};
 
function compareFunction(a, b){
return b.freq - a.freq;
};
 
function build_code(node, codeString, length){
if (node.c != -1){
code[node.c] = codeString;
return;
};
/* Left Branch */
leftCodeString = codeString + "0";
build_code(node.left, leftCodeString, length + 1);
/* Right Branch */
rightCodeString = codeString + "1";
build_code(node.right, rightCodeString, length + 1);
};
 
function init(string){
var i;
var freq = [];
var codeString = "";
for (var i = 0; i < string.length; i++){
if (isNaN(freq[string.charCodeAt(i)])){
freq[string.charCodeAt(i)] = 1;
} else {
freq[string.charCodeAt(i)] ++;
};
};
for (var i = 0; i < freq.length; i++){
if (freq[i] > 0){
qinsert(new node(freq[i], i, null, null));
};
};
while (nodes.length > 1){
qinsert(new_node(qremove(), qremove()));
};
build_code(nodes[0], codeString, 0);
};
 
function encode(string){
output = "";
for (var i = 0; i < string.length; i ++){
output += code[string.charCodeAt(i)];
};
return output;
};
 
function decode(input){
output = "";
node = nodes[0];
for (var i = 0; i < input.length; i++){
if (input[i] == "0"){
node = node.left;
} else {
node = node.right;
};
if (node.c != -1){
output += String.fromCharCode(node.c);
node = nodes[0];
};
};
return output
};
 
 
string = "this is an example of huffman encoding";
console.log("initial string: " + string);
init(string);
for (var i = 0; i < Object.keys(code).length; i++){
if (isNaN(code[Object.keys(code)[i]])){
} else {
console.log("'" + String.fromCharCode(Object.keys(code)[i]) + "'" + ": " + code[Object.keys(code)[i]]);
};
};
 
huffman = encode(string);
console.log("encoded: " + huffman + "\n");
output = decode(huffman);
console.log("decoded: " + output);
</syntaxhighlight>
 
<pre style='width: full; overflow: scroll'>
initial string: this is an example of huffman encoding
' ': 111
'a': 1011
'c': 00101
'd': 00100
'e': 1010
'f': 1101
'g': 00111
'h': 0101
'i': 1100
'l': 00110
'm': 0100
'n': 100
'o': 0111
'p': 00001
's': 0110
't': 00000
'u': 00011
'x': 00010
encoded: 000000101110001101111100011011110111001111010000101011010000001001101010111011111011110101000111101110101001011100111101010000101011100100110010000111
decoded: this is an example of huffman encoding
</pre>
 
=={{header|Julia}}==
<syntaxhighlight lang="julia">
abstract type HuffmanTree end
 
struct HuffmanLeaf <: HuffmanTree
ch::Char
freq::Int
end
 
struct HuffmanNode <: HuffmanTree
freq::Int
left::HuffmanTree
right::HuffmanTree
end
 
function makefreqdict(s::String)
d = Dict{Char, Int}()
for c in s
if !haskey(d, c)
d[c] = 1
else
d[c] += 1
end
end
d
end
 
function huffmantree(ftable::Dict)
trees::Vector{HuffmanTree} = [HuffmanLeaf(ch, fq) for (ch, fq) in ftable]
while length(trees) > 1
sort!(trees, lt = (x, y) -> x.freq < y.freq, rev = true)
least = pop!(trees)
nextleast = pop!(trees)
push!(trees, HuffmanNode(least.freq + nextleast.freq, least, nextleast))
end
trees[1]
end
 
printencoding(lf::HuffmanLeaf, code) = println(lf.ch == ' ' ? "space" : lf.ch, "\t", lf.freq, "\t", code)
 
function printencoding(nd::HuffmanNode, code)
code *= '0'
printencoding(nd.left, code)
code = code[1:end-1]
 
code *= '1'
printencoding(nd.right, code)
code = code[1:end-1]
end
 
const msg = "this is an example for huffman encoding"
 
println("Char\tFreq\tHuffman code")
 
printencoding(huffmantree(makefreqdict(msg)), "")
</syntaxhighlight>{{output}}<pre>
Char Freq Huffman code
p 1 00000
c 1 00001
g 1 00010
x 1 00011
n 4 001
s 2 0100
h 2 0101
u 1 01100
l 1 01101
m 2 0111
o 2 1000
d 1 10010
r 1 100110
t 1 100111
e 3 1010
f 3 1011
a 3 1100
i 3 1101
space 6 111
</pre>
 
=={{header|Kotlin}}==
{{trans|Java}}
This implementation creates an actual tree structure, and then traverses the tree to recover the code.
<syntaxhighlight lang="kotlin">import java.util.*
 
abstract class HuffmanTree(var freq: Int) : Comparable<HuffmanTree> {
override fun compareTo(other: HuffmanTree) = freq - other.freq
}
 
class HuffmanLeaf(freq: Int, var value: Char) : HuffmanTree(freq)
 
class HuffmanNode(var left: HuffmanTree, var right: HuffmanTree) : HuffmanTree(left.freq + right.freq)
 
fun buildTree(charFreqs: IntArray) : HuffmanTree {
val trees = PriorityQueue<HuffmanTree>()
 
charFreqs.forEachIndexed { index, freq ->
if(freq > 0) trees.offer(HuffmanLeaf(freq, index.toChar()))
}
 
assert(trees.size > 0)
while (trees.size > 1) {
val a = trees.poll()
val b = trees.poll()
trees.offer(HuffmanNode(a, b))
}
 
return trees.poll()
}
 
fun printCodes(tree: HuffmanTree, prefix: StringBuffer) {
when(tree) {
is HuffmanLeaf -> println("${tree.value}\t${tree.freq}\t$prefix")
is HuffmanNode -> {
//traverse left
prefix.append('0')
printCodes(tree.left, prefix)
prefix.deleteCharAt(prefix.lastIndex)
//traverse right
prefix.append('1')
printCodes(tree.right, prefix)
prefix.deleteCharAt(prefix.lastIndex)
}
}
}
 
fun main(args: Array<String>) {
val test = "this is an example for huffman encoding"
 
val maxIndex = test.max()!!.toInt() + 1
val freqs = IntArray(maxIndex) //256 enough for latin ASCII table, but dynamic size is more fun
test.forEach { freqs[it.toInt()] += 1 }
 
val tree = buildTree(freqs)
println("SYMBOL\tWEIGHT\tHUFFMAN CODE")
printCodes(tree, StringBuffer())
}</syntaxhighlight>
 
{{out}}
<pre>SYMBOL WEIGHT HUFFMAN CODE
d 1 00000
t 1 00001
h 2 0001
s 2 0010
c 1 00110
x 1 00111
m 2 0100
o 2 0101
n 4 011
u 1 10000
l 1 10001
a 3 1001
r 1 10100
g 1 101010
p 1 101011
e 3 1011
i 3 1100
f 3 1101
6 111</pre>
 
=={{header|Lua}}==
This implementation proceeds in three steps: determine word frequencies,
construct the Huffman tree, and finally fold the tree into the codes
while outputting them.
<syntaxhighlight lang="lua">local build_freqtable = function (data)
local freq = { }
 
for i = 1, #data do
local cur = string.sub (data, i, i)
local count = freq [cur] or 0
freq [cur] = count + 1
end
 
local nodes = { }
for w, f in next, freq do
nodes [#nodes + 1] = { word = w, freq = f }
end
 
table.sort (nodes, function (a, b) return a.freq > b.freq end) --- reverse order!
 
return nodes
end
 
local build_hufftree = function (nodes)
while true do
local n = #nodes
local left = nodes [n]
nodes [n] = nil
 
local right = nodes [n - 1]
nodes [n - 1] = nil
 
local new = { freq = left.freq + right.freq, left = left, right = right }
 
if n == 2 then return new end
 
--- insert new node at correct priority
local prio = 1
while prio < #nodes and nodes [prio].freq > new.freq do
prio = prio + 1
end
table.insert (nodes, prio, new)
end
end
 
local print_huffcodes do
local rec_build_huffcodes
rec_build_huffcodes = function (node, bits, acc)
if node.word == nil then
rec_build_huffcodes (node.left, bits .. "0", acc)
rec_build_huffcodes (node.right, bits .. "1", acc)
return acc
else --- leaf
acc [#acc + 1] = { node.freq, node.word, bits }
end
return acc
end
 
print_huffcodes = function (root)
local codes = rec_build_huffcodes (root, "", { })
table.sort (codes, function (a, b) return a [1] < b [1] end)
print ("frequency\tword\thuffman code")
for i = 1, #codes do
print (string.format ("%9d\t‘%s’\t“%s”", table.unpack (codes [i])))
end
end
end
 
 
local huffcode = function (data)
local nodes = build_freqtable (data)
local huff = build_hufftree (nodes)
print_huffcodes (huff)
return 0
end
 
return huffcode "this is an example for huffman encoding"
 
</syntaxhighlight>
<pre>
frequency word huffman code
1 ‘g’ “01111”
1 ‘p’ “01011”
1 ‘d’ “01100”
1 ‘c’ “01101”
1 ‘t’ “01010”
1 ‘r’ “10000”
1 ‘u’ “11110”
1 ‘x’ “10001”
1 ‘l’ “01110”
2 ‘o’ “11111”
2 ‘m’ “0011”
2 ‘h’ “0010”
2 ‘s’ “0100”
3 ‘i’ “1101”
3 ‘f’ “1110”
3 ‘a’ “1100”
3 ‘e’ “1001”
4 ‘n’ “000”
6 ‘ ’ “101”
</pre>
 
=={{header|M2000 Interpreter}}==
<syntaxhighlight lang="m2000 interpreter">
Module Huffman {
comp=lambda (a, b) ->{
=array(a, 0)<array(b, 0)
}
module InsertPQ (a, n, &comp) {
if len(a)=0 then stack a {data n} : exit
if comp(n, stackitem(a)) then stack a {push n} : exit
stack a {
push n
t=2: b=len(a)
m=b
While t<=b {
t1=m
m=(b+t) div 2
if m=0 then m=t1 : exit
If comp(stackitem(m),n) then t=m+1: continue
b=m-1
m=b
}
if m>1 then shiftback m
}
}
a$="this is an example for huffman encoding"
inventory queue freq
For i=1 to len(a$) {
b$=mid$(a$,i,1)
if exist(freq, b$) then Return freq, b$:=freq(b$)+1 : continue
append freq, b$:=1
}
sort ascending freq
b=stack
K=each(freq)
LenA=len(a$)
While k {
InsertPQ b, (Round(Eval(k)/lenA, 4), eval$(k, k^)), &comp
}
While len(b)>1 {
Stack b {
Read m1, m2
InsertPQ b, (Array(m1)+Array(m2), (m1, m2) ), &comp
}
}
Print "Size of stack object (has only Root):"; len(b)
Print "Root probability:";Round(Array(Stackitem(b)), 3)
inventory encode, decode
 
Traverse(stackitem(b), "")
message$=""
For i=1 to len(a$)
message$+=encode$(mid$(a$, i, 1))
Next i
 
Print message$
j=1
check$=""
For i=1 to len(a$)
d=each(encode)
While d {
code$=eval$(d)
if mid$(message$, j, len(code$))=code$ then {
check$+=decode$(code$)
Print decode$(code$); : j+=len(code$)
}
}
Next i
Print
Print len(message$);" bits ", if$(a$=check$->"Encoding/decoding worked", "Encoding/Decoding failed")
Sub Traverse(a, a$)
local b=array(a,1)
if type$(b)="mArray" Else {
Print @(10); quote$(array$(a, 1));" "; a$,@(20),array(a)
Append decode, a$ :=array$(a, 1)
Append encode, array$(a, 1):=a$
Exit Sub
}
traverse(array(b), a$+"0")
traverse(array(b,1), a$+"1")
End Sub
}
Huffman
</syntaxhighlight>
 
{{out}}
<pre >
"p" 00000 0,0256
"l" 00001 0,0256
"t" 00010 0,0256
"r" 00011 0,0256
"x" 00100 0,0256
"u" 00101 0,0256
"s" 0011 0,0513
"o" 0100 0,0513
"m" 0101 0,0513
"n" 011 0,1026
"h" 1000 0,0513
"c" 10010 0,0256
"g" 100110 0,0256
"d" 100111 0,0256
"e" 1010 0,0769
"a" 1011 0,0769
"i" 1100 0,0769
"f" 1101 0,0769
" " 111 0,1538
0001010001100001111111000011111101101111110100010010110101000000000110101111101010000011111100000101110111010101101101111110100111001001001001111100011100110
this is an example for huffman encoding
157 bits Encoding/decoding worked
 
</pre >
 
=={{header|Mathematica}} / {{header|Wolfram Language}}==
 
<syntaxhighlight lang="mathematica">huffman[s_String] := huffman[Characters[s]];
huffman[l_List] := Module[{merge, structure, rules},
(*merge front two branches. list is assumed to be sorted*)
merge[k_] := Replace[k, {{a_, aC_}, {b_, bC_}, rest___} :> {{{a, b}, aC + bC}, rest}];
structure = FixedPoint[
Composition[merge, SortBy[#, Last] &],
Tally[l]][[1, 1]];
rules = (# -> Flatten[Position[structure, #] - 1]) & /@ DeleteDuplicates[l];
 
{Flatten[l /. rules], rules}];</syntaxhighlight>
 
=={{header|Nim}}==
 
<syntaxhighlight lang="nim">import tables, sequtils
 
type
 
# Following range can be changed to produce Huffman codes on arbitrary alphabet (e.g. ternary codes)
CodeSymbol = range[0..1]
 
HuffCode = seq[CodeSymbol]
 
Node = ref object
f: int
parent: Node
case isLeaf: bool
of true:
c: char
else:
childs: array[CodeSymbol, Node]
 
func `<`(a: Node, b: Node): bool =
# For min operator.
a.f < b.f
 
func `$`(hc: HuffCode): string =
result = ""
for symbol in hc:
result &= $symbol
 
func freeChildList(tree: seq[Node], parent: Node = nil): seq[Node] =
## Constructs a sequence of nodes which can be adopted
## Optional parent parameter can be set to ensure node will not adopt itself
for node in tree:
if node.parent.isNil and node != parent: result.add(node)
 
func connect(parent: Node, child: Node) =
# Only call this proc when sure that parent has a free child slot
child.parent = parent
parent.f += child.f
for i in parent.childs.low..parent.childs.high:
if parent.childs[i] == nil:
parent.childs[i] = child
return
 
func generateCodes(codes: TableRef[char, HuffCode],
currentNode: Node, currentCode: HuffCode = @[]) =
 
if currentNode.isLeaf:
let key = currentNode.c
codes[key] = currentCode
return
 
for i in currentNode.childs.low..currentNode.childs.high:
if not currentNode.childs[i].isNil:
let newCode = currentCode & i
generateCodes(codes, currentNode.childs[i], newCode)
 
 
func buildTree(frequencies: CountTable[char]): seq[Node] =
 
result = newSeq[Node](frequencies.len)
for i in result.low..result.high:
let key = toSeq(frequencies.keys)[i]
result[i] = Node(f: frequencies[key], isLeaf: true, c: key)
 
while result.freeChildList.len > 1:
let currentNode = new Node
result.add(currentNode)
for c in currentNode.childs:
currentNode.connect(min(result.freeChildList(currentNode)))
if result.freeChildList.len <= 1: break
 
when isMainModule:
 
import algorithm, strformat
 
const
SampleString = "this is an example for huffman encoding"
SampleFrequencies = SampleString.toCountTable()
 
func `<`(code1, code2: HuffCode): bool =
# Used to sort the result.
if code1.len == code2.len:
result = false
for (c1, c2) in zip(code1, code2):
if c1 != c2: return c1 < c2
else:
result = code1.len < code2.len
 
let
tree = buildTree(SampleFrequencies)
root = tree.freeChildList[0]
 
var huffCodes = newTable[char, HuffCode]()
generateCodes(huffCodes, root)
 
for (key, value) in sortedByIt(toSeq(huffCodes.pairs), it[1]):
echo &"'{key}' → {value}"</syntaxhighlight>
 
{{out}}
<pre>'n' → 000
' ' → 101
's' → 0010
'h' → 0011
'm' → 0100
'f' → 1001
'i' → 1100
'a' → 1101
'e' → 1110
'd' → 01010
'x' → 01011
'g' → 01100
'r' → 01101
'c' → 01110
'u' → 01111
't' → 10000
'p' → 10001
'l' → 11110
'o' → 11111</pre>
 
=={{header|Oberon-2}}==
{{works with|oo2c}}
<syntaxhighlight lang="oberon2">
MODULE HuffmanEncoding;
IMPORT
Object,
PriorityQueue,
Strings,
Out;
TYPE
Leaf = POINTER TO LeafDesc;
LeafDesc = RECORD
(Object.ObjectDesc)
c: CHAR;
END;
 
Inner = POINTER TO InnerDesc;
InnerDesc = RECORD
(Object.ObjectDesc)
left,right: Object.Object;
END;
VAR
str: ARRAY 128 OF CHAR;
i: INTEGER;
f: ARRAY 96 OF INTEGER;
q: PriorityQueue.Queue;
a: PriorityQueue.Node;
b: PriorityQueue.Node;
c: PriorityQueue.Node;
h: ARRAY 64 OF CHAR;
PROCEDURE NewLeaf(c: CHAR): Leaf;
VAR
x: Leaf;
BEGIN
NEW(x);x.c := c; RETURN x
END NewLeaf;
 
PROCEDURE NewInner(l,r: Object.Object): Inner;
VAR
x: Inner;
BEGIN
NEW(x); x.left := l; x.right := r; RETURN x
END NewInner;
 
 
PROCEDURE Preorder(n: Object.Object; VAR x: ARRAY OF CHAR);
BEGIN
IF n IS Leaf THEN
Out.Char(n(Leaf).c);Out.String(": ");Out.String(h);Out.Ln
ELSE
IF n(Inner).left # NIL THEN
Strings.Append("0",x);
Preorder(n(Inner).left,x);
Strings.Delete(x,(Strings.Length(x) - 1),1)
END;
IF n(Inner).right # NIL THEN
Strings.Append("1",x);
Preorder(n(Inner).right,x);
Strings.Delete(x,(Strings.Length(x) - 1),1)
END
END
END Preorder;
 
BEGIN
str := "this is an example for huffman encoding";
(* Collect letter frecuencies *)
i := 0;
WHILE str[i] # 0X DO INC(f[ORD(CAP(str[i])) - ORD(' ')]);INC(i) END;
(* Create Priority Queue *)
NEW(q);q.Clear();
(* Insert into the queue *)
i := 0;
WHILE (i < LEN(f)) DO
IF f[i] # 0 THEN
q.Insert(f[i]/Strings.Length(str),NewLeaf(CHR(i + ORD(' '))))
END;
INC(i)
END;
(* create tree *)
WHILE q.Length() > 1 DO
q.Remove(a);q.Remove(b);
q.Insert(a.w + b.w,NewInner(a.d,b.d));
END;
 
(* tree traversal *)
h[0] := 0X;q.Remove(c);Preorder(c.d,h);
 
END HuffmanEncoding.
</syntaxhighlight>
{{out}}
<pre>
D: 00000
T: 00001
H: 0001
S: 0010
C: 00110
X: 00111
M: 0100
O: 0101
N: 011
U: 10000
L: 10001
A: 1001
R: 10100
G: 101010
P: 101011
E: 1011
I: 1100
F: 1101
: 111
</pre>
 
=={{header|Objective-C}}==
Line 2,514 ⟶ 4,308:
 
This is not purely Objective-C. It uses Apple's Core Foundation library for its binary heap, which admittedly is very ugly. Thus, this only builds on Mac OS X, not GNUstep.
<langsyntaxhighlight lang="objc">#import <Foundation/Foundation.h>
 
 
Line 2,520 ⟶ 4,314:
int freq;
}
-(idinstancetype)initWithFreq:(int)f;
@property (nonatomic, readonly) int freq;
@end
 
@implementation HuffmanTree
@synthesize freq; // the frequency of this tree
-(idinstancetype)initWithFreq:(int)f {
if (self = [super init]) {
freq = f;
Line 2,536 ⟶ 4,330:
 
const void *HuffmanRetain(CFAllocatorRef allocator, const void *ptr) {
return [(__bridge_retained const void *)(__bridge id)ptr retain];
}
void HuffmanRelease(CFAllocatorRef allocator, const void *ptr) {
[(void)(__bridge_transfer id)ptr release];
}
CFComparisonResult HuffmanCompare(const void *ptr1, const void *ptr2, void *unused) {
int f1 = ((__bridge HuffmanTree *)ptr1).freq;
int f2 = ((__bridge HuffmanTree *)ptr2).freq;
if (f1 == f2)
return kCFCompareEqualTo;
Line 2,557 ⟶ 4,351:
}
@property (readonly) char value;
-(idinstancetype)initWithFreq:(int)f character:(char)c;
@end
 
@implementation HuffmanLeaf
@synthesize value;
-(idinstancetype)initWithFreq:(int)f character:(char)c {
if (self = [super initWithFreq:f]) {
value = c;
Line 2,575 ⟶ 4,369:
}
@property (readonly) HuffmanTree *left, *right;
-(idinstancetype)initWithLeft:(HuffmanTree *)l right:(HuffmanTree *)r;
@end
 
@implementation HuffmanNode
@synthesize left, right;
-(idinstancetype)initWithLeft:(HuffmanTree *)l right:(HuffmanTree *)r {
if (self = [super initWithFreq:l.freq+r.freq]) {
left = [l retain];
right = [r retain];
}
return self;
}
-(void)dealloc {
[left release];
[right release];
[super dealloc];
}
@end
Line 2,605 ⟶ 4,394:
int freq = [chars countForObject:ch];
if (freq > 0)
CFBinaryHeapAddValue(trees, [(__bridge const void *)[[HuffmanLeaf alloc] initWithFreq:freq character:(char)[ch intValue]] autorelease]);
}
Line 2,612 ⟶ 4,401:
while (CFBinaryHeapGetCount(trees) > 1) {
// two trees with least frequency
HuffmanTree *a = (__bridge HuffmanTree *)CFBinaryHeapGetMinimum(trees);
CFBinaryHeapRemoveMinimumValue(trees);
HuffmanTree *b = (__bridge HuffmanTree *)CFBinaryHeapGetMinimum(trees);
CFBinaryHeapRemoveMinimumValue(trees);
// put into new node and re-insert into queue
CFBinaryHeapAddValue(trees, [(__bridge const void *)[[HuffmanNode alloc] initWithLeft:a right:b] autorelease]);
}
HuffmanTree *result = [(__bridge HuffmanTree *)CFBinaryHeapGetMinimum(trees) retain];
CFRelease(trees);
return [result autorelease];
}
 
Line 2,649 ⟶ 4,438:
 
int main(int argc, const char * argv[]) {
@autoreleasepool {
NSAutoreleasePool * pool = [[NSAutoreleasePool alloc] init];
 
NSString *test = @"this is an example for huffman encoding";
Line 2,657 ⟶ 4,446:
int n = [test length];
for (int i = 0; i < n; i++)
[chars addObject:[NSNumber numberWithInt:@([test characterAtIndex:i]])];
// build tree
HuffmanTree *tree = buildTree(chars);
[chars release];
// print out results
Line 2,667 ⟶ 4,455:
printCodes(tree, [NSMutableString string]);
[pool drain];}
return 0;
}</langsyntaxhighlight>
 
{{out}}
Example output:
<pre>
SYMBOL WEIGHT HUFFMAN CODE
Line 2,699 ⟶ 4,487:
 
We use a Set (which is automatically sorted) as a priority queue.
{{works with|OCaml|4.02+}}
<lang ocaml>type 'a huffman_tree =
<syntaxhighlight lang="ocaml">type 'a huffman_tree =
| Leaf of 'a
| Node of 'a huffman_tree * 'a huffman_tree
Line 2,714 ⟶ 4,503:
 
let build_tree charFreqs =
let leaves = HSet.of_list (List.fold_leftmap (fun z (c,f) -> HSet.add (f, Leaf c) z) HSet.empty charFreqs) in
let rec aux trees =
let f1, a = HSet.min_elt trees in
Line 2,748 ⟶ 4,537:
let tree = build_tree charFreqs in
print_string "Symbol\tHuffman code\n";
print_tree [] tree</langsyntaxhighlight>
 
=={{header|Ol}}==
<syntaxhighlight lang="scheme">
(define phrase "this is an example for huffman encoding")
 
; prepare initial probabilities table
(define table (ff->list
(fold (lambda (ff x)
(put ff x (+ (ff x 0) 1)))
{}
(string->runes phrase))))
 
; just sorter...
(define (resort l)
(sort (lambda (x y) (< (cdr x) (cdr y))) l))
; ...to sort table
(define table (resort table))
 
; build huffman tree
(define tree
(let loop ((table table))
(if (null? (cdr table))
(car table)
(loop (resort (cons
(cons
{ 1 (car table) 0 (cadr table)}
(+ (cdar table) (cdadr table)))
(cddr table)))))))
 
; huffman codes
(define codes
(map (lambda (i)
(call/cc (lambda (return)
(let loop ((prefix #null) (tree tree))
(if (ff? (car tree))
(begin
(loop (cons 0 prefix) ((car tree) 0))
(loop (cons 1 prefix) ((car tree) 1)))
(if (eq? (car tree) i)
(return (reverse prefix))))))))
(map car table)))
</syntaxhighlight>
{{Out}}
<syntaxhighlight lang="scheme">
(print "weights: ---------------------------")
(for-each (lambda (ch)
(print (string (car ch)) ": " (cdr ch)))
(reverse table))
 
(print "codes: -----------------------------")
(map (lambda (char code)
(print (string char) ": " code))
(reverse (map car table))
(reverse codes))
</syntaxhighlight>
<pre>
weights: ---------------------------
: 6
n: 4
i: 3
f: 3
e: 3
a: 3
s: 2
o: 2
m: 2
h: 2
x: 1
u: 1
t: 1
r: 1
p: 1
l: 1
g: 1
d: 1
c: 1
codes: -----------------------------
: (0 0 0)
n: (1 1 0)
i: (0 1 0 0)
f: (0 1 0 1)
e: (0 0 1 0)
a: (0 0 1 1)
s: (0 1 1 1)
o: (1 0 1 0)
m: (1 0 1 1)
h: (1 0 0 0)
x: (0 1 1 0 1)
u: (0 1 1 0 0 0)
t: (0 1 1 0 0 1)
r: (1 1 1 1 0)
p: (1 1 1 1 1)
l: (1 1 1 0 0)
g: (1 1 1 0 1)
d: (1 0 0 1 0)
c: (1 0 0 1 1)
</pre>
 
=={{header|Perl}}==
<syntaxhighlight lang ="perl">sub make_treeuse {5.10.0;
use strict;
my %letters;
$letters{$_}++ for (split "", shift);
my (@nodes, $n) = map({ a=>$_, freq=>$letters{$_} }, keys %letters);
while ((@nodes = sort { $a->{freq} <=> $b->{freq}
or $a->{a} cmp $b->{a} } @nodes) > 1)
{
$n = { "0"=>shift(@nodes), "1"=>shift(@nodes) };
$n->{freq} = $n->{0}{freq} + $n->{1}{freq};
push @nodes, $n;
}
 
# produce encode and decode dictionary from a tree
walk($n, "", $n->{tree} = {});
sub walk {
$n;
my ($node, $code, $h, $rev_h) = @_;
 
my $c = $node->[0];
if (ref $c) { walk($c->[$_], $code.$_, $h, $rev_h) for 0,1 }
else { $h->{$c} = $code; $rev_h->{$code} = $c }
 
$h, $rev_h
}
 
# make a tree, and return resulting dictionaries
sub walk {
sub mktree {
my ($n, $s, $h) = @_;
my (%freq, @nodes);
exists $n->{a} and do {
$freq{$_}++ for split '', shift;
print "'$n->{a}': $s\n";
@nodes = map([$_, $freq{$_}], keys %freq);
$h->{$n->{a}} = $s if $h;
 
return;
do { # poor man's priority queue
};
@nodes = sort {$a->[1] <=> $b->[1]} @nodes;
walk($n->{0}, $s.0, $h);
my ($x, $y) = splice @nodes, 0, 2;
walk($n->{1}, $s.1, $h);
push @nodes, [[$x, $y], $x->[1] + $y->[1]]
} while (@nodes > 1);
 
walk($nodes[0], '', {}, {})
}
 
sub encode {
my ($sstr, $tdict) = @_;
join '', map $dict->{$_}//die("bad char $_"), split '', $str
$t = $t->{tree};
join("", map($t->{$_}, split("", $s)));
}
 
sub decode {
my @b = split(""$str, shift$dict) = @_;
my ($nseg, $@out) = $_[0]("");
 
# append to current segment until it's in the dictionary
while (@b) {
for (split '', $str) {
$n = $n->{shift @b};
$seg .= $_;
if ($n->{a}) {
my $outx .= $ndict->{a$seg} // next;
$npush =@out, $_[0]x;
$seg = '';
}
}
die "bad code" if length($seg);
$out;
join '', @out
}
 
my $texttxt = "'this is an example for huffman encoding"';
my ($h, $treerev_h) = make_treemktree($texttxt);
for (keys %$h) { print "'$_': $h->{$_}\n" }
my $e = encode($text, $tree);
print "$e\n";
print decode($e, $tree), "\n";</lang>output<lang>'g': 00000
'l': 00001
'p': 00010
'r': 00011
't': 00100
'u': 00101
'h': 0011
'm': 0100
'o': 0101
'n': 011
's': 1000
'x': 10010
'c': 100110
'd': 100111
'a': 1010
'e': 1011
'f': 1100
'i': 1101
' ': 111
0010000111101100011111...111110101100000
this is an example for huffman encoding</lang>
=={{header|Perl 6}}==
<lang perl6>sub huffman ($s) {
my $de = $s.chars;
my @q = $s.comb.classify({$_}).map({[+.value / $de, .key]}).sort;
while @q > 1 {
my ($a,$b) = @q.splice(0,2);
@q = sort [$a[0] + $b[0], [$a[1], $b[1]]], @q;
}
sort *.value, gather walk @q[0][1], '';
}
 
my $enc = encode($txt, $h);
multi walk (@node, $prefix) {
print "$enc\n";
walk @node[0], $prefix ~ 1;
walk @node[1], $prefix ~ 0;
}
multi walk ($node, $prefix) { take $node => $prefix }
 
print decode($enc, $rev_h), "\n";</syntaxhighlight>
say .perl for huffman('this is an example for huffman encoding');</lang>
{{out}}
Output:
<pre>"d" => "000000"
'u': 10000
"c" => "000001"
'd': 01111
"x" => "00001"
'a': 1101
"i" => "0001"
'l': 10001
"f" => "0010"
'i': 1110
"e" => "0011"
'g': 11110
" " => "010"
'h': 0100
"a" => "0110"
"u"'r': => "01110"
' ': 101
"t" => "01111"
'p': 01100
"s" => "1000"
't': 01101
"r" => "10010"
'n': 000
"p" => "10011"
'm': 0011
"n" => "101"
'x': 01011
"o" => "1100"
'f': 1100
"m" => "1101"
'c': 01010
"h" => "1110"
'o': 0010
"l" => "11110"
"g"'s': => "11111"</pre>
'e': 1001
To demonstrate that the table can do a round trip:
0110101001110111111011110111111011101000101100101011110100110110010001100110111000010011101010100100001100110000111101000101100100001010001001111111000011110
<lang perl6>my $str = 'this is an example for huffman encoding';
mythis %encis =an example for huffman $str;encoding
</pre>
my %dec = %enc.invert;
say $str;
my $huf = %enc{$str.comb}.join;
say $huf;
my $rx = join('|', map { "'" ~ .key ~ "'" }, %dec);
$rx = eval '/' ~ $rx ~ '/';
say $huf.subst(/<$rx>/, -> $/ {%dec{~$/}}, :g);</lang>
Output:
<pre>this is an example for huffman encoding
0111111100001100001000011000010011010101000110000101101101100111111000110100010110010010010111001110001000101101011010101000111010000011100000000000110111111
this is an example for huffman encoding</pre>
 
=={{header|PicoLispPhix}}==
{{trans|Lua}}
Using a cons cells (freq . char) for leaves, and two cells (freq left . right)
<!--<syntaxhighlight lang="phix">(phixonline)-->
for nodes.
<span style="color: #008080;">with</span> <span style="color: #008080;">javascript_semantics</span>
<lang PicoLisp>(de prio (Idx)
<span style="color: #008080;">function</span> <span style="color: #000000;">store_nodes</span><span style="color: #0000FF;">(</span><span style="color: #004080;">object</span> <span style="color: #000000;">key</span><span style="color: #0000FF;">,</span> <span style="color: #004080;">object</span> <span style="color: #000000;">data</span><span style="color: #0000FF;">,</span> <span style="color: #004080;">integer</span> <span style="color: #000000;">nodes</span><span style="color: #0000FF;">)</span>
(while (cadr Idx) (setq Idx @))
<span style="color: #7060A8;">setd</span><span style="color: #0000FF;">({</span><span style="color: #000000;">data</span><span style="color: #0000FF;">,</span><span style="color: #000000;">key</span><span style="color: #0000FF;">},</span><span style="color: #000000;">0</span><span style="color: #0000FF;">,</span><span style="color: #000000;">nodes</span><span style="color: #0000FF;">)</span>
(car Idx) )
<span style="color: #008080;">return</span> <span style="color: #000000;">1</span>
 
<span style="color: #008080;">end</span> <span style="color: #008080;">function</span>
(let (A NIL P NIL L NIL)
(for C (chop "this is an example for huffman encoding")
<span style="color: #008080;">function</span> <span style="color: #000000;">build_freqtable</span><span style="color: #0000FF;">(</span><span style="color: #004080;">string</span> <span style="color: #000000;">data</span><span style="color: #0000FF;">)</span>
(accu 'A C 1) ) # Count characters
<span style="color: #004080;">integer</span> <span style="color: #000000;">freq</span> <span style="color: #0000FF;">=</span> <span style="color: #7060A8;">new_dict</span><span style="color: #0000FF;">(),</span>
(for X A # Build index tree as priority queue
<span style="color: #000000;">nodes</span> <span style="color: #0000FF;">=</span> <span style="color: #7060A8;">new_dict</span><span style="color: #0000FF;">()</span>
(idx 'P (cons (cdr X) (car X)) T) )
<span style="color: #008080;">for</span> <span style="color: #000000;">i</span><span style="color: #0000FF;">=</span><span style="color: #000000;">1</span> <span style="color: #008080;">to</span> <span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">data</span><span style="color: #0000FF;">)</span> <span style="color: #008080;">do</span>
(while (or (cadr P) (cddr P)) # Remove entries, insert as nodes
<span style="color: #004080;">integer</span> <span style="color: #000000;">di</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">data</span><span style="color: #0000FF;">[</span><span style="color: #000000;">i</span><span style="color: #0000FF;">]</span>
(let (A (car (idx 'P (prio P) NIL)) B (car (idx 'P (prio P) NIL)))
<span style="color: #7060A8;">setd</span><span style="color: #0000FF;">(</span><span style="color: #000000;">di</span><span style="color: #0000FF;">,</span><span style="color: #7060A8;">getd</span><span style="color: #0000FF;">(</span><span style="color: #000000;">di</span><span style="color: #0000FF;">,</span><span style="color: #000000;">freq</span><span style="color: #0000FF;">)+</span><span style="color: #000000;">1</span><span style="color: #0000FF;">,</span><span style="color: #000000;">freq</span><span style="color: #0000FF;">)</span>
(idx 'P (cons (+ (car A) (car B)) A B) T) ) )
<span style="color: #008080;">end</span> <span style="color: #008080;">for</span>
(setq P (car P))
<span style="color: #7060A8;">traverse_dict</span><span style="color: #0000FF;">(</span><span style="color: #000000;">store_nodes</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">nodes</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">freq</span><span style="color: #0000FF;">)</span>
(recur (P L) # Traverse and print
<span style="color: #7060A8;">destroy_dict</span><span style="color: #0000FF;">(</span><span style="color: #000000;">freq</span><span style="color: #0000FF;">)</span>
(if (atom (cdr P))
<span style="color: #008080;">return</span> <span style="color: #000000;">nodes</span>
(prinl (cdr P) " " L)
<span style="color: #008080;">end</span> <span style="color: #008080;">function</span>
(recurse (cadr P) (cons 0 L))
(recurse (cddr P) (cons 1 L)) ) ) )</lang>
<span style="color: #008080;">function</span> <span style="color: #000000;">build_hufftree</span><span style="color: #0000FF;">(</span><span style="color: #004080;">integer</span> <span style="color: #000000;">nodes</span><span style="color: #0000FF;">)</span>
Output:
<span style="color: #004080;">sequence</span> <span style="color: #000000;">node</span>
<pre>n 000
<span style="color: #008080;">while</span> <span style="color: #004600;">true</span> <span style="color: #008080;">do</span>
m 0100
<span style="color: #004080;">sequence</span> <span style="color: #000000;">lkey</span> <span style="color: #0000FF;">=</span> <span style="color: #7060A8;">getd_partial_key</span><span style="color: #0000FF;">({</span><span style="color: #000000;">0</span><span style="color: #0000FF;">,</span><span style="color: #000000;">0</span><span style="color: #0000FF;">},</span><span style="color: #000000;">nodes</span><span style="color: #0000FF;">)</span>
o 1100
<span style="color: #004080;">integer</span> <span style="color: #000000;">lfreq</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">lkey</span><span style="color: #0000FF;">[</span><span style="color: #000000;">1</span><span style="color: #0000FF;">]</span>
s 0010
<span style="color: #7060A8;">deld</span><span style="color: #0000FF;">(</span><span style="color: #000000;">lkey</span><span style="color: #0000FF;">,</span><span style="color: #000000;">nodes</span><span style="color: #0000FF;">)</span>
c 01010
<span style="color: #004080;">sequence</span> <span style="color: #000000;">rkey</span> <span style="color: #0000FF;">=</span> <span style="color: #7060A8;">getd_partial_key</span><span style="color: #0000FF;">({</span><span style="color: #000000;">0</span><span style="color: #0000FF;">,</span><span style="color: #000000;">0</span><span style="color: #0000FF;">},</span><span style="color: #000000;">nodes</span><span style="color: #0000FF;">)</span>
d 11010
<span style="color: #004080;">integer</span> <span style="color: #000000;">rfreq</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">rkey</span><span style="color: #0000FF;">[</span><span style="color: #000000;">1</span><span style="color: #0000FF;">]</span>
g 00110
<span style="color: #7060A8;">deld</span><span style="color: #0000FF;">(</span><span style="color: #000000;">rkey</span><span style="color: #0000FF;">,</span><span style="color: #000000;">nodes</span><span style="color: #0000FF;">)</span>
l 10110
p 01110
<span style="color: #000000;">node</span> <span style="color: #0000FF;">=</span> <span style="color: #0000FF;">{</span><span style="color: #000000;">lfreq</span><span style="color: #0000FF;">+</span><span style="color: #000000;">rfreq</span><span style="color: #0000FF;">,{</span><span style="color: #000000;">lkey</span><span style="color: #0000FF;">,</span><span style="color: #000000;">rkey</span><span style="color: #0000FF;">}}</span>
r 11110
t 00001
<span style="color: #008080;">if</span> <span style="color: #7060A8;">dict_size</span><span style="color: #0000FF;">(</span><span style="color: #000000;">nodes</span><span style="color: #0000FF;">)=</span><span style="color: #000000;">0</span> <span style="color: #008080;">then</span> <span style="color: #008080;">exit</span> <span style="color: #008080;">end</span> <span style="color: #008080;">if</span>
u 10001
a 1001
<span style="color: #7060A8;">setd</span><span style="color: #0000FF;">(</span><span style="color: #000000;">node</span><span style="color: #0000FF;">,</span><span style="color: #000000;">0</span><span style="color: #0000FF;">,</span><span style="color: #000000;">nodes</span><span style="color: #0000FF;">)</span>
101
<span style="color: #008080;">end</span> <span style="color: #008080;">while</span>
e 0011
<span style="color: #7060A8;">destroy_dict</span><span style="color: #0000FF;">(</span><span style="color: #000000;">nodes</span><span style="color: #0000FF;">)</span>
f 1011
<span style="color: #008080;">return</span> <span style="color: #000000;">node</span>
i 0111
<span style="color: #008080;">end</span> <span style="color: #008080;">function</span>
x 01111
h 11111</pre>
<span style="color: #008080;">procedure</span> <span style="color: #000000;">build_huffcodes</span><span style="color: #0000FF;">(</span><span style="color: #004080;">object</span> <span style="color: #000000;">node</span><span style="color: #0000FF;">,</span> <span style="color: #004080;">string</span> <span style="color: #000000;">bits</span><span style="color: #0000FF;">,</span> <span style="color: #004080;">integer</span> <span style="color: #000000;">d</span><span style="color: #0000FF;">)</span>
<span style="color: #0000FF;">{</span><span style="color: #004080;">integer</span> <span style="color: #000000;">freq</span><span style="color: #0000FF;">,</span> <span style="color: #004080;">object</span> <span style="color: #000000;">data</span><span style="color: #0000FF;">}</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">node</span>
<span style="color: #008080;">if</span> <span style="color: #004080;">sequence</span><span style="color: #0000FF;">(</span><span style="color: #000000;">data</span><span style="color: #0000FF;">)</span> <span style="color: #008080;">then</span>
<span style="color: #000000;">build_huffcodes</span><span style="color: #0000FF;">(</span><span style="color: #000000;">data</span><span style="color: #0000FF;">[</span><span style="color: #000000;">1</span><span style="color: #0000FF;">],</span><span style="color: #000000;">bits</span><span style="color: #0000FF;">&</span><span style="color: #008000;">'0'</span><span style="color: #0000FF;">,</span><span style="color: #000000;">d</span><span style="color: #0000FF;">)</span>
<span style="color: #000000;">build_huffcodes</span><span style="color: #0000FF;">(</span><span style="color: #000000;">data</span><span style="color: #0000FF;">[</span><span style="color: #000000;">2</span><span style="color: #0000FF;">],</span><span style="color: #000000;">bits</span><span style="color: #0000FF;">&</span><span style="color: #008000;">'1'</span><span style="color: #0000FF;">,</span><span style="color: #000000;">d</span><span style="color: #0000FF;">)</span>
<span style="color: #008080;">else</span>
<span style="color: #7060A8;">setd</span><span style="color: #0000FF;">(</span><span style="color: #000000;">data</span><span style="color: #0000FF;">,{</span><span style="color: #000000;">freq</span><span style="color: #0000FF;">,</span><span style="color: #000000;">bits</span><span style="color: #0000FF;">},</span><span style="color: #000000;">d</span><span style="color: #0000FF;">)</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">if</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">procedure</span>
<span style="color: #008080;">function</span> <span style="color: #000000;">print_huffcode</span><span style="color: #0000FF;">(</span><span style="color: #004080;">integer</span> <span style="color: #000000;">key</span><span style="color: #0000FF;">,</span> <span style="color: #004080;">sequence</span> <span style="color: #000000;">data</span><span style="color: #0000FF;">,</span> <span style="color: #004080;">integer</span> <span style="color: #000080;font-style:italic;">/*user_data*/</span><span style="color: #0000FF;">)</span>
<span style="color: #0000FF;">{</span><span style="color: #004080;">integer</span> <span style="color: #000000;">i</span><span style="color: #0000FF;">,</span> <span style="color: #004080;">string</span> <span style="color: #000000;">s</span><span style="color: #0000FF;">}</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">data</span>
<span style="color: #7060A8;">printf</span><span style="color: #0000FF;">(</span><span style="color: #000000;">1</span><span style="color: #0000FF;">,</span><span style="color: #008000;">"'%c' [%d] %s\n"</span><span style="color: #0000FF;">,{</span><span style="color: #000000;">key</span><span style="color: #0000FF;">,</span><span style="color: #000000;">i</span><span style="color: #0000FF;">,</span><span style="color: #000000;">s</span><span style="color: #0000FF;">})</span>
<span style="color: #008080;">return</span> <span style="color: #000000;">1</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">function</span>
<span style="color: #008080;">procedure</span> <span style="color: #000000;">print_huffcodes</span><span style="color: #0000FF;">(</span><span style="color: #004080;">integer</span> <span style="color: #000000;">d</span><span style="color: #0000FF;">)</span>
<span style="color: #7060A8;">traverse_dict</span><span style="color: #0000FF;">(</span><span style="color: #000000;">print_huffcode</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">0</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">d</span><span style="color: #0000FF;">)</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">procedure</span>
<span style="color: #008080;">function</span> <span style="color: #000000;">invert_huffcode</span><span style="color: #0000FF;">(</span><span style="color: #004080;">integer</span> <span style="color: #000000;">key</span><span style="color: #0000FF;">,</span> <span style="color: #004080;">sequence</span> <span style="color: #000000;">data</span><span style="color: #0000FF;">,</span> <span style="color: #004080;">integer</span> <span style="color: #000000;">rd</span><span style="color: #0000FF;">)</span>
<span style="color: #7060A8;">setd</span><span style="color: #0000FF;">(</span><span style="color: #000000;">data</span><span style="color: #0000FF;">[</span><span style="color: #000000;">2</span><span style="color: #0000FF;">],</span><span style="color: #000000;">key</span><span style="color: #0000FF;">,</span><span style="color: #000000;">rd</span><span style="color: #0000FF;">)</span>
<span style="color: #008080;">return</span> <span style="color: #000000;">1</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">function</span>
<span style="color: #008080;">procedure</span> <span style="color: #000000;">main</span><span style="color: #0000FF;">(</span><span style="color: #004080;">string</span> <span style="color: #000000;">data</span><span style="color: #0000FF;">)</span>
<span style="color: #008080;">if</span> <span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">data</span><span style="color: #0000FF;">)<</span><span style="color: #000000;">2</span> <span style="color: #008080;">then</span> <span style="color: #0000FF;">?</span><span style="color: #000000;">9</span><span style="color: #0000FF;">/</span><span style="color: #000000;">0</span> <span style="color: #008080;">end</span> <span style="color: #008080;">if</span>
<span style="color: #004080;">integer</span> <span style="color: #000000;">nodes</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">build_freqtable</span><span style="color: #0000FF;">(</span><span style="color: #000000;">data</span><span style="color: #0000FF;">)</span>
<span style="color: #004080;">sequence</span> <span style="color: #000000;">huff</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">build_hufftree</span><span style="color: #0000FF;">(</span><span style="color: #000000;">nodes</span><span style="color: #0000FF;">)</span>
<span style="color: #004080;">integer</span> <span style="color: #000000;">d</span> <span style="color: #0000FF;">=</span> <span style="color: #7060A8;">new_dict</span><span style="color: #0000FF;">()</span>
<span style="color: #000000;">build_huffcodes</span><span style="color: #0000FF;">(</span><span style="color: #000000;">huff</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">""</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">d</span><span style="color: #0000FF;">)</span>
<span style="color: #000000;">print_huffcodes</span><span style="color: #0000FF;">(</span><span style="color: #000000;">d</span><span style="color: #0000FF;">)</span>
<span style="color: #004080;">string</span> <span style="color: #000000;">encoded</span> <span style="color: #0000FF;">=</span> <span style="color: #008000;">""</span>
<span style="color: #008080;">for</span> <span style="color: #000000;">i</span><span style="color: #0000FF;">=</span><span style="color: #000000;">1</span> <span style="color: #008080;">to</span> <span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">data</span><span style="color: #0000FF;">)</span> <span style="color: #008080;">do</span>
<span style="color: #000000;">encoded</span> <span style="color: #0000FF;">&=</span> <span style="color: #7060A8;">getd</span><span style="color: #0000FF;">(</span><span style="color: #000000;">data</span><span style="color: #0000FF;">[</span><span style="color: #000000;">i</span><span style="color: #0000FF;">],</span><span style="color: #000000;">d</span><span style="color: #0000FF;">)[</span><span style="color: #000000;">2</span><span style="color: #0000FF;">]</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">for</span>
<span style="color: #0000FF;">?</span><span style="color: #7060A8;">shorten</span><span style="color: #0000FF;">(</span><span style="color: #000000;">encoded</span><span style="color: #0000FF;">)</span>
<span style="color: #004080;">integer</span> <span style="color: #000000;">rd</span> <span style="color: #0000FF;">=</span> <span style="color: #7060A8;">new_dict</span><span style="color: #0000FF;">()</span>
<span style="color: #7060A8;">traverse_dict</span><span style="color: #0000FF;">(</span><span style="color: #000000;">invert_huffcode</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">rd</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">d</span><span style="color: #0000FF;">)</span>
<span style="color: #004080;">string</span> <span style="color: #000000;">decoded</span> <span style="color: #0000FF;">=</span> <span style="color: #008000;">""</span>
<span style="color: #004080;">integer</span> <span style="color: #000000;">done</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">0</span>
<span style="color: #008080;">while</span> <span style="color: #000000;">done</span><span style="color: #0000FF;"><</span><span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">encoded</span><span style="color: #0000FF;">)</span> <span style="color: #008080;">do</span>
<span style="color: #004080;">string</span> <span style="color: #000000;">key</span> <span style="color: #0000FF;">=</span> <span style="color: #008000;">""</span>
<span style="color: #004080;">integer</span> <span style="color: #000000;">node</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">0</span>
<span style="color: #008080;">while</span> <span style="color: #000000;">node</span><span style="color: #0000FF;">=</span><span style="color: #000000;">0</span> <span style="color: #008080;">do</span>
<span style="color: #000000;">done</span> <span style="color: #0000FF;">+=</span> <span style="color: #000000;">1</span>
<span style="color: #000000;">key</span> <span style="color: #0000FF;">&=</span> <span style="color: #000000;">encoded</span><span style="color: #0000FF;">[</span><span style="color: #000000;">done</span><span style="color: #0000FF;">]</span>
<span style="color: #000000;">node</span> <span style="color: #0000FF;">=</span> <span style="color: #7060A8;">getd_index</span><span style="color: #0000FF;">(</span><span style="color: #000000;">key</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">rd</span><span style="color: #0000FF;">)</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">while</span>
<span style="color: #000000;">decoded</span> <span style="color: #0000FF;">&=</span> <span style="color: #7060A8;">getd_by_index</span><span style="color: #0000FF;">(</span><span style="color: #000000;">node</span><span style="color: #0000FF;">,</span><span style="color: #000000;">rd</span><span style="color: #0000FF;">)</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">while</span>
<span style="color: #0000FF;">?</span><span style="color: #000000;">decoded</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">procedure</span>
<span style="color: #000000;">main</span><span style="color: #0000FF;">(</span><span style="color: #008000;">"this is an example for huffman encoding"</span><span style="color: #0000FF;">)</span>
<!--</syntaxhighlight>-->
{{out}}
<pre>
' ' [6] 101
'a' [3] 1001
'c' [1] 01010
'd' [1] 01011
'e' [3] 1100
'f' [3] 1101
'g' [1] 01100
'h' [2] 11111
'i' [3] 1110
'l' [1] 01101
'm' [2] 0010
'n' [4] 000
'o' [2] 0011
'p' [1] 01110
'r' [1] 01111
's' [2] 0100
't' [1] 10000
'u' [1] 10001
'x' [1] 11110
"10000111111110010010...01101011111000001100 (157 digits)"
"this is an example for huffman encoding"
</pre>
 
=={{header|PHP}}==
Line 2,921 ⟶ 4,848:
{{trans|Python}} (not exactly)
 
<langsyntaxhighlight lang="php"><?php
function encode($symb2freq) {
$heap = new SplPriorityQueue;
Line 2,948 ⟶ 4,875:
foreach ($huff as $sym => $code)
echo "$sym\t$symb2freq[$sym]\t$code\n";
?></langsyntaxhighlight>
 
{{out}}
Example output:
<pre>
Symbol Weight Huffman Code
Line 2,972 ⟶ 4,899:
h 2 11101
e 3 1111
</pre>
 
=={{header|Picat}}==
{{trans|Prolog}}
<syntaxhighlight lang="picat">go =>
huffman("this is an example for huffman encoding").
 
huffman(LA) :-
LS=sort(LA),
packList(LS,PL),
PLS=sort(PL).remove_dups(),
build_tree(PLS, A),
coding(A, [], C),
SC=sort(C).remove_dups(),
println("Symbol\tWeight\tCode"),
foreach(SS in SC) print_code(SS) end.
build_tree([[V1|R1], [V2|R2]|T], AF) :-
V = V1 + V2,
A = [V, [V1|R1], [V2|R2]],
( T=[] -> AF=A ; NT=sort([A|T]), build_tree(NT, AF) ).
coding([_A,FG,FD], Code, CF) :-
( is_node(FG) -> coding(FG, [0 | Code], C1)
; leaf_coding(FG, [0 | Code], C1) ),
( is_node(FD) -> coding(FD, [1 | Code], C2)
; leaf_coding(FD, [1 | Code], C2) ),
append(C1, C2, CF).
leaf_coding([FG,FD], Code, CF) :-
CodeR = reverse(Code),
CF = [[FG, FD, CodeR]] .
is_node([_V, _FG, _FD]).
print_code([N, Car, Code]) :-
printf("%w:\t%w\t", Car, N),
foreach(V in Code) print(V) end,
nl.
 
packList([], []).
packList([X],[[1,X]]).
packList([X|Rest], XRunPacked) :-
XRunPacked = [XRun|Packed],
run(X, Rest, XRun, RRest),
packList(RRest, Packed).
run(V, [], VV, []) :- VV=[1,V].
run(V, [V|LRest], [N1,V], RRest) :-
run(V, LRest, [N, V], RRest),
N1 = N + 1.
run(V, [Other|RRest], [1,V], [Other|RRest]) :-
different_terms(V, Other).</syntaxhighlight>
 
{{out}}
<pre>Symbol Weight Code
c: 1 01010
d: 1 01011
g: 1 01100
l: 1 01101
p: 1 01110
r: 1 01111
t: 1 10000
u: 1 10001
x: 1 11110
h: 2 11111
m: 2 0010
o: 2 0011
s: 2 0100
a: 3 1001
e: 3 1100
f: 3 1101
i: 3 1110
n: 4 000
: 6 101</pre>
 
=={{header|PicoLisp}}==
Using a cons cells (freq . char) for leaves, and two cells (freq left . right)
for nodes.
<syntaxhighlight lang="picolisp">(de prio (Idx)
(while (cadr Idx) (setq Idx @))
(car Idx) )
 
(let (A NIL P NIL L NIL)
(for C (chop "this is an example for huffman encoding")
(accu 'A C 1) ) # Count characters
(for X A # Build index tree as priority queue
(idx 'P (cons (cdr X) (car X)) T) )
(while (or (cadr P) (cddr P)) # Remove entries, insert as nodes
(let (A (car (idx 'P (prio P) NIL)) B (car (idx 'P (prio P) NIL)))
(idx 'P (cons (+ (car A) (car B)) A B) T) ) )
(setq P (car P))
(recur (P L) # Traverse and print
(if (atom (cdr P))
(prinl (cdr P) " " L)
(recurse (cadr P) (cons 0 L))
(recurse (cddr P) (cons 1 L)) ) ) )</syntaxhighlight>
{{out}}
<pre>n 000
m 0100
o 1100
s 0010
c 01010
d 11010
g 00110
l 10110
p 01110
r 11110
t 00001
u 10001
a 1001
101
e 0011
f 1011
i 0111
x 01111
h 11111</pre>
 
=={{header|PL/I}}==
<syntaxhighlight lang="pli">*process source attributes xref or(!);
hencode: Proc Options(main);
/*--------------------------------------------------------------------
* 28.12.013 Walter Pachl translated from REXX
*-------------------------------------------------------------------*/
Dcl debug Bit(1) Init('0'b);
Dcl (i,j,k) Bin Fixed(15);
Dcl c Char(1);
Dcl s Char(100) Var Init('this is an example for huffman encoding');
Dcl sc Char(1000) Var Init('');
Dcl sr Char(100) Var Init('');
Dcl 1 cocc(100),
2 c Char(1),
2 occ Bin Fixed(31);
Dcl cocc_n Bin Fixed(15) Init(0);
dcl 1 node,
2 id Bin Fixed(15), /* Node id */
2 c Char(1), /* character */
2 occ Bin Fixed(15), /* number of occurrences */
2 left Bin Fixed(15), /* left child */
2 rite Bin Fixed(15), /* right child */
2 father Bin Fixed(15), /* father */
2 digit Pic'9', /* digit (0 or 1) */
2 term Pic'9'; /* 1=terminal node */
node='';
Dcl 1 m(100) Like node;
Dcl m_n Bin Fixed(15) Init(0);
Dcl father(100) Bin Fixed(15);
 
Dcl 1 t(100),
2 char Char(1),
2 code Char(20) Var;
Dcl t_n Bin Fixed(15) Init(0);
 
Do i=1 To length(s); /* first collect used characters */
c=substr(s,i,1); /* and number of occurrences */
Do j=1 To cocc_n;
If cocc(j).c=c Then Leave;
End;
If j<= cocc_n Then
cocc(j).occ+=1;
Else Do;
cocc(j).c=c;
cocc(j).occ=1;
cocc_n+=1;
End;
End;
 
Do j=1 To cocc_n; /* create initial node list */
node.id+=1;
node.c=cocc(j).c;
node.occ=cocc(j).occ;
node.term=1;
Call add_node;
End;
 
If debug Then
Call show;
 
Do While(pairs()); /* while there is more than one fatherless node */
Call mk_node; /* create a father node */
If debug Then
Call show;
End;
 
Call show; /* show the node table */
 
Call mk_trans; /* create the translate table */
Put Edit('The translate table:')(Skip,a);
Do i=1 To t_n; /* show it */
Put Edit(t(i).char,' -> ',t(i).code)(Skip,a,a,a);
End;
 
Call encode; /* encode the string s -> sc */
 
Put Edit('length(sc)=',length(sc)) /* show it */
(Skip,a,f(3));
Do i=1 By 70 To length(sc);
Put Edit(substr(sc,i,70))(Skip,a);
End;
 
Call decode; /* decode the string sc -> sr */
Put Edit('input : ',s)(skip,a,a);
Put Edit('result: ',sr)(skip,a,a);
Return;
 
add_node: Proc;
/*--------------------------------------------------------------------
* Insert the node according to increasing occurrences
*-------------------------------------------------------------------*/
il:
Do i=1 To m_n;
If m(i).occ>=node.occ Then Do;
Do k=m_n To i By -1;
m(k+1)=m(k);
End;
Leave il;
End;
End;
m(i)=node;
m_n+=1;
End;
 
show: Proc;
/*--------------------------------------------------------------------
* Show the contents of the node table
*-------------------------------------------------------------------*/
Put Edit('The list of nodes:')(Skip,a);
Put Edit('id c oc l r f d t')(Skip,a);
Do i=1 To m_n;
Put Edit(m(i).id,m(i).c,m(i).occ,
m(i).left,m(i).rite,m(i).father,m(i).digit,m(i).term)
(Skip,f(2),x(1),a,4(f(3)),f(2),f(3));
End;
End;
 
mk_node: Proc;
/*--------------------------------------------------------------------
* construct and store a new intermediate node or the top node
*-------------------------------------------------------------------*/
Dcl z Bin Fixed(15);
node='';
node.id=m_n+1; /* the next node id */
node.c='*';
ni=m_n+1;
loop:
Do i=1 To m_n; /* loop over node lines */
If m(i).father=0 Then Do; /* a fatherless node */
z=m(i).id; /* its id */
If node.left=0 Then Do; /* new node has no left child */
node.left=z; /* make this the lect child */
node.occ=m(i).occ; /* occurrences */
m(i).father=ni; /* store father info */
m(i).digit=0; /* digit 0 to be used */
father(z)=ni; /* remember z's father (redundant) */
End;
Else Do; /* New node has already left child */
node.rite=z; /* make this the right child */
node.occ=node.occ+m(i).occ; /* add in the occurrences */
m(i).father=ni; /* store father info */
m(i).digit=1; /* digit 1 to be used */
father(z)=ni; /* remember z's father (redundant) */
Leave loop;
End;
End;
End;
Call add_node;
End;
 
pairs: Proc Returns(Bit(1));
/*--------------------------------------------------------------------
* Return true if there are at least 2 fatherless nodes
*-------------------------------------------------------------------*/
Dcl i Bin Fixed(15);
Dcl cnt Bin Fixed(15) Init(0);
Do i=1 To m_n;
If m(i).father=0 Then Do;
cnt+=1;
If cnt>1 Then
Return('1'b);
End;
End;
Return('0'b);
End;
 
mk_trans: Proc;
/*--------------------------------------------------------------------
* Compute the codes for all terminal nodes (characters)
* and store the relation char -> code in array t(*)
*-------------------------------------------------------------------*/
Dcl (i,fi,fid,fidz,node,z) Bin Fixed(15);
Dcl code Char(20) Var;
Do i=1 To m_n; /* now we loop over all lines representing nodes */
If m(i).term Then Do; /* for each terminal node */
code=m(i).digit; /* its digit is the last code digit */
node=m(i).id; /* its id */
Do fi=1 To 1000; /* actually Forever */
fid=father(node); /* id of father */
If fid>0 Then Do; /* father exists */
fidz=zeile(fid); /* line that contains the father */
code=m(fidz).digit!!code; /* prepend the digit */
node=fid; /* look for next father */
End;
Else /* no father (we reached the top */
Leave;
End;
If length(code)>1 Then /* more than one character in input */
code=substr(code,2); /* remove the the top node's 0 */
call dbg(m(i).c!!' -> '!!code); /* character is encoded this way*/
ti_loop:
Do ti=1 To t_n;
If t(ti).char>m(i).c Then Do;
Do tj=t_n To ti By -1
t(tj+1)=t(tj);
End;
Leave ti_loop;
End;
End;
t(ti).char=m(i).c;
t(ti).code=code;
t_n+=1;
Call dbg(t(ti).char!!' -> '!!t(ti).code);
End;
End;
End;
 
zeile: Proc(nid) Returns(Bin Fixed(15));
/*--------------------------------------------------------------------
* find and return line number containing node-id
*-------------------------------------------------------------------*/
Dcl (nid,i) Bin Fixed(15);
do i=1 To m_n;
If m(i).id=nid Then
Return(i);
End;
Stop;
End;
 
dbg: Proc(txt);
/*--------------------------------------------------------------------
* Show text if debug is enabled
*-------------------------------------------------------------------*/
Dcl txt Char(*);
If debug Then
Put Skip List(txt);
End;
 
encode: Proc;
/*--------------------------------------------------------------------
* encode the string s -> sc
*-------------------------------------------------------------------*/
Dcl (i,j) Bin Fixed(15);
Do i=1 To length(s);
c=substr(s,i,1);
Do j=1 To t_n;
If c=t(j).char Then
Leave;
End;
sc=sc!!t(j).code;
End;
End;
 
decode: Proc;
/*--------------------------------------------------------------------
* decode the string sc -> sr
*-------------------------------------------------------------------*/
Dcl (i,j) Bin Fixed(15);
Do While(sc>'');
Do j=1 To t_n;
If substr(sc,1,length(t(j).code))=t(j).code Then
Leave;
End;
sr=sr!!t(j).char;
sc=substr(sc,length(t(j).code)+1);
End;
End;
 
End;</syntaxhighlight>
{{out}}
<pre>The list of nodes:
id c oc l r f d t
19 g 1 0 0 20 0 1
18 d 1 0 0 20 1 1
17 c 1 0 0 21 0 1
16 u 1 0 0 21 1 1
15 r 1 0 0 22 0 1
12 l 1 0 0 22 1 1
11 p 1 0 0 23 0 1
9 x 1 0 0 23 1 1
1 t 1 0 0 24 0 1
23 * 2 11 9 24 1 0
22 * 2 15 12 25 0 0
21 * 2 17 16 25 1 0
20 * 2 19 18 26 0 0
14 o 2 0 0 26 1 1
10 m 2 0 0 27 0 1
4 s 2 0 0 27 1 1
2 h 2 0 0 28 0 1
24 * 3 1 23 28 1 0
13 f 3 0 0 29 0 1
8 e 3 0 0 29 1 1
6 a 3 0 0 30 0 1
3 i 3 0 0 30 1 1
27 * 4 10 4 31 0 0
26 * 4 20 14 31 1 0
25 * 4 22 21 32 0 0
7 n 4 0 0 32 1 1
28 * 5 2 24 33 0 0
30 * 6 6 3 33 1 0
29 * 6 13 8 34 0 0
5 6 0 0 34 1 1
32 * 8 25 7 35 0 0
31 * 8 27 26 35 1 0
33 * 11 28 30 36 0 0
34 * 12 29 5 36 1 0
35 * 16 32 31 37 0 0
36 * 23 33 34 37 1 0
37 * 39 35 36 0 0 0
The translate table:
-> 111
a -> 1010
c -> 00010
d -> 01101
e -> 1101
f -> 1100
g -> 01100
h -> 1000
i -> 1011
l -> 00001
m -> 0100
n -> 001
o -> 0111
p -> 100110
r -> 00000
s -> 0101
t -> 10010
u -> 00011
x -> 100111
length(sc)=157
1001010001011010111110110101111101000111111011001111010010010011000001
1101111110001110000011110000001111001100010010100011111101001000100111
01101101100101100
input : this is an example for huffman encoding
result: this is an example for huffman encoding</pre>
 
=={{header|PowerShell}}==
{{works with|PowerShell|2}}
<syntaxhighlight lang="powershell">
function Get-HuffmanEncodingTable ( $String )
{
# Create leaf nodes
$ID = 0
$Nodes = [char[]]$String |
Group-Object |
ForEach { $ID++; $_ } |
Select @{ Label = 'Symbol' ; Expression = { $_.Name } },
@{ Label = 'Count' ; Expression = { $_.Count } },
@{ Label = 'ID' ; Expression = { $ID } },
@{ Label = 'Parent' ; Expression = { 0 } },
@{ Label = 'Code' ; Expression = { '' } }
# Grow stems under leafs
ForEach ( $Branch in 2..($Nodes.Count) )
{
# Get the two nodes with the lowest count
$LowNodes = $Nodes | Where Parent -eq 0 | Sort Count | Select -First 2
# Create a new stem node
$ID++
$Nodes += '' |
Select @{ Label = 'Symbol' ; Expression = { '' } },
@{ Label = 'Count' ; Expression = { $LowNodes[0].Count + $LowNodes[1].Count } },
@{ Label = 'ID' ; Expression = { $ID } },
@{ Label = 'Parent' ; Expression = { 0 } },
@{ Label = 'Code' ; Expression = { '' } }
# Put the two nodes in the new stem node
$LowNodes[0].Parent = $ID
$LowNodes[1].Parent = $ID
# Assign 0 and 1 to the left and right nodes
$LowNodes[0].Code = '0'
$LowNodes[1].Code = '1'
}
# Assign coding to nodes
ForEach ( $Node in $Nodes[($Nodes.Count-2)..0] )
{
$Node.Code = ( $Nodes | Where ID -eq $Node.Parent ).Code + $Node.Code
}
$EncodingTable = $Nodes | Where { $_.Symbol } | Select Symbol, Code | Sort Symbol
return $EncodingTable
}
# Get table for given string
$String = "this is an example for huffman encoding"
$HuffmanEncodingTable = Get-HuffmanEncodingTable $String
# Display table
$HuffmanEncodingTable | Format-Table -AutoSize
# Encode string
$EncodedString = $String
ForEach ( $Node in $HuffmanEncodingTable )
{
$EncodedString = $EncodedString.Replace( $Node.Symbol, $Node.Code )
}
$EncodedString
</syntaxhighlight>
{{out}}
<pre>
Symbol Code
------ ----
101
a 1100
c 01011
d 01100
e 1101
f 1110
g 01110
h 11111
i 1001
l 11110
m 0011
n 000
o 0100
p 10001
r 01111
s 0010
t 01010
u 01101
x 10000
 
 
0101011111100100101011001001010111000001011101100001100001110001111101101101111001000111110111111011011110111000111100000101110100001011010001100100100001110
</pre>
 
=={{header|Prolog}}==
Works with SWI-Prolog
<langsyntaxhighlight Prologlang="prolog">huffman :-
L = 'this is an example for huffman encoding',
atom_chars(L, LA),
Line 3,022 ⟶ 5,484:
N1 is N + 1.
run(V, [Other|RRest], [1,V], [Other|RRest]):-
dif(V, Other).</langsyntaxhighlight>
{{out}}
Output :
<pre> ?- huffman.
Symbol Weight Code
Line 3,049 ⟶ 5,511:
=={{header|PureBasic}}==
{{works with|PureBasic|4.50}}
<syntaxhighlight lang="purebasic">
<lang PureBasic>OpenConsole()
OpenConsole()
 
SampleString.s="this is an example for huffman encoding"
datalen=Len(SampleString)
 
Structure ztree
linked.c
Line 3,063 ⟶ 5,526:
EndStructure
 
Dim memc.c(0datalen)
memcCopyMemory()=@SampleString, @memc(0), datalen * SizeOf(Character))
 
Dim tree.ztree(255)
 
For i=0 To datalen-1
tree(memc(i))\char=memc(i)
Line 3,073 ⟶ 5,536:
tree(memc(i))\ischar=1
Next
 
SortStructuredArray(tree(),#PB_Sort_Descending,OffsetOf(ztree\number),#PB_Sort_CharacterPB_Integer)
 
For i=0 To 255
If tree(i)\number=0
Line 3,082 ⟶ 5,545:
EndIf
Next
 
dimsize=ArraySize(tree())
Repeat
Line 3,098 ⟶ 5,561:
EndIf
Next
If min1=0 Or min2=0
Break
EndIf
dimsize+1
ReDim tree(dimsize)
Line 3,111 ⟶ 5,574:
tree(hmin2)\linked=1
ForEver
 
i=0
While tree(i)\ischar=1
Line 3,130 ⟶ 5,593:
Wend
Input()
CloseConsole()
</syntaxhighlight>
 
{{out}}
CloseConsole()</lang>
 
output:
<pre> 110
n 000
Line 3,159 ⟶ 5,623:
The output is sorted first on length of the code, then on the symbols.
 
<langsyntaxhighlight lang="python">from heapq import heappush, heappop, heapify
from collections import defaultdict
 
Line 3,185 ⟶ 5,649:
print "Symbol\tWeight\tHuffman Code"
for p in huff:
print "%s\t%s\t%s" % (p[0], symb2freq[p[0]], p[1])</langsyntaxhighlight>
 
{{out}}
Example output:
<pre>Symbol Weight Huffman Code
6 101
Line 3,211 ⟶ 5,675:
An extension to the method outlined above is given [http://paddy3118.blogspot.com/2009/04/abuse-of-pythons-in-built-data.html here].
 
===Alternative===
 
This implementation creates an explicit tree structure, which is used during decoding. We also make use of a "pseudo end of file" symbol and padding bits to facilitate reading and writing encoded data to from/to a file.
=={{header|Racket}}==
<lang racket>
#lang racket
 
<syntaxhighlight lang="python">"""Huffman encoding and decoding. Requires Python >= 3.7."""
;; http://rosettacode.org/wiki/Huffman_coding
from __future__ import annotations
 
from collections import Counter
;; Building a huffman encoding tree.
(require data/heap)
 
from heapq import heapify
;; A node is either an interior, or a leaf.
from heapq import heappush
;; In either case, they record an item with an associated frequency.
from heapq import heappop
(struct interior (freq left right) #:transparent)
(struct leaf (val freq) #:transparent)
 
from itertools import chain
;; node-freq: node -> number
from itertools import islice
(define (node-freq a-node)
(cond [(interior? a-node) (interior-freq a-node)]
[(leaf? a-node) (leaf-freq a-node)]))
 
from typing import BinaryIO
from typing import Dict
from typing import Iterable
from typing import Optional
from typing import Tuple
 
 
LEFT_BIT = "0"
RIGHT_BIT = "1"
WORD_SIZE = 8 # Assumed to be a multiple of 8.
READ_SIZE = WORD_SIZE // 8
P_EOF = 1 << WORD_SIZE
 
 
class Node:
"""Huffman tree node."""
 
def __init__(
self,
weight: int,
symbol: Optional[int] = None,
left: Optional[Node] = None,
right: Optional[Node] = None,
):
self.weight = weight
self.symbol = symbol
self.left = left
self.right = right
 
def is_leaf(self) -> bool:
"""Return `True` if this node is a leaf node, or `False` otherwise."""
return self.left is None and self.right is None
 
def __lt__(self, other: Node) -> bool:
return self.weight < other.weight
 
 
def huffman_tree(weights: Dict[int, int]) -> Node:
"""Build a prefix tree from a map of symbol frequencies."""
heap = [Node(v, k) for k, v in weights.items()]
heapify(heap)
 
# Pseudo end-of-file with a weight of 1.
heappush(heap, Node(1, P_EOF))
 
while len(heap) > 1:
left, right = heappop(heap), heappop(heap)
node = Node(weight=left.weight + right.weight, left=left, right=right)
heappush(heap, node)
 
return heappop(heap)
 
 
def huffman_table(tree: Node) -> Dict[int, str]:
"""Build a table of prefix codes by visiting every leaf node in `tree`."""
codes: Dict[int, str] = {}
 
def walk(node: Optional[Node], code: str = ""):
if node is None:
return
 
if node.is_leaf():
assert node.symbol
codes[node.symbol] = code
return
 
walk(node.left, code + LEFT_BIT)
walk(node.right, code + RIGHT_BIT)
 
walk(tree)
return codes
 
 
def huffman_encode(data: bytes) -> Tuple[Iterable[bytes], Node]:
"""Encode the given byte string using Huffman coding.
 
Returns the encoded byte stream and the Huffman tree required to
decode those bytes.
"""
weights = Counter(data)
tree = huffman_tree(weights)
table = huffman_table(tree)
return _encode(data, table), tree
 
 
def huffman_decode(data: Iterable[bytes], tree: Node) -> bytes:
"""Decode the given byte stream using a Huffman tree."""
return bytes(_decode(_bits_from_bytes(data), tree))
 
 
def _encode(stream: Iterable[int], codes: Dict[int, str]) -> Iterable[bytes]:
bits = chain(chain.from_iterable(codes[s] for s in stream), codes[P_EOF])
 
# Pack bits (stream of 1s and 0s) one word at a time.
while True:
word = "".join(islice(bits, WORD_SIZE)) # Most significant bit first.
if not word:
break
 
# Pad last bits if they don't align to a whole word.
if len(word) < WORD_SIZE:
word = word.ljust(WORD_SIZE, "0")
 
# Byte order becomes relevant when READ_SIZE > 1.
yield int(word, 2).to_bytes(READ_SIZE, byteorder="big", signed=False)
 
 
def _decode(bits: Iterable[str], tree: Node) -> Iterable[int]:
node = tree
 
for bit in bits:
if bit == LEFT_BIT:
assert node.left
node = node.left
else:
assert node.right
node = node.right
 
if node.symbol == P_EOF:
break
 
if node.is_leaf():
assert node.symbol
yield node.symbol
node = tree # Back to the top of the tree.
 
 
def _word_to_bits(word: bytes) -> str:
"""Return the binary representation of a word given as a byte string,
including leading zeros up to WORD_SIZE.
 
For example, when WORD_SIZE is 8:
_word_to_bits(b'65') == '01000001'
"""
i = int.from_bytes(word, "big")
return bin(i)[2:].zfill(WORD_SIZE)
 
 
def _bits_from_file(file: BinaryIO) -> Iterable[str]:
"""Generate a stream of bits (strings of either "0" or "1") from file-like
object `file`, opened in binary mode."""
word = file.read(READ_SIZE)
while word:
yield from _word_to_bits(word)
word = file.read(READ_SIZE)
 
 
def _bits_from_bytes(stream: Iterable[bytes]) -> Iterable[str]:
"""Generate a stream of bits (strings of either "0" or "1") from an
iterable of single byte byte-strings."""
return chain.from_iterable(_word_to_bits(byte) for byte in stream)
 
 
def main():
"""Example usage."""
s = "this is an example for huffman encoding"
data = s.encode() # Need a byte string
encoded, tree = huffman_encode(data)
 
# Pretty print the Huffman table
print(f"Symbol Code\n------ ----")
for k, v in sorted(huffman_table(tree).items(), key=lambda x: len(x[1])):
print(f"{chr(k):<6} {v}")
 
# Print the bit pattern of the encoded data
print("".join(_bits_from_bytes(encoded)))
 
# Encode then decode
decoded = huffman_decode(*huffman_encode(data))
print(decoded.decode())
 
 
if __name__ == "__main__":
main()
</syntaxhighlight>
 
{{out}}
<pre>
Symbol Code
------ ----
n 000
110
m 0010
h 0101
i 1001
f 1010
e 1011
a 1110
r 00110
l 00111
c 01000
u 01001
x 01100
d 01101
t 01110
p 01111
Ā 10000
g 10001
o 11110
s 11111
011100101100111111110100111111110111000011010110110011100010011110011110111101010111100011011001010100110101010001011100001101011000010001111001101100100010001100000000
this is an example for huffman encoding
</pre>
 
=={{header|Quackery}}==
 
To use this code you will need to load the higher-order words defined at [[Higher-order functions#Quackery]] and the priority queue words defined at [[Priority queue#Quackery]].
 
The word <code>huffmantree</code> takes a string and generates a tree from it suitable for Huffman decoding. To decode a single character, start with the whole tree and either <code>0 peek</code> or <code>1 peek</code> according to the next bit in the compressed stream until you reach a number (ascii character code.)
 
The word <code>huffmanlist</code> will turn the Huffman tree into a nest of nests, each containing an ascii character code and a nest containing a Huffman code. The nests are sorted by ascii character code to facilitate binary splitting.
 
<syntaxhighlight lang="quackery"> [ 2dup peek 1+ unrot poke ] is itemincr ( [ n --> [ )
[ [ 0 128 of ] constant
swap witheach itemincr
' [ i^ join ] map
' [ 0 peek ] filter ] is countchars ( $ --> [ )
[ 0 peek dip [ 0 peek ] < ] is fewerchars ( [ [ --> b )
[ behead rot
behead rot + unrot
dip nested nested
join join ] is makenode ( [ [ --> [ )
[ [ dup pqsize 1 > while
frompq dip frompq
makenode topq again ]
frompq nip
0 pluck drop ] is maketree ( [ --> [ )
[ countchars
pqwith fewerchars
maketree ] is huffmantree ( $ --> [ )
[ stack ] is path.hfl ( --> s )
[ stack ] is list.hfl ( --> s )
forward is makelist ( [ --> )
[ dup size 1 = iff
[ 0 peek
path.hfl behead drop
nested join nested
list.hfl take
join
list.hfl put ] done
unpack
1 path.hfl put
makelist
0 path.hfl replace
makelist
path.hfl release ] resolves makelist ( [ --> )
[ sortwith
[ 0 peek swap 0 peek < ] ] is charsort ( [ --> [ )
[ [] list.hfl put
makelist
list.hfl take
charsort ] is huffmanlist ( [ --> [ )
[ sortwith
[ 1 peek size
swap 1 peek size < ] ] is codesort ( [ --> [ )
[ witheach
[ unpack swap
say ' "' emit
say '" ' echo cr ] ] is echohuff ( [ --> [ )
$ "this is an example for huffman encoding"
huffmantree
huffmanlist
say " Huffman codes sorted by character." cr
dup echohuff cr
say " Huffman codes sorted by code length." cr
codesort echohuff
</syntaxhighlight>
 
{{out}}
 
<pre> Huffman codes sorted by character.
" " [ 1 1 1 ]
"a" [ 1 0 0 1 ]
"c" [ 1 0 0 0 1 ]
"d" [ 0 0 1 1 0 ]
"e" [ 1 0 1 1 ]
"f" [ 1 1 0 1 ]
"g" [ 1 0 1 0 1 0 ]
"h" [ 0 0 0 1 ]
"i" [ 1 1 0 0 ]
"l" [ 1 0 0 0 0 ]
"m" [ 0 1 0 0 ]
"n" [ 0 1 1 ]
"o" [ 0 1 0 1 ]
"p" [ 1 0 1 0 1 1 ]
"r" [ 1 0 1 0 0 ]
"s" [ 0 0 0 0 ]
"t" [ 0 0 1 1 1 ]
"u" [ 0 0 1 0 0 ]
"x" [ 0 0 1 0 1 ]
 
Huffman codes sorted by code length.
" " [ 1 1 1 ]
"n" [ 0 1 1 ]
"a" [ 1 0 0 1 ]
"e" [ 1 0 1 1 ]
"f" [ 1 1 0 1 ]
"h" [ 0 0 0 1 ]
"i" [ 1 1 0 0 ]
"m" [ 0 1 0 0 ]
"o" [ 0 1 0 1 ]
"s" [ 0 0 0 0 ]
"c" [ 1 0 0 0 1 ]
"d" [ 0 0 1 1 0 ]
"l" [ 1 0 0 0 0 ]
"r" [ 1 0 1 0 0 ]
"t" [ 0 0 1 1 1 ]
"u" [ 0 0 1 0 0 ]
"x" [ 0 0 1 0 1 ]
"g" [ 1 0 1 0 1 0 ]
"p" [ 1 0 1 0 1 1 ]</pre>
 
=={{header|Racket}}==
<syntaxhighlight lang="racket">
#lang racket
(require data/heap
data/bit-vector)
;; A node is either an interior, or a leaf.
;; In either case, they record an item with an associated frequency.
(struct node (freq) #:transparent)
(struct interior node (left right) #:transparent)
(struct leaf node (val) #:transparent)
;; node<=?: node node -> boolean
;; Compares two nodes by frequency.
(define (node<=? x y)
(<= (node-freq x) (node-freq y)))
 
;; make-huffman-tree: (listof leaf) -> interior-node
(define (make-huffman-tree leaves)
(define a-heap (make-heap node<=?))
(heap-add-all! a-heap leaves)
(for ([i (in-range (sub1 (length leaves)))])
(define min-1 (heap-min a-heap))
(heap-remove-min! a-heap)
Line 3,247 ⟶ 6,046:
min-1 min-2)))
(heap-min a-heap))
 
;; huffman-tree->dictionary: node -> (hashof val string)
(define (huffman-tree->dictionary a-node)
(define ht (make-hash))
(let loop ([a-node a-node]
[path/rev '()])
(cond
[(interior? a-node)
(loop (interior-left a-node)
(cons #\0 path/rev))
(loop (interior-right a-node)
(cons #\1 path/rev))]
[(leaf? a-node)
(hash-set! ht (apply string (reverse path/rev)) (leaf-val a-node))]))
(for/hash ([(k v) ht])
(values v k)))
 
;; string->huffman-tree: string -> node
;; Given a string, produces its huffman tree. The leaves hold the characters
;; and their relative frequencies.
(define (string->huffman-tree str)
(define ht (make-hash))
(define n (sequence-length str))
(for ([ch str])
(hash-setupdate! ht ch (add1 (hash-refλ ht ch() 0))))
(make-huffman-tree
(for/list ([(k v) (in-hash ht)])
(leaf k (/ v n) k))))
 
;; make-encoder: node -> (string -> bit-vector)
 
;; Given a huffman tree, generates the encoder function.
;; make-encoder: node -> (string -> string)
(define (make-encoder a-tree)
(define dict (huffman-tree->dictionary a-tree))
(lambda (a-str)
(list->bit-vector (apply string-append (for/list ([ch a-str]) (hash-ref dict ch))))))
 
;; huffman-tree->dictionary: node -> (hashof val (listof boolean))
;; A helper for the encoder: maps characters to their code sequences.
(define (huffman-tree->dictionary a-node)
(define ht (make-hash))
(let loop ([a-node a-node]
[path/rev '()])
(cond
[(interior? a-node)
(loop (interior-left a-node) (cons #f path/rev))
(loop (interior-right a-node) (cons #t path/rev))]
[(leaf? a-node)
(hash-set! ht (reverse path/rev) (leaf-val a-node))]))
 
(for/hash ([(k v) ht])
;; make-decoder: node -> (string -> string)
(values v k)))
;; make-decoder: interior-node -> (bit-vector -> string)
;; Generates the decoder function from the tree.
(define (make-decoder a-tree)
(lambda (a-strbitvector)
(define-values (decoded/rev _)
(list->string
(let loopfor/fold ([charsdecoded/rev '(string->list a-str)]
[a-node a-tree])
(cond[bit a-bitvector])
[(leaf?define anext-node)
(cons (leaf-val a-node)cond
[(loopnot chars a-tree)bit)]
[ (empty?interior-left charsa-node) empty]
[(char=? (first chars) #\0)[else
(loop (restinterior-right charsa-node)]))
(cond [(interior-leftleaf? anext-node))]
(values (cons (leaf-val next-node) decoded/rev)
[else
(loop (rest chars a-tree)]
(interior-right a-node))])))))[else
(values decoded/rev next-node)])))
(apply string (reverse decoded/rev))))
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Example application:
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
(define msg "this is an example for huffman encoding")
(define tree (string->huffman-tree msg))
 
(define tree
(string->huffman-tree msg))
;; We can print out the mapping for inspection:
(huffman-tree->dictionary tree)
 
(define encode (make-encoder tree))
(define encoded (encode msg))
;; Here's what the encoded message looks like:
encoded
 
;; Here's what the encoded message looks like:
(bit-vector->string encoded)
(define decode (make-decoder tree))
;; Here's what the decoded message looks like:
(decode encoded)</syntaxhighlight>
</lang>
 
=={{header|Raku}}==
(formerly Perl 6)
 
===By building a tree===
 
This version uses nested <code>Array</code>s to build a tree [https://commons.wikimedia.org/wiki/File:HuffmanCodeAlg.png like shown in this diagram], and then recursively traverses the finished tree to accumulate the prefixes.
 
{{works with|rakudo|2015-12-17}}
<syntaxhighlight lang="raku" line>sub huffman (%frequencies) {
my @queue = %frequencies.map({ [.value, .key] }).sort;
while @queue > 1 {
given @queue.splice(0, 2) -> ([$freq1, $node1], [$freq2, $node2]) {
@queue = (|@queue, [$freq1 + $freq2, [$node1, $node2]]).sort;
}
}
hash gather walk @queue[0][1], '';
}
 
multi walk ($node, $prefix) { take $node => $prefix; }
multi walk ([$node1, $node2], $prefix) { walk $node1, $prefix ~ '0';
walk $node2, $prefix ~ '1'; }</syntaxhighlight>
 
===Without building a tree===
 
This version uses an <code>Array</code> of <code>Pair</code>s to implement a simple priority queue. Each value of the queue is a <code>Hash</code> mapping from letters to prefixes, and when the queue is reduced the hashes are merged on-the-fly, so that the last one remaining is the wanted Huffman table.
 
{{works with|rakudo|2015-12-17}}
<syntaxhighlight lang="raku" line>sub huffman (%frequencies) {
my @queue = %frequencies.map: { .value => (hash .key => '') };
while @queue > 1 {
@queue.=sort;
my $x = @queue.shift;
my $y = @queue.shift;
@queue.push: ($x.key + $y.key) => hash $x.value.deepmap('0' ~ *),
$y.value.deepmap('1' ~ *);
}
@queue[0].value;
}
 
# Testing
 
for huffman 'this is an example for huffman encoding'.comb.Bag {
say "'{.key}' : {.value}";
}
 
# To demonstrate that the table can do a round trip:
 
say '';
my $original = 'this is an example for huffman encoding';
 
my %encode-key = huffman $original.comb.Bag;
my %decode-key = %encode-key.invert;
my @codes = %decode-key.keys;
 
my $encoded = $original.subst: /./, { %encode-key{$_} }, :g;
my $decoded = $encoded .subst: /@codes/, { %decode-key{$_} }, :g;
 
.say for $original, $encoded, $decoded;</syntaxhighlight>
 
{{out}}
<pre>'x' : 11000
'p' : 01100
'h' : 0001
'g' : 00000
'a' : 1001
'e' : 1101
'd' : 110011
's' : 0111
'f' : 1110
'c' : 110010
'm' : 0010
' ' : 101
'n' : 010
'o' : 0011
'u' : 10001
't' : 10000
'i' : 1111
'r' : 01101
'l' : 00001
 
this is an example for huffman encoding
1000000011111011110111110111101100101010111011100010010010011000000111011011110001101101101000110001111011100010100101010111010101100100011110011111101000000
this is an example for huffman encoding</pre>
 
=={{header|Red}}==
<syntaxhighlight lang="red">Red [file: %huffy.red]
 
;; message to encode:
msg: "this is an example for huffman encoding"
 
;;map to collect leave knots per uniq character of message
m: make map! []
 
knot: make object! [
left: right: none ;; pointer to left/right sibling
code: none ;; first holds char for debugging, later binary code
count: depth: 1 ;;occurence of character - length of branch
]
 
;;-----------------------------------------
set-code: func ["recursive function to generate binary code sequence"
wknot
wcode [string!]] [
;;-----------------------------------------
either wknot/left = none [
wknot/code: wcode
] [
set-code wknot/left rejoin [wcode "1"]
set-code wknot/right rejoin [wcode "0"]
]
] ;;-- end func
 
;-------------------------------
merge-2knots: func ["function to merge 2 knots into 1 new"
t [block!]][
;-------------------------------
nknot: copy knot ;; create new knot
nknot/count: t/1/count + t/2/count
nknot/right: t/1
nknot/left: t/2
nknot/depth: t/1/depth + 1
tab: remove/part t 2 ;; delete first 2 knots
insert t nknot ;; insert new generated knot
] ;;-- end func
 
;; count occurence of characters, save in map: m
foreach chr msg [
either k: select/case m chr [
k/count: k/count + 1
][
put/case m chr nknot: copy knot
nknot/code: chr
]
]
 
;; create sortable block (=tab) for use as prio queue
foreach k keys-of m [ append tab: [] :m/:k ]
 
;; build tree
while [ 1 < length? tab][
sort/compare tab function [a b] [
a/count < b/count
or ( a/count = b/count and ( a/depth > b/depth ) )
]
merge-2knots tab ;; merge 2 knots with lowest count / max depth
]
 
set-code tab/1 "" ;; generate binary codes, save at leave knot
 
;; display codes
foreach k sort keys-of m [
print [k " = " m/:k/code]
append codes: "" m/:k/code
]
 
;; encode orig message string
foreach chr msg [
k: select/case m chr
append msg-new: "" k/code
]
 
print [ "length of encoded msg " length? msg-new]
print [ "length of (binary) codes " length? codes ]
 
print ["orig. message: " msg newline "encoded message: " "^/" msg-new]
prin "decoded: "
 
;; decode message (destructive! ):
while [ not empty? msg-new ][
foreach [k v] body-of m [
if t: find/match msg-new v/code [
prin k
msg-new: t
]
]
]
</syntaxhighlight>
{{out}}
<pre> = 111
a = 1101
c = 00101
d = 00100
e = 1011
f = 1100
g = 10010
h = 1000
i = 1010
l = 00000
m = 0001
n = 011
o = 0101
p = 00001
r = 00111
s = 0100
t = 100111
u = 100110
x = 00110
length of encoded msg 157
length of (binary) codes 85
orig. message: this is an example for huffman encoding
encoded message:
1001111000101001001111010010011111010111111011001101101000100001000001011111110001010011111110001001101100110000011101011111101101100101010100100101001110010
decoded: this is an example for huffman encoding
</pre>
 
=={{header|REXX}}==
<syntaxhighlight lang="rexx">/* REXX ---------------------------------------------------------------
* 27.12.2013 Walter Pachl
* 29.12.2013 -"- changed for test of s=xrange('00'x,'ff'x)
* 14.03.2018 -"- use format instead of right to diagnose size poblems
* Stem m contains eventually the following node data
* m.i.0id Node id
* m.i.0c character
* m.i.0o number of occurrences
* m.i.0l left child
* m.i.0r right child
* m.i.0f father
* m.i.0d digit (0 or 1)
* m.i.0t 1=a terminal node 0=an intermediate or the top node
*--------------------------------------------------------------------*/
Parse Arg s
If s='' Then
s='this is an example for huffman encoding'
Say 'We encode this string:'
Say s
debug=0
o.=0
c.=0
codel.=0
code.=''
father.=0
cl='' /* list of characters */
do i=1 To length(s)
Call memorize substr(s,i,1)
End
If debug Then Do
Do i=1 To c.0
c=c.i
Say i c o.c
End
End
n.=0
Do i=1 To c.0
c=c.i
n.i.0c=c
n.i.0o=o.c
n.i.0id=i
Call dbg i n.i.0id n.i.0c n.i.0o
End
n=c.0 /* number of nodes */
m.=0
Do i=1 To n /* construct initial array */
Do j=1 To m.0 /* sorted by occurrences */
If m.j.0o>n.i.0o Then
Leave
End
Do k=m.0 To j By -1
k1=k+1
m.k1.0id=m.k.0id
m.k1.0c =m.k.0c
m.k1.0o =m.k.0o
m.k1.0t =m.k.0t
End
m.j.0id=i
m.j.0c =n.i.0c
m.j.0o =n.i.0o
m.j.0t =1
m.0=m.0+1
End
If debug Then
Call show
 
Do While pairs()>1 /* while there are at least 2 fatherless nodes */
Call mknode /* create and fill a new father node */
If debug Then
Call show
End
 
Call show
c.=0
Do i=1 To m.0 /* now we loop over all lines representing nodes */
If m.i.0t Then Do /* for each terminal node */
code=m.i.0d /* its digit is the last code digit */
node=m.i.0id /* its id */
Do fi=1 To 1000 /* actually Forever */
fid=father.node /* id of father */
If fid<>0 Then Do /* father exists */
fidz=zeile(fid) /* line that contains the father */
code=m.fidz.0d||code /* prepend the digit */
node=fid /* look for next father */
End
Else /* no father (we reached the top */
Leave
End
If length(code)>1 Then /* more than one character in input */
code=substr(code,2) /* remove the the top node's 0 */
call dbg m.i.0c '->' code /* character is encoded this way */
char=m.i.0c
code.char=code
z=codel.0+1
codel.z=code
codel.0=z
char.code=char
End
End
 
Call show_char2code /* show used characters and corresponding codes */
 
codes.=0 /* now we build the array of codes/characters */
Do j=1 To codel.0
z=codes.0+1
code=codel.j
codes.z=code
chars.z=char.code
codes.0=z
Call dbg codes.z '----->' chars.z
End
 
sc='' /* here we ecnode the string */
Do i=1 To length(s) /* loop over input */
c=substr(s,i,1) /* a character */
sc=sc||code.c /* append the corresponding code */
End
Say 'Length of encoded string:' length(sc)
Do i=1 To length(sc) by 70
Say substr(sc,i,70)
End
 
sr='' /* now decode the string */
Do si=1 To 999 While sc<>''
Do i=codes.0 To 1 By -1 /* loop over codes */
cl=length(codes.i) /* length of code */
If left(sc,cl)==codes.i Then Do /* found on top of string */
sr=sr||chars.i /* append character to result */
sc=substr(sc,cl+1) /* cut off the used code */
Leave /* this was one character */
End
End
End
Say 'Input ="'s'"'
Say 'result="'sr'"'
 
Exit
 
show:
/*---------------------------------------------------------------------
* show all lines representing node data
*--------------------------------------------------------------------*/
Say ' i pp id c f l r d'
Do i=1 To m.0
Say format(i,3) format(m.i.0o,4) format(m.i.0id,3),
format(m.i.0f,3) format(m.i.0l,3) format(m.i.0r,3) m.i.0d m.i.0t
End
Call dbg copies('-',21)
Return
 
pairs: Procedure Expose m.
/*---------------------------------------------------------------------
* return number of fatherless nodes
*--------------------------------------------------------------------*/
res=0
Do i=1 To m.0
If m.i.0f=0 Then
res=res+1
End
Return res
 
mknode:
/*---------------------------------------------------------------------
* construct and store a new intermediate or the top node
*--------------------------------------------------------------------*/
new.=0
ni=m.0+1 /* the next node id */
Do i=1 To m.0 /* loop over node lines */
If m.i.0f=0 Then Do /* a fatherless node */
z=m.i.0id /* its id */
If new.0l=0 Then Do /* new node has no left child */
new.0l=z /* make this the lect child */
new.0o=m.i.0o /* occurrences */
m.i.0f=ni /* store father info */
m.i.0d='0' /* digit 0 to be used */
father.z=ni /* remember z's father (redundant) */
End
Else Do /* New node has already left child */
new.0r=z /* make this the right child */
new.0o=new.0o+m.i.0o /* add in the occurrences */
m.i.0f=ni /* store father info */
m.i.0d=1 /* digit 1 to be used */
father.z=ni /* remember z's father (redundant) */
Leave
End
End
End
Do i=1 To m.0 /* Insert new node according to occurrences */
If m.i.0o>=new.0o Then Do
Do k=m.0 To i By -1
k1=k+1
m.k1.0id=m.k.0id
m.k1.0o =m.k.0o
m.k1.0c =m.k.0c
m.k1.0l =m.k.0l
m.k1.0r =m.k.0r
m.k1.0f =m.k.0f
m.k1.0d =m.k.0d
m.k1.0t =m.k.0t
End
Leave
End
End
m.i.0id=ni
m.i.0c ='*'
m.i.0o =new.0o
m.i.0l =new.0l
m.i.0r =new.0r
m.i.0t =0
father.ni=0
m.0=ni
Return
 
zeile:
/*---------------------------------------------------------------------
* find and return line number containing node-id
*--------------------------------------------------------------------*/
do fidz=1 To m.0
If m.fidz.0id=arg(1) Then
Return fidz
End
Call dbg arg(1) 'not found'
Pull .
 
dbg:
/*---------------------------------------------------------------------
* Show text if debug is enabled
*--------------------------------------------------------------------*/
If debug=1 Then
Say arg(1)
Return
 
 
memorize: Procedure Expose c. o.
/*---------------------------------------------------------------------
* store characters and corresponding occurrences
*--------------------------------------------------------------------*/
Parse Arg c
If o.c=0 Then Do
z=c.0+1
c.z=c
c.0=z
End
o.c=o.c+1
Return
 
show_char2code:
/*---------------------------------------------------------------------
* show used characters and corresponding codes
*--------------------------------------------------------------------*/
cl=xrange('00'x,'ff'x)
Say 'char --> code'
Do While cl<>''
Parse Var cl c +1 cl
If code.c<>'' Then
Say ' 'c '-->' code.c
End
Return</syntaxhighlight>
{{out}}
<pre>We encode this string:
this is an example for huffman encoding
i pp id c f l r d
1 1 1 20 0 0 0 1
2 1 9 20 0 0 1 1
3 1 11 21 0 0 0 1
4 1 12 21 0 0 1 1
5 1 15 22 0 0 0 1
6 1 16 22 0 0 1 1
7 1 17 23 0 0 0 1
8 1 18 23 0 0 1 1
9 1 19 24 0 0 0 1
10 2 23 24 17 18 1 0
11 2 22 25 15 16 0 0
12 2 21 25 11 12 1 0
13 2 20 26 1 9 0 0
14 2 2 26 0 0 1 1
15 2 4 27 0 0 0 1
16 2 10 27 0 0 1 1
17 2 14 28 0 0 0 1
18 3 24 28 19 23 1 0
19 3 3 29 0 0 0 1
20 3 6 29 0 0 1 1
21 3 8 30 0 0 0 1
22 3 13 30 0 0 1 1
23 4 27 31 4 10 0 0
24 4 26 31 20 2 1 0
25 4 25 32 22 21 0 0
26 4 7 32 0 0 1 1
27 5 28 33 14 24 0 0
28 6 30 33 8 13 1 0
29 6 29 34 3 6 0 0
30 6 5 34 0 0 1 1
31 8 32 35 25 7 0 0
32 8 31 35 27 26 1 0
33 11 33 36 28 30 0 0
34 12 34 36 29 5 1 0
35 16 35 37 32 31 0 0
36 23 36 37 33 34 1 0
37 39 37 0 35 36 0 0
char --> code
--> 111
a --> 1101
c --> 100110
d --> 100111
e --> 1010
f --> 1011
g --> 10010
h --> 0111
i --> 1100
l --> 00011
m --> 0101
n --> 001
o --> 1000
p --> 00010
r --> 00000
s --> 0100
t --> 01100
u --> 00001
x --> 01101
Length of encoded string: 157
0110001111100010011111000100111110100111110100110111010101000100001110
1011110111000000001110111000011011101101011101001111101000110011010001
00111110000110010
Input ="this is an example for huffman encoding"
result="this is an example for huffman encoding"</pre>
 
=={{header|Ruby}}==
Uses a {{libheader|RubyGems}} package [http://ruby.brian-amberg.de/priority-queue/ PriorityQueue]
<langsyntaxhighlight lang="ruby">require 'priority_queue'
 
def huffman_encoding(str)
Line 3,374 ⟶ 6,711:
enc = encode(str, encoding)
dec = decode(enc, encoding)
puts "success!" if str == dec</langsyntaxhighlight>
<pre>[" ", "111"]
["a", "1011"]
Line 3,395 ⟶ 6,732:
["x", "10100"]
success!
</pre>
 
=={{header|Rust}}==
 
Adapted C++ solution.
 
<syntaxhighlight lang="rust">
use std::collections::BTreeMap;
use std::collections::binary_heap::BinaryHeap;
 
#[derive(Debug, Eq, PartialEq)]
enum NodeKind {
Internal(Box<Node>, Box<Node>),
Leaf(char),
}
 
#[derive(Debug, Eq, PartialEq)]
struct Node {
frequency: usize,
kind: NodeKind,
}
 
impl Ord for Node {
fn cmp(&self, rhs: &Self) -> std::cmp::Ordering {
rhs.frequency.cmp(&self.frequency)
}
}
 
impl PartialOrd for Node {
fn partial_cmp(&self, rhs: &Self) -> Option<std::cmp::Ordering> {
Some(self.cmp(&rhs))
}
}
 
type HuffmanCodeMap = BTreeMap<char, Vec<u8>>;
 
fn main() {
let text = "this is an example for huffman encoding";
 
let mut frequencies = BTreeMap::new();
for ch in text.chars() {
*frequencies.entry(ch).or_insert(0) += 1;
}
 
let mut prioritized_frequencies = BinaryHeap::new();
for counted_char in frequencies {
prioritized_frequencies.push(Node {
frequency: counted_char.1,
kind: NodeKind::Leaf(counted_char.0),
});
}
 
while prioritized_frequencies.len() > 1 {
let left_child = prioritized_frequencies.pop().unwrap();
let right_child = prioritized_frequencies.pop().unwrap();
prioritized_frequencies.push(Node {
frequency: right_child.frequency + left_child.frequency,
kind: NodeKind::Internal(Box::new(left_child), Box::new(right_child)),
});
}
 
let mut codes = HuffmanCodeMap::new();
generate_codes(
prioritized_frequencies.peek().unwrap(),
vec![0u8; 0],
&mut codes,
);
 
for item in codes {
print!("{}: ", item.0);
for bit in item.1 {
print!("{}", bit);
}
println!();
}
}
 
fn generate_codes(node: &Node, prefix: Vec<u8>, out_codes: &mut HuffmanCodeMap) {
match node.kind {
NodeKind::Internal(ref left_child, ref right_child) => {
let mut left_prefix = prefix.clone();
left_prefix.push(0);
generate_codes(&left_child, left_prefix, out_codes);
 
let mut right_prefix = prefix;
right_prefix.push(1);
generate_codes(&right_child, right_prefix, out_codes);
}
NodeKind::Leaf(ch) => {
out_codes.insert(ch, prefix);
}
}
}
</syntaxhighlight>
 
Output:
<pre>
: 110
a: 1001
c: 101010
d: 10001
e: 1111
f: 1011
g: 101011
h: 0101
i: 1110
l: 01110
m: 0011
n: 000
o: 0010
p: 01000
r: 01001
s: 0110
t: 01111
u: 10100
x: 10000
</pre>
 
=={{header|Scala}}==
{{works with|scala|2.8}}
<langsyntaxhighlight lang="scala">object Huffman {
import scala.collection.mutable.{Map, PriorityQueue}
Line 3,447 ⟶ 6,900:
}
}
}</langsyntaxhighlight>
 
Output:
 
{{out}}
<pre>
Char Weight Encoding
Line 3,472 ⟶ 6,924:
n: 4/39 100
: 6/39 010
</pre>
==={{header|Scala (Alternate version)}}===
{{works with|scala|2.11.7}}
<syntaxhighlight lang="scala">
// this version uses immutable data only, recursive functions and pattern matching
object Huffman {
sealed trait Tree[+A]
case class Leaf[A](value: A) extends Tree[A]
case class Branch[A](left: Tree[A], right: Tree[A]) extends Tree[A]
 
// recursively build the binary tree needed to Huffman encode the text
def merge(xs: List[(Tree[Char], Int)]): List[(Tree[Char], Int)] = {
if (xs.length == 1) xs else {
val l = xs.head
val r = xs.tail.head
val merged = (Branch(l._1, r._1), l._2 + r._2)
merge((merged :: xs.drop(2)).sortBy(_._2))
}
}
 
// recursively search the branches of the tree for the required character
def contains(tree: Tree[Char], char: Char): Boolean = tree match {
case Leaf(c) => if (c == char) true else false
case Branch(l, r) => contains(l, char) || contains(r, char)
}
 
// recursively build the path string required to traverse the tree to the required character
def encodeChar(tree: Tree[Char], char: Char): String = {
def go(tree: Tree[Char], char: Char, code: String): String = tree match {
case Leaf(_) => code
case Branch(l, r) => if (contains(l, char)) go(l, char, code + '0') else go(r, char, code + '1')
}
go(tree, char, "")
}
 
def main(args: Array[String]) {
val text = "this is an example for huffman encoding"
// transform the text into a list of tuples.
// each tuple contains a Leaf node containing a unique character and an Int representing that character's weight
val frequencies = text.groupBy(chars => chars).mapValues(group => group.length).toList.map(x => (Leaf(x._1), x._2)).sortBy(_._2)
// build the Huffman Tree for this text
val huffmanTree = merge(frequencies).head._1
// output the resulting character codes
println("Char\tWeight\tCode")
frequencies.foreach(x => println(x._1.value + "\t" + x._2 + s"/${text.length}" + s"\t${encodeChar(huffmanTree, x._1.value)}"))
}
}</syntaxhighlight>
<pre>
Char Weight Code
x 1/39 01100
t 1/39 01101
u 1/39 00010
g 1/39 00011
l 1/39 00000
p 1/39 00001
c 1/39 100110
r 1/39 100111
d 1/39 10010
s 2/39 0111
m 2/39 0100
h 2/39 0101
o 2/39 1000
e 3/39 1100
f 3/39 1101
a 3/39 1010
i 3/39 1011
n 4/39 001
6/39 111
</pre>
 
=={{header|Scheme}}==
 
<syntaxhighlight lang="scheme">(define (char-freq port table)
(if
(eof-object? (peek-char port))
table
(char-freq port (add-char (read-char port) table))))
 
(define (add-char char table)
(cond
((null? table) (list (list char 1)))
((eq? (caar table) char) (cons (list char (+ (cadar table) 1)) (cdr table)))
(#t (cons (car table) (add-char char (cdr table))))))
 
(define (nodeify table)
(map (lambda (x) (list x '() '())) table))
 
(define node-freq cadar)
 
(define (huffman-tree nodes)
(let ((queue (sort nodes (lambda (x y) (< (node-freq x) (node-freq y))))))
(if
(null? (cdr queue))
(car queue)
(huffman-tree
(cons
(list
(list 'notleaf (+ (node-freq (car queue)) (node-freq (cadr queue))))
(car queue)
(cadr queue))
(cddr queue))))))
 
(define (list-encodings tree chars)
(for-each (lambda (c) (format #t "~a:~a~%" c (encode c tree))) chars))
 
(define (encode char tree)
(cond
((null? tree) #f)
((eq? (caar tree) char) '())
(#t
(let ((left (encode char (cadr tree))) (right (encode char (caddr tree))))
(cond
((not (or left right)) #f)
(left (cons #\1 left))
(right (cons #\0 right)))))))
 
(define (decode digits tree)
(cond
((not (eq? (caar tree) 'notleaf)) (caar tree))
((eq? (car digits) #\0) (decode (cdr digits) (cadr tree)))
(#t (decode (cdr digits) (caddr tree)))))
 
(define input "this is an example for huffman encoding")
(define freq-table (char-freq (open-input-string input) '()))
(define tree (huffman-tree (nodeify freq-table)))
(list-encodings tree (map car freq-table))</syntaxhighlight>
 
{{out}}
<pre>
t:(1 0 0 1 1)
h:(1 0 0 0)
i:(0 0 1 1)
s:(1 0 1 1)
:(0 0 0)
a:(0 0 1 0)
n:(1 1 0)
e:(0 1 0 1)
x:(1 0 0 1 0)
m:(1 0 1 0)
p:(1 1 1 0 1)
l:(1 1 1 0 0)
f:(0 1 0 0)
o:(0 1 1 1)
r:(1 1 1 1 1)
u:(1 1 1 1 0)
c:(0 1 1 0 0 1)
d:(0 1 1 0 0 0)
g:(0 1 1 0 1)
</pre>
 
=={{header|SETL}}==
<langsyntaxhighlight SETLlang="setl">var forest := {}, encTab := {};
 
plaintext := 'this is an example for huffman encoding';
Line 3,512 ⟶ 7,112:
forest less:= [f, n];
return [f, n];
end proc;</langsyntaxhighlight>
 
=={{header|Sidef}}==
<syntaxhighlight lang="ruby">func walk(n, s, h) {
if (n.contains(:a)) {
h{n{:a}} = s
say "#{n{:a}}: #{s}"
return nil
}
walk(n{:0}, s+'0', h)
walk(n{:1}, s+'1', h)
}
 
func make_tree(text) {
var letters = Hash()
text.each { |c| letters{c} := 0 ++ }
var nodes = letters.keys.map { |l|
Hash(a => l, freq => letters{l})
}
 
var n = Hash()
while (nodes.sort_by!{|c| c{:freq} }.len > 1) {
n = Hash(:0 => nodes.shift, :1 => nodes.shift)
n{:freq} = (n{:0}{:freq} + n{:1}{:freq})
nodes.append(n)
}
 
walk(n, "", n{:tree} = Hash())
return n
}
 
func encode(s, t) {
t = t{:tree}
s.chars.map{|c| t{c} }.join
}
 
func decode (enc, tree) {
var n = tree
var out = ""
 
enc.each {|bit|
n = n{bit}
if (n.contains(:a)) {
out += n{:a}
n = tree
}
}
 
return out
}
 
var text = "this is an example for huffman encoding"
var tree = make_tree(text)
var enc = encode(text, tree)
 
say enc
say decode(enc, tree)</syntaxhighlight>
{{out}}
<pre>n: 000
s: 0010
o: 0011
h: 0100
l: 01010
g: 01011
x: 01100
c: 01101
d: 01110
u: 01111
p: 10000
t: 10001
i: 1001
: 101
f: 1100
a: 1101
e: 1110
r: 11110
m: 11111
1000101001001001010110010010101110100010111100110011011111110000010101110101110000111111010101000111111001100111111101000101111000001101001101110100100001011
this is an example for huffman encoding</pre>
 
=={{header|Standard ML}}==
{{works with|SML/NJ}}
<langsyntaxhighlight lang="sml">datatype 'a huffman_tree =
Leaf of 'a
| Node of 'a huffman_tree * 'a huffman_tree
Line 3,572 ⟶ 7,250:
print "SYMBOL\tHUFFMAN CODE\n";
printCodes ([], tree)
end</langsyntaxhighlight>
 
=={{header|Swift}}==
Rather than a priority queue of subtrees, we use the strategy of two sorted lists, one for leaves and one for nodes, and "merge" them as we iterate through them, taking advantage of the fact that any new nodes we create are bigger than any previously created nodes, so go at the end of the nodes list.
{{works with|Swift|2+}}
<syntaxhighlight lang="swift">enum HuffmanTree<T> {
case Leaf(T)
indirect case Node(HuffmanTree<T>, HuffmanTree<T>)
func printCodes(prefix: String) {
switch(self) {
case let .Leaf(c):
print("\(c)\t\(prefix)")
case let .Node(l, r):
l.printCodes(prefix + "0")
r.printCodes(prefix + "1")
}
}
}
 
func buildTree<T>(freqs: [(T, Int)]) -> HuffmanTree<T> {
assert(freqs.count > 0, "must contain at least one character")
// leaves sorted by increasing frequency
let leaves : [(Int, HuffmanTree<T>)] = freqs.sort { (p1, p2) in p1.1 < p2.1 }.map { (x, w) in (w, .Leaf(x)) }
// nodes sorted by increasing frequency
var nodes = [(Int, HuffmanTree<T>)]()
// iterate through leaves and nodes in order of increasing frequency
for var i = 0, j = 0; ; {
assert(i < leaves.count || j < nodes.count)
// get subtree of least frequency
var e1 : (Int, HuffmanTree<T>)
if j == nodes.count || i < leaves.count && leaves[i].0 < nodes[j].0 {
e1 = leaves[i]
i++
} else {
e1 = nodes[j]
j++
}
// if there's no subtrees left, then that one was the answer
if i == leaves.count && j == nodes.count {
return e1.1
}
// get next subtree of least frequency
var e2 : (Int, HuffmanTree<T>)
if j == nodes.count || i < leaves.count && leaves[i].0 < nodes[j].0 {
e2 = leaves[i]
i++
} else {
e2 = nodes[j]
j++
}
// create node from two subtrees
nodes.append((e1.0 + e2.0, .Node(e1.1, e2.1)))
}
}
 
func getFreqs<S : SequenceType where S.Generator.Element : Hashable>(seq: S) -> [(S.Generator.Element, Int)] {
var freqs : [S.Generator.Element : Int] = [:]
for c in seq {
freqs[c] = (freqs[c] ?? 0) + 1
}
return Array(freqs)
}
 
let str = "this is an example for huffman encoding"
let charFreqs = getFreqs(str.characters)
let tree = buildTree(charFreqs)
print("Symbol\tHuffman code")
tree.printCodes("")</syntaxhighlight>
{{out}}
<pre>
Symbol Huffman code
u 00000
t 00001
d 00010
r 00011
c 00100
l 00101
o 0011
m 0100
s 0101
n 011
h 1000
g 10010
p 100110
x 100111
f 1010
a 1011
i 1100
e 1101
111
</pre>
 
=={{header|Tcl}}==
{{tcllib|struct::prioqueue}}
<langsyntaxhighlight lang="tcl">package require Tcl 8.5
package require struct::prioqueue
 
Line 3,598 ⟶ 7,369:
set encoding [walkTree [$pq get]]
set map [dict create {*}[lreverse $encoding]]
if {$opts(-dump)} {
foreach key{char huffCode} [lsort -commandindex compare1 [dict-stride keys2 -command compare $map]encoding] {
setputs "$char \t[dict get $mapcharcount $keychar]\t$huffCode"
puts "$char\t[dict get $charcount $char]\t$key"
}
}
$pq destroy
return $encoding
Line 3,629 ⟶ 7,399:
 
puts $str
puts [string map $encoding $str]</langsyntaxhighlight>
{{out}}
outputs
<pre style='width:full; overflow:scroll'>n 4 000
6 101
Line 3,652 ⟶ 7,422:
this is an example for huffman encoding
0101011111100100101011001001010111000001011101010111100001101100011011101101111001000111010111111011111110111000111100000101110100010000010010001100100011110</pre>
 
=={{header|UNIX Shell}}==
{{works with|Bourne Again SHell}}
<syntaxhighlight lang="bash">
#!/bin/bash
 
set -eu
 
# make scratch directory
t="$(mktemp -d)"
cd "${t:?mktemp failed}"
trap 'rm -r -- "$t"' EXIT
 
# get character frequencies
declare -a freq=()
while read addr line; do
for c in $line; do
: $((freq[8#$c]++))
done
done < <(od -b -v)
 
# convert freqs into a bucket queue
declare -i i=0
for c in ${!freq[@]}; do
fn="${freq[c]}.$((i++))"
echo "$c:${freq[c]}" >"$fn"
done
 
top2() { ls | sort -t. -k1,1n -k2,2n | sed 2q; }
set -- $(top2)
while [[ $# -gt 1 ]]; do
declare -i l="${1%%.*}" r="${2%%.*}" # combine weights into
fn="$((l + r)).$((i++))" # ... new node weight
mkdir "$fn"
mv "$1" "$fn/0"
mv "$2" "$fn/1"
set -- $(top2)
done
 
echo -e "Symbol\tWeight\tHuffman Code"
cd "$fn"
find . -type f -exec grep . {} + |
tr -d ./ |
awk -F: '{printf "%c\t%d\t%s\n", $2, $3, $1}' |
sort -k 2,2nr -k 3,3n
</syntaxhighlight>
 
=={{header|Ursala}}==
following the algorithm given above
<langsyntaxhighlight Ursalalang="ursala">#import std
#import nat
#import flo
Line 3,668 ⟶ 7,484:
#cast %csAL
 
table = code_table 'this is an example for huffman encoding'</langsyntaxhighlight>
a quick walk through the code starting from the bottom:
* <code>*K2 ^/~&h float+ length</code> compute character frequencies by partitioning the input list of characters by equality, and transforming each equivalence class to a pair containing its member and its cardinality represented as a floating point number
Line 3,680 ⟶ 7,496:
* <code>~&plrDSLrnPlrmPCAS/'01'</code> otherwise there are two subtrees, hence two lists previously computed results propagating upward, so insert a zero into all of the bit strings in the results on the left, and a one into all the ones on the right, concatenate the left and right results, and propagate the contatenation upward
 
{{out}}
output:
<pre><
`r: '00000',
Line 3,701 ⟶ 7,517:
`e: '1101',
` : '111'></pre>
 
=={{header|Wren}}==
{{trans|Java}}
{{libheader|Wren-queue}}
Note that the results differ from Java because the PriorityQueue implementation is not the same.
<syntaxhighlight lang="wren">import "./queue" for PriorityQueue
 
class HuffmanTree {
construct new(freq) {
_freq = freq
}
freq { _freq }
compareTo(tree) { _freq - tree.freq }
}
 
class HuffmanLeaf is HuffmanTree {
construct new (freq, val) {
super(freq)
_val = val
}
val { _val }
}
 
class HuffmanNode is HuffmanTree {
construct new(l, r) {
super(l.freq + r.freq)
_left = l
_right = r
}
left { _left }
right { _right }
}
 
var buildTree = Fn.new { |charFreqs|
var trees = PriorityQueue.new()
var index = 0
for (freq in charFreqs) {
if (freq > 0) trees.push(HuffmanLeaf.new(freq, String.fromByte(index)), -freq)
index = index + 1
}
if (trees.count == 0) Fiber.abort("Something went wrong!")
while (trees.count > 1) {
var a = trees.pop()
var b = trees.pop()
var h = HuffmanNode.new(a[0], b[0])
trees.push(h, -h.freq)
}
return trees.pop()[0]
}
 
var printCodes // recursive
printCodes = Fn.new { |tree, prefix|
if (tree is HuffmanLeaf) {
System.print("%(tree.val)\t%(tree.freq)\t%(prefix)")
} else if (tree is HuffmanNode) {
// traverse left
prefix = prefix + "0"
printCodes.call(tree.left, prefix)
prefix = prefix[0...-1]
// traverse right
prefix = prefix + "1"
printCodes.call(tree.right, prefix)
prefix = prefix[0...-1]
}
}
 
var test = "this is an example for huffman encoding"
 
var freqs = List.filled(256, 0)
for (c in test) {
var ix = c.bytes[0]
freqs[ix] = freqs[ix] + 1
}
 
var tree = buildTree.call(freqs)
System.print("SYMBOL\tWEIGHT\tHUFFMAN CODE")
printCodes.call(tree, "")</syntaxhighlight>
 
{{out}}
<pre>
SYMBOL WEIGHT HUFFMAN CODE
n 4 000
m 2 0010
o 2 0011
s 2 0100
c 1 01010
d 1 01011
g 1 01100
l 1 01101
p 1 01110
r 1 01111
t 1 10000
u 1 10001
a 3 1001
6 101
e 3 1100
f 3 1101
i 3 1110
x 1 11110
h 2 11111
</pre>
 
=={{header|Zig}}==
 
{{trans|Rust}}
 
<syntaxhighlight lang="zig">
const std = @import("std");
 
const Node = struct {
frequency: usize,
kind: union(enum) {
internal: struct {
left: *Node,
right: *Node,
},
leaf: u8,
},
 
fn initLeaf(frequency: usize, ch: u8) Node {
return .{
.frequency = frequency,
.kind = .{ .leaf = ch },
};
}
 
fn initInternal(
allocator: std.mem.Allocator,
left_child: Node,
right_child: Node,
) !Node {
const left = try allocator.create(Node);
const right = try allocator.create(Node);
left.* = left_child;
right.* = right_child;
return .{
.frequency = left_child.frequency + right_child.frequency,
.kind = .{ .internal = .{ .left = left, .right = right } },
};
}
 
fn deinit(self: Node, allocator: std.mem.Allocator) void {
switch (self.kind) {
.internal => |inner| {
inner.left.deinit(allocator);
inner.right.deinit(allocator);
allocator.destroy(inner.left);
allocator.destroy(inner.right);
},
.leaf => {},
}
}
 
fn compare(context: void, a: Node, b: Node) std.math.Order {
_ = context;
return std.math.order(a.frequency, b.frequency);
}
};
 
pub fn main() !void {
const text = "this is an example for huffman encoding";
 
var gpa = std.heap.GeneralPurposeAllocator(.{}){};
defer std.debug.assert(gpa.deinit() == .ok);
 
const allocator = gpa.allocator();
var frequencies = std.AutoHashMap(u8, usize).init(allocator);
defer frequencies.deinit();
 
for (text) |ch| {
const gop = try frequencies.getOrPut(ch);
if (gop.found_existing) {
gop.value_ptr.* += 1;
} else {
gop.value_ptr.* = 1;
}
}
 
var prioritized_frequencies =
std.PriorityQueue(Node, void, Node.compare).init(allocator, {});
defer prioritized_frequencies.deinit();
 
var freq_it = frequencies.iterator();
while (freq_it.next()) |counted_char| {
try prioritized_frequencies.add(Node.initLeaf(
counted_char.value_ptr.*,
counted_char.key_ptr.*,
));
}
 
while (prioritized_frequencies.len > 1) {
try prioritized_frequencies.add(try Node.initInternal(
allocator,
prioritized_frequencies.remove(),
prioritized_frequencies.remove(),
));
}
 
const root = prioritized_frequencies.items[0];
defer root.deinit(allocator);
 
var codes = std.AutoArrayHashMap(u8, []const u8).init(allocator);
defer codes.deinit();
 
var arena = std.heap.ArenaAllocator.init(allocator);
defer arena.deinit();
 
try generateCodes(arena.allocator(), &root, &.{}, &codes);
 
const stdout = std.io.getStdOut().writer();
var code_it = codes.iterator();
while (code_it.next()) |item| {
try stdout.print("{c}: {s}\n", .{
item.key_ptr.*,
item.value_ptr.*,
});
}
}
 
fn generateCodes(
arena: std.mem.Allocator,
node: *const Node,
prefix: []const u8,
out_codes: *std.AutoArrayHashMap(u8, []const u8),
) !void {
switch (node.kind) {
.internal => |inner| {
const left_prefix = try arena.alloc(u8, prefix.len + 1);
std.mem.copy(u8, left_prefix, prefix);
left_prefix[prefix.len] = '0';
try generateCodes(arena, inner.left, left_prefix, out_codes);
 
const right_prefix = try arena.alloc(u8, prefix.len + 1);
std.mem.copy(u8, right_prefix, prefix);
right_prefix[prefix.len] = '1';
try generateCodes(arena, inner.right, right_prefix, out_codes);
},
.leaf => |ch| {
try out_codes.put(ch, prefix);
},
}
}
</syntaxhighlight>
{{out}}
<pre>
n: 000
m: 0010
d: 00110
t: 00111
o: 0100
h: 0101
x: 01100
r: 01101
c: 01110
u: 01111
s: 1000
g: 10010
p: 100110
l: 100111
a: 1010
i: 1011
: 110
e: 1110
f: 1111
</pre>
 
=={{header|zkl}}==
This code was adapted from Perl, Python and most of the other examples.
<syntaxhighlight lang="zkl">fcn buildHuffman(text){ //-->(encode dictionary, decode dictionary)
ft:=Dictionary();
foreach c in (text){ ft[c]=ft.find(c,0)+1 } // leafs w/count
 
// build the tree, which is a list of lists of ...
tree:=ft.pump(List,fcn([(c,cnt)]){ //-->L( (cnt, ((sym,code))), ...)
L(cnt, L(L(c,"")))
}).copy(); // make it writable
 
while(tree.len()>1){ // fake up a [lame] priorty queue
tree=tree.sort(fcn(a,b){ a[0]>b[0] }); //prioritize high to low
a,b:=tree.pop(-2,2); //remove 2 least frequent symbols
mc:=fcn(n,c){ n[1] = c + n[1]; }; //(sym,code),"0"|"1"
a[1].apply2(mc,"0"); b[1].apply2(mc,"1"); // mc(a[1],"0")
tree.append( L(a[0]+b[0],a[1].extend(b[1])) ); //(a,b)-->new node
}//-->L(L(39, L( L(" ","000"),L("e","0010"),L("a","0011") ...
 
tree=tree[0][1].pump(List,fcn(i){ // flatten rather than traverse
if(T.isType(i))return(Void.Recurse,i,self.fcn); i });
encodeTable:=tree.toDictionary(); // symbol:Huffman code
decodeTable:=encodeTable.pump(Dictionary(),"reverse"); // code:symbol
return(encodeTable,decodeTable);
}</syntaxhighlight>
<syntaxhighlight lang="zkl">fcn encode(text,table){ text.pump(String,table.get) }
fcn decode(bits,table){ // this is a horrible decoder, for testing only
w:=bits.walker(); sink:=Sink(String);
try{ s:=""; while(1){
s+=w.next(); if(c:=table.find(s)) { sink.write(c); s=""; }
}}catch(TheEnd){}
sink.close();
}</syntaxhighlight>
<syntaxhighlight lang="zkl">text:="this is an example for huffman encoding";
encodeTable,decodeTable := buildHuffman(text);
encodeTable.pump(Console.println,fcn(kv){"%s : %s".fmt(kv.xplode())});
 
e:=encode(text,encodeTable);
"Encode %d characters (%d bits) to %d bits (%d bytes):"
.fmt(text.len(),text.len()*8,e.len(),(e.len()+7)/8).println();
println(e);
 
0'|Bits decoded to: "%s"|.fmt(decode(e,decodeTable)).println();</syntaxhighlight>
{{out}}
<pre>
a : 0011
c : 10101
d : 10100
e : 0010
f : 0110
g : 10111
h : 1000
i : 0101
l : 10110
m : 1001
n : 110
o : 01000
p : 11111
r : 11100
s : 0111
t : 01001
u : 11101
x : 11110
: 000
Encode 39 characters (312 bits) to 157 bits (20 bytes):
0100110000101011100001010111000001111000000101111000111001111111011000100000110010001110000010001110101100110100100111100000010110101010100010100010111010111
Bits decoded to: "this is an example for huffman encoding"
</pre>
 
 
{{omit from|Brlcad}}
9,482

edits