Hash join
The classic hash join algorithm for an inner join of two relations has the following steps:
- Hash phase: Create a hash table for one of the two relations by applying a hash function to the join attribute of each row. Ideally we should create a hash table for the smaller relation, thus optimizing for creation time and memory size of the hash table.
- Join phase: Scan the larger relation and find the relevant rows by looking in the hash table created before.
![Task](http://static.miraheze.org/rosettacodewiki/thumb/b/ba/Rcode-button-task-crushed.png/64px-Rcode-button-task-crushed.png)
You are encouraged to solve this task according to the task description, using any language you may know.
The algorithm is as follows:
for each tuple s in S do let h = hash on join attributes s(b) place s in hash table Sh in bucket keyed by hash value h for each tuple r in R do let h = hash on join attributes r(a) if h indicates a nonempty bucket (B) of hash table Sh if h matches any s in B concatenate r and s place relation in Q
Task: implement the Hash Join algorithm and show the result of joining two tables with it. You should use your implementation to show the joining of these tables:
|
|
Bracmat
This solution creates a hash table for the smaller relation in the function join
. This function takes as arguments the smallest table, the biggest table and then three pieces of code: two patterns that describe each table's field order and code that generates one row of output. These pieces of code are inserted in a fixed skeleton of code using macro substitution.
<lang bracmat>( (27.Jonah)
(18.Alan) (28.Glory) (18.Popeye) (28.Alan) : ?table-A
& (Jonah.Whales)
(Jonah.Spiders) (Alan.Ghosts) (Alan.Zombies) (Glory.Buffy) : ?table-B
& new$hash:?H & !table-A:? [?lenA & !table-B:? [?lenB & ( join
= smalltab bigtab smallschema bigschema joinschema , key val val2 keyval2 . !arg : (?smalltab.?bigtab.(=?smallschema.?bigschema.?joinschema)) & :?rel & !( ' ( whl ' ( !smalltab:$smallschema ?smalltab & (H..insert)$(!key.!val) ) & whl ' ( !bigtab:$bigschema ?bigtab & ( (H..find)$!key:?keyval2 & whl ' ( !keyval2:(?key.?val2) ?keyval2 & $joinschema !rel:?rel ) | ) ) ) ) & !rel )
& out
$ ( join $ ( !lenA:~<!lenB & ( !table-B . !table-A . ( = (?key.?val).(?val.?key).!val.!key.!val2 ) ) | ( !table-A . !table-B . (=(?val.?key).(?key.?val).!val2.!key.!val) ) ) )
& );</lang> Output:
(28.Alan.Ghosts) (28.Alan.Zombies) (28.Glory.Buffy) (18.Alan.Ghosts) (18.Alan.Zombies) (27.Jonah.Whales) (27.Jonah.Spiders)
C#
- using LINQ to Objects
<lang csharp>using System; using System.Collections.Generic; using System.Linq;
namespace HashJoin {
public class AgeName { public AgeName(byte age, string name) { Age = age; Name = name; } public byte Age { get; private set; } public string Name { get; private set; } }
public class NameNemesis { public NameNemesis(string name, string nemesis) { Name = name; Nemesis = nemesis; } public string Name { get; private set; } public string Nemesis { get; private set; } }
public class DataContext { public DataContext() { AgeName = new List<AgeName>(); NameNemesis = new List<NameNemesis>(); } public List<AgeName> AgeName { get; set; } public List<NameNemesis> NameNemesis { get; set; } }
public class AgeNameNemesis { public AgeNameNemesis(byte age, string name, string nemesis) { Age = age; Name = name; Nemesis = nemesis; } public byte Age { get; private set; } public string Name { get; private set; } public string Nemesis { get; private set; } }
class Program { public static void Main() { var data = GetData(); var result = ExecuteHashJoin(data); WriteResultToConsole(result); }
private static void WriteResultToConsole(List<AgeNameNemesis> result) { result.ForEach(ageNameNemesis => Console.WriteLine("Age: {0}, Name: {1}, Nemesis: {2}", ageNameNemesis.Age, ageNameNemesis.Name, ageNameNemesis.Nemesis)); }
private static List<AgeNameNemesis> ExecuteHashJoin(DataContext data) { return (data.AgeName.Join(data.NameNemesis, ageName => ageName.Name, nameNemesis => nameNemesis.Name, (ageName, nameNemesis) => new AgeNameNemesis(ageName.Age, ageName.Name, nameNemesis.Nemesis))) .ToList(); }
private static DataContext GetData() { var context = new DataContext();
context.AgeName.AddRange(new [] { new AgeName(27, "Jonah"), new AgeName(18, "Alan"), new AgeName(28, "Glory"), new AgeName(18, "Popeye"), new AgeName(28, "Alan") });
context.NameNemesis.AddRange(new[] { new NameNemesis("Jonah", "Whales"), new NameNemesis("Jonah", "Spiders"), new NameNemesis("Alan", "Ghosts"), new NameNemesis("Alan", "Zombies"), new NameNemesis("Glory", "Buffy") });
return context; } }
}</lang>
- Output:
Age: 27, Name: Jonah, Nemesis: Whales Age: 27, Name: Jonah, Nemesis: Spiders Age: 18, Name: Alan, Nemesis: Ghosts Age: 18, Name: Alan, Nemesis: Zombies Age: 28, Name: Glory, Nemesis: Buffy Age: 28, Name: Alan, Nemesis: Ghosts Age: 28, Name: Alan, Nemesis: Zombies
Common Lisp
<lang lisp>(defparameter *table-A* '((27 "Jonah") (18 "Alan") (28 "Glory") (18 "Popeye") (28 "Alan")))
(defparameter *table-B* '(("Jonah" "Whales") ("Jonah" "Spiders") ("Alan" "Ghosts") ("Alan" "Zombies") ("Glory" "Buffy")))
- Hash phase
(defparameter *hash-table* (make-hash-table :test #'equal))
(loop for (i r) in *table-A*
for value = (gethash r *hash-table* (list nil)) do (setf (gethash r *hash-table*) value) (push (list i r) (first value)))
- Join phase
(loop for (i r) in *table-B* do
(let ((val (car (gethash i *hash-table*)))) (loop for (a b) in val do
(format t "{~a ~a} {~a ~a}~%" a b i r))))</lang>
- Output:
{27 Jonah} {Jonah Whales} {27 Jonah} {Jonah Spiders} {28 Alan} {Alan Ghosts} {18 Alan} {Alan Ghosts} {28 Alan} {Alan Zombies} {18 Alan} {Alan Zombies} {28 Glory} {Glory Buffy}
D
<lang d>import std.stdio, std.typecons;
auto hashJoin(size_t index1, size_t index2, T1, T2)
(in T1[] table1, in T2[] table2) pure /*nothrow*/
if (is(typeof(T1[index1]) == typeof(T2[index2]))) {
T1[][typeof(T1[index1])] h; // Hash phase. foreach (s; table1) h[s[index1]] ~= s; // Join phase. Tuple!(T1, const T2)[] result; foreach (r; table2) foreach (s; h.get(r[index2], [])) result ~= tuple(s, r); return result;
}
void main() {
alias T = tuple; immutable table1 = [T(27, "Jonah"), T(18, "Alan"), T(28, "Glory"), T(18, "Popeye"), T(28, "Alan")]; immutable table2 = [T("Jonah", "Whales"), T("Jonah", "Spiders"), T("Alan", "Ghosts"), T("Alan", "Zombies"), T("Glory", "Buffy")];
foreach (row; hashJoin!(1, 0)(table1, table2)) writefln("(%s, %5s) (%5s, %7s)", row[0][], row[1][]);
}</lang>
- Output:
(27, Jonah) (Jonah, Whales) (27, Jonah) (Jonah, Spiders) (18, Alan) ( Alan, Ghosts) (28, Alan) ( Alan, Ghosts) (18, Alan) ( Alan, Zombies) (28, Alan) ( Alan, Zombies) (28, Glory) (Glory, Buffy)
Erlang
<lang Erlang> -module( hash_join ).
-export( [task/0] ).
task() ->
Table_1 = [{27, "Jonah"}, {18, "Alan"}, {28, "Glory"}, {18, "Popeye"}, {28, "Alan"}], Table_2 = [{"Jonah", "Whales"}, {"Jonah", "Spiders"}, {"Alan", "Ghosts"}, {"Alan", "Zombies"}, {"Glory", "Buffy"}], Dict = lists:foldl( fun dict_append/2, dict:new(), Table_1 ), lists:flatten( [dict_find( X, Dict ) || X <- Table_2] ).
dict_append( {Key, Value}, Acc ) -> dict:append( Value, {Key, Value}, Acc ).
dict_find( {Key, Value}, Dict ) -> dict_find( dict:find(Key, Dict), Key, Value ).
dict_find( error, _Key, _Value ) -> []; dict_find( {ok, Values}, Key, Value ) -> [{X, {Key, Value}} || X <- Values]. </lang>
- Output:
15> hash_join:task(). [{{27,"Jonah"},{"Jonah","Whales"}}, {{27,"Jonah"},{"Jonah","Spiders"}}, {{18,"Alan"},{"Alan","Ghosts"}}, {{28,"Alan"},{"Alan","Ghosts"}}, {{18,"Alan"},{"Alan","Zombies"}}, {{28,"Alan"},{"Alan","Zombies"}}, {{28,"Glory"},{"Glory","Buffy"}}]
Go
<lang go>package main
import "fmt"
func main() {
tableA := []struct { value int key string }{ {27, "Jonah"}, {18, "Alan"}, {28, "Glory"}, {18, "Popeye"}, {28, "Alan"}, } tableB := []struct { key string value string }{ {"Jonah", "Whales"}, {"Jonah", "Spiders"}, {"Alan", "Ghosts"}, {"Alan", "Zombies"}, {"Glory", "Buffy"}, } // hash phase h := map[string][]int{} for i, r := range tableA { h[r.key] = append(h[r.key], i) } // join phase for _, x := range tableB { for _, a := range h[x.key] { fmt.Println(tableA[a], x) } }
}</lang>
- Output:
{27 Jonah} {Jonah Whales} {27 Jonah} {Jonah Spiders} {18 Alan} {Alan Ghosts} {28 Alan} {Alan Ghosts} {18 Alan} {Alan Zombies} {28 Alan} {Alan Zombies} {28 Glory} {Glory Buffy}
Haskell
The ST monad allows us to utilise mutable memory behind a referentially transparent interface, allowing us to use hashtables (efficiently).
Our hashJoin function takes two lists and two selector functions.
Placing all relations with the same selector value in a list in the hashtable allows us to join many to one/many relations. <lang Haskell>{-# LANGUAGE LambdaCase, TupleSections #-} import qualified Data.HashTable.ST.Basic as H import Data.Hashable import Control.Monad.ST import Control.Monad import Data.STRef
hashJoin :: (Eq k, Hashable k) =>
[t] -> (t -> k) -> [a] -> (a -> k) -> [(t, a)]
hashJoin xs fx ys fy = runST $ do
l <- newSTRef [] ht <- H.new forM_ ys $ \y -> H.insert ht (fy y) =<< (H.lookup ht (fy y) >>= \case Nothing -> return [y] Just v -> return (y:v)) forM_ xs $ \x -> do H.lookup ht (fx x) >>= \case Nothing -> return () Just v -> modifySTRef' l ((map (x,) v) ++) readSTRef l
test = mapM_ print $ hashJoin
[(1, "Jonah"), (2, "Alan"), (3, "Glory"), (4, "Popeye")] snd [("Jonah", "Whales"), ("Jonah", "Spiders"), ("Alan", "Ghosts"), ("Alan", "Zombies"), ("Glory", "Buffy")] fst
</lang>
λ> test ((3,"Glory"),("Glory","Buffy")) ((2,"Alan"),("Alan","Zombies")) ((2,"Alan"),("Alan","Ghosts")) ((1,"Jonah"),("Jonah","Spiders")) ((1,"Jonah"),("Jonah","Whales"))
The task require hashtables; however, a cleaner and more functional solution would be to use Data.Map (based on binary trees): <lang Haskell>{-# LANGUAGE TupleSections #-} import qualified Data.Map as M import Data.List import Data.Maybe import Control.Applicative
mapJoin xs fx ys fy = joined
where yMap = foldl' f M.empty ys f m y = M.insertWith (++) (fy y) [y] m joined = concat . catMaybes . map (\x -> map (x,) <$> M.lookup (fx x) yMap) $ xs
test = mapM_ print $ mapJoin
[(1, "Jonah"), (2, "Alan"), (3, "Glory"), (4, "Popeye")] snd [("Jonah", "Whales"), ("Jonah", "Spiders"), ("Alan", "Ghosts"), ("Alan", "Zombies"), ("Glory", "Buffy")] fst
</lang>
λ> test ((1,"Jonah"),("Jonah","Spiders")) ((1,"Jonah"),("Jonah","Whales")) ((2,"Alan"),("Alan","Zombies")) ((2,"Alan"),("Alan","Ghosts")) ((3,"Glory"),("Glory","Buffy"))
OCaml
This exploits the fact that Hashtbl implements a hash table that can store multiple values for a key, for an especially simple solution: <lang ocaml>let hash_join table1 f1 table2 f2 =
let h = Hashtbl.create 42 in (* hash phase *) List.iter (fun s -> Hashtbl.add h (f1 s) s) table1; (* join phase *) List.concat (List.map (fun r -> List.map (fun s -> s, r) (Hashtbl.find_all h (f2 r))) table2)</lang>
Sample run:
# let table1 = [27, "Jonah"; 18, "Alan"; 28, "Glory"; 18, "Popeye"; 28, "Alan"];; # let table2 = ["Jonah", "Whales"; "Jonah", "Spiders"; "Alan", "Ghosts"; "Alan", "Zombies"; "Glory", "Buffy"];; # hash_join table1 snd table2 fst;; - : ((int * string) * (string * string)) list = [((27, "Jonah"), ("Jonah", "Whales")); ((27, "Jonah"), ("Jonah", "Spiders")); ((28, "Alan"), ("Alan", "Ghosts")); ((18, "Alan"), ("Alan", "Ghosts")); ((28, "Alan"), ("Alan", "Zombies")); ((18, "Alan"), ("Alan", "Zombies")); ((28, "Glory"), ("Glory", "Buffy"))]
Perl
<lang perl>use Data::Dumper qw(Dumper);
sub hashJoin {
my ($table1, $index1, $table2, $index2) = @_; my %h; # hash phase foreach my $s (@$table1) {
push @{ $h{$s->[$index1]} }, $s;
} # join phase map { my $r = $_;
map [$_, $r], @{ $h{$r->[$index2]} }
} @$table2;
}
@table1 = ([27, "Jonah"],
[18, "Alan"], [28, "Glory"], [18, "Popeye"], [28, "Alan"]);
@table2 = (["Jonah", "Whales"],
["Jonah", "Spiders"], ["Alan", "Ghosts"], ["Alan", "Zombies"], ["Glory", "Buffy"]);
$Data::Dumper::Indent = 0; foreach my $row (hashJoin(\@table1, 1, \@table2, 0)) {
print Dumper($row), "\n";
}</lang>
- Output:
$VAR1 = [[27,'Jonah'],['Jonah','Whales']]; $VAR1 = [[27,'Jonah'],['Jonah','Spiders']]; $VAR1 = [[18,'Alan'],['Alan','Ghosts']]; $VAR1 = [[28,'Alan'],['Alan','Ghosts']]; $VAR1 = [[18,'Alan'],['Alan','Zombies']]; $VAR1 = [[28,'Alan'],['Alan','Zombies']]; $VAR1 = [[28,'Glory'],['Glory','Buffy']];
Perl 6
<lang perl6>my @A = [1, "Jonah"],
[2, "Alan"], [3, "Glory"], [4, "Popeye"];
my @B = ["Jonah", "Whales"],
["Jonah", "Spiders"], ["Alan", "Ghosts"], ["Alan", "Zombies"], ["Glory", "Buffy"];
sub hash-join(@a, &a, @b, &b) {
my %hash{Any}; %hash{.&a} = $_ for @a; ([%hash{.&b} // next, $_] for @b);
}
.perl.say for hash-join @A, *.[1], @B, *.[0];</lang>
- Output:
[[1, "Jonah"], ["Jonah", "Whales"]] [[1, "Jonah"], ["Jonah", "Spiders"]] [[2, "Alan"], ["Alan", "Ghosts"]] [[2, "Alan"], ["Alan", "Zombies"]] [[3, "Glory"], ["Glory", "Buffy"]]
PHP
<lang php><?php function hashJoin($table1, $index1, $table2, $index2) {
// hash phase foreach ($table1 as $s) $h[$s[$index1]][] = $s; // join phase foreach ($table2 as $r) foreach ($h[$r[$index2]] as $s)
$result[] = array($s, $r);
return $result;
}
$table1 = array(array(27, "Jonah"),
array(18, "Alan"), array(28, "Glory"), array(18, "Popeye"), array(28, "Alan"));
$table2 = array(array("Jonah", "Whales"),
array("Jonah", "Spiders"), array("Alan", "Ghosts"), array("Alan", "Zombies"), array("Glory", "Buffy"), array("Bob", "foo"));
foreach (hashJoin($table1, 1, $table2, 0) as $row)
print_r($row);
?></lang>
- Output:
Array ( [0] => Array ( [0] => 27 [1] => Jonah ) [1] => Array ( [0] => Jonah [1] => Whales ) ) Array ( [0] => Array ( [0] => 27 [1] => Jonah ) [1] => Array ( [0] => Jonah [1] => Spiders ) ) Array ( [0] => Array ( [0] => 18 [1] => Alan ) [1] => Array ( [0] => Alan [1] => Ghosts ) ) Array ( [0] => Array ( [0] => 28 [1] => Alan ) [1] => Array ( [0] => Alan [1] => Ghosts ) ) Array ( [0] => Array ( [0] => 18 [1] => Alan ) [1] => Array ( [0] => Alan [1] => Zombies ) ) Array ( [0] => Array ( [0] => 28 [1] => Alan ) [1] => Array ( [0] => Alan [1] => Zombies ) ) Array ( [0] => Array ( [0] => 28 [1] => Glory ) [1] => Array ( [0] => Glory [1] => Buffy ) )
Python
<lang python>from collections import defaultdict
def hashJoin(table1, index1, table2, index2):
h = defaultdict(list) # hash phase for s in table1: h[s[index1]].append(s) # join phase return [(s, r) for r in table2 for s in h[r[index2]]]
table1 = [(27, "Jonah"),
(18, "Alan"), (28, "Glory"), (18, "Popeye"), (28, "Alan")]
table2 = [("Jonah", "Whales"),
("Jonah", "Spiders"), ("Alan", "Ghosts"), ("Alan", "Zombies"), ("Glory", "Buffy")]
for row in hashJoin(table1, 1, table2, 0):
print(row)</lang>
- Output:
((27, 'Jonah'), ('Jonah', 'Whales')) ((27, 'Jonah'), ('Jonah', 'Spiders')) ((18, 'Alan'), ('Alan', 'Ghosts')) ((28, 'Alan'), ('Alan', 'Ghosts')) ((18, 'Alan'), ('Alan', 'Zombies')) ((28, 'Alan'), ('Alan', 'Zombies')) ((28, 'Glory'), ('Glory', 'Buffy'))
Racket
<lang racket>#lang racket (struct A (age name)) (struct B (name nemesis)) (struct AB (name age nemesis) #:transparent)
(define Ages-table
(list (A 27 "Jonah") (A 18 "Alan") (A 28 "Glory") (A 18 "Popeye") (A 28 "Alan")))
(define Nemeses-table
(list (B "Jonah" "Whales") (B "Jonah" "Spiders") (B "Alan" "Ghosts") (B "Alan" "Zombies") (B "Glory" "Buffy")))
- Hash phase
(define name->ages#
(for/fold ((rv (hash))) ((a (in-list Ages-table))) (match-define (A age name) a) (hash-update rv name (λ (ages) (append ages (list age))) null)))
- Join phase
(for*/list
((b (in-list Nemeses-table)) (key (in-value (B-name b))) (age (in-list (hash-ref name->ages# key)))) (AB key age (B-nemesis b)))</lang>
- Output:
(#(struct:AB "Jonah" 27 "Whales")
#(struct:AB "Jonah" 27 "Spiders") #(struct:AB "Alan" 18 "Ghosts") #(struct:AB "Alan" 28 "Ghosts") #(struct:AB "Alan" 18 "Zombies") #(struct:AB "Alan" 28 "Zombies") #(struct:AB "Glory" 28 "Buffy"))
REXX
Programming note: It wasn't clear whether or not Popeye should be listed (it has no nemesis associated with him),
so a clumsy if statement was left in the code, effectively making the next if statement as effective as a comment:
if 1==00000 then
<lang rexx>/*REXX pgm demonstrates the classic hash join algorithm for 2 relations.*/
S. = ; R. = S.1 = 27 'Jonah' ; R.1 = 'Jonah Whales' S.2 = 18 'Alan' ; R.2 = 'Jonah Spiders' S.3 = 28 'Glory' ; R.3 = 'Alan Ghosts' S.4 = 18 'Popeye' ; R.4 = 'Alan Zombies' S.5 = 28 'Alan' ; R.5 = 'Glory Buffy'
hash.= /*initialize the hash table. */
do #=1 while S.#\==; parse var S.# age name /*extract info*/ hash.name=hash.name # /*build a hash table entry. */ end /*#*/ /* [↑] REXX does the heavy work.*/
- =#-1 /*adjust for DO loop (#) overage.*/
do j=1 while R.j\== /*process a nemesis for a name. */ parse var R.j x nemesis /*extract name and it's nemesis. */ if hash.x== then do /*Not in hash? Then a new name.*/ #=#+1 /*bump the number of S entries. */ S.#=',' x /*add new name to the S table. */ hash.x=# /*add new name to the hash table.*/ end /* [↑] this DO isn't used today.*/ do k=1 for words(hash.x); _=word(hash.x,k) /*get pointer.*/ S._=S._ nemesis /*add nemesis──► applicable hash.*/ end /*k*/ end /*j*/
_='─' /*character used for separater. */ pad=left(,6-2) /*spacing used in hdr/sep/output.*/ say pad center('age',3) pad center('name',20) pad center('nemesis',30) say pad center('───',3) pad center( ,20,_) pad center( ,30,_)
do n=1 for #; parse var S.n age name nems /*get info. */
if 1==00000 then /*one method to never XEQ code ↓ */
if nems== then iterate /*if no nemesis, then don't show.*/ say pad right(age,3) pad center(name,20) pad nems /*show an S.*/ end /*n*/ /*stick a fork in it, we're done.*/</lang>
output using the in-code relations (data):
age name nemesis ─── ──────────────────── ────────────────────────────── 27 Jonah Whales Spiders 18 Alan Ghosts Zombies 28 Glory Buffy 18 Popeye 28 Alan Ghosts Zombies
Ruby
<lang ruby>def hashJoin(table1, index1, table2, index2)
# hash phase h = table1.group_by {|s| s[index1]} h.default = [] # join phase table2.collect {|r| h[r[index2]].collect {|s| [s, r]} }.flatten(1)
end
table1 = [[27, "Jonah"],
[18, "Alan"], [28, "Glory"], [18, "Popeye"], [28, "Alan"]]
table2 = [["Jonah", "Whales"],
["Jonah", "Spiders"], ["Alan", "Ghosts"], ["Alan", "Zombies"], ["Glory", "Buffy"]]
hashJoin(table1, 1, table2, 0).each { |row| p row }</lang>
- Output:
[[27, "Jonah"], ["Jonah", "Whales"]] [[27, "Jonah"], ["Jonah", "Spiders"]] [[18, "Alan"], ["Alan", "Ghosts"]] [[28, "Alan"], ["Alan", "Ghosts"]] [[18, "Alan"], ["Alan", "Zombies"]] [[28, "Alan"], ["Alan", "Zombies"]] [[28, "Glory"], ["Glory", "Buffy"]]
Tcl
Tcl uses hash tables to implement both its associative arrays and its dictionaries. <lang tcl>package require Tcl 8.6
- Only for lmap, which can be replaced with foreach
proc joinTables {tableA indexA tableB indexB} {
# Optimisation: if the first table is longer, do in reverse order if {[llength $tableB] < [llength $tableA]} {
return [lmap pair [joinTables $tableB $indexB $tableA $indexA] { lreverse $pair }]
}
foreach value $tableA {
set hashmap([lindex $value $indexA]) $value #dict version# dict set hashmap [lindex $value $indexA] $value
} lmap value $tableB {
set key [lindex $value $indexB] if {![info exist hashmap($key)]} continue #dict version# if {![dict exists $hashmap $key]} continue list $hashmap($key) $value #dict version# list [dict get $hashmap $key] $value
}
}
set tableA {
{27 "Jonah"} {18 "Alan"} {28 "Glory"} {18 "Popeye"} {28 "Alan"}
} set tableB {
{"Jonah" "Whales"} {"Jonah" "Spiders"} {"Alan" "Ghosts"} {"Alan" "Zombies"} {"Glory" "Buffy"}
} set joined [joinTables $tableA 1 $tableB 0] foreach row $joined {
puts $row
}</lang>
- Output:
{27 "Jonah"} {"Jonah" "Whales"} {27 "Jonah"} {"Jonah" "Spiders"} {28 "Alan"} {"Alan" "Ghosts"} {28 "Alan"} {"Alan" "Zombies"} {28 "Glory"} {"Glory" "Buffy"}
TXR
Generic hash join. Arguments left-key
and right-key
are functions applied to the elements of the left
and right
sequences to retrieve the join key.
<lang txr>@(do
(defvar age-name '((27 Jonah) (18 Alan) (28 Glory) (18 Popeye) (28 Alan)))
(defvar nemesis-name '((Jonah Whales) (Jonah Spiders) (Alan Ghosts) (Alan Zombies) (Glory Buffy)))
(defun hash-join (left left-key right right-key) (let ((join-hash [group-by left-key left])) ;; hash phase (append-each ((r-entry right)) ;; join phase (collect-each ((l-entry [join-hash [right-key r-entry]])) '(,l-entry ,r-entry)))))
(format t "~s\n" [hash-join age-name second nemesis-name first]))
</lang>
Run:
$ txr hash-join.txr (((27 Jonah) (Jonah Whales)) ((27 Jonah) (Jonah Spiders)) ((18 Alan) (Alan Ghosts)) ((28 Alan) (Alan Ghosts)) ((18 Alan) (Alan Zombies)) ((28 Alan) (Alan Zombies)) ((28 Glory) (Glory Buffy)))