Hash join

Revision as of 17:00, 27 November 2013 by rosettacode>Paddy3118 (To draft task status)

Hash Join

Hash join is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

The classic hash join algorithm for an inner join of two relations has the following steps:

  • Hash phase : Creating a hash table for one of the two relations by applying a hash function to the join attribute of each row. Ideally we should create a hash table for the smaller relation. Thus, optimizing for creation time and memory size of the hash table.
  • Join phase : Scanning the larger relation and finding the relevant rows by looking in the hash table created before.

The algorithm is as follows:

for each tuple s in S do
{ hash on join attributes s(b)
  place tuples in hash table based on hash values};
for each tuple r do
{ hash on join attributes r(a)
  if r hashes in a nonempty bucket of hash table for S
  then
    {if r hash key matches any s in bucket concatenate r and s
    place relation in Q}};

Implement the Hash Join algorithm in your programming language (optionally providing a test case as well).