Talk:Huffman coding: Difference between revisions

Content added Content deleted
No edit summary
Line 94: Line 94:


By the way, what is exactly wrong? The examplanation? Or the java code? --[[User:ShinTakezou|ShinTakezou]] 16:37, 26 March 2009 (UTC)
By the way, what is exactly wrong? The examplanation? Or the java code? --[[User:ShinTakezou|ShinTakezou]] 16:37, 26 March 2009 (UTC)


==IS This OK?==
I took a <strike>good</strike> better look at the WP article and came up with the following code, together with printouts of what it is doing:
<lang python>from heapq import heappush, heappop, heapify

def encode(symbol2weights, tutor= False):
''' Huffman encode the given dict mapping symbols to weights '''
heap = [ [float(wt), [sym, '']] for sym, wt in symbol2weights.iteritems() ]
heapify(heap)
if tutor: print "ENCODING:", sorted(symbol2weights.iteritems())
while len(heap) >1:
lo = heappop(heap)
hi = heappop(heap)
if tutor: print " COMBINING:", lo, '\n AND:', hi
for i in lo[1:]: i[1] = '0' + i[1]
for i in hi[1:]: i[1] = '1' + i[1]
lohi = [ lo[0] + hi[0] ] + lo[1:] + hi[1:]
if tutor: print " PRODUCING:", lohi, '\n'
heappush(heap, lohi)
return sorted(heappop(heap)[1:], key=lambda x: int(x[-1],2))

#readin = "B 25 C 2.5 D 12.5 A 5 \n"
#readin = "a .1 b .15 c .3 d .16 e .29" # Wikipedia sample
#readin = "a1 .4 a2 .35 a3 .2 a4 .05" # Wikipedia sample
readin = "A 50 B 25 C 12.5 D 12.5" # RC example
cleaned = readin.strip().split()
symbol2weights = dict((symbol, wt)
for symbol, wt in zip(cleaned[0::2], cleaned[1::2]) )
huff = encode(symbol2weights, True)
print "\nSYMBOL\tWEIGHT\tHUFFMAN CODE"
for h in huff:
print "%s\t%s\t%s" % (h[0], symbol2weights[h[0]], h[1])
</lang>

The output, for one of your examples above is:
<pre>ENCODING: [('A', '50'), ('B', '25'), ('C', '12.5'), ('D', '12.5')]
COMBINING: [12.5, ['C', '']]
AND: [12.5, ['D', '']]
PRODUCING: [25.0, ['C', '0'], ['D', '1']]

COMBINING: [25.0, ['B', '']]
AND: [25.0, ['C', '0'], ['D', '1']]
PRODUCING: [50.0, ['B', '0'], ['C', '10'], ['D', '11']]

COMBINING: [50.0, ['A', '']]
AND: [50.0, ['B', '0'], ['C', '10'], ['D', '11']]
PRODUCING: [100.0, ['A', '0'], ['B', '10'], ['C', '110'], ['D', '111']]


SYMBOL WEIGHT HUFFMAN CODE
A 50 0
B 25 10
C 12.5 110
D 12.5 111</pre>