Talk:Huffman coding: Difference between revisions
Content added Content deleted
No edit summary |
|||
Line 94: | Line 94: | ||
By the way, what is exactly wrong? The examplanation? Or the java code? --[[User:ShinTakezou|ShinTakezou]] 16:37, 26 March 2009 (UTC) |
By the way, what is exactly wrong? The examplanation? Or the java code? --[[User:ShinTakezou|ShinTakezou]] 16:37, 26 March 2009 (UTC) |
||
==IS This OK?== |
|||
I took a <strike>good</strike> better look at the WP article and came up with the following code, together with printouts of what it is doing: |
|||
<lang python>from heapq import heappush, heappop, heapify |
|||
def encode(symbol2weights, tutor= False): |
|||
''' Huffman encode the given dict mapping symbols to weights ''' |
|||
heap = [ [float(wt), [sym, '']] for sym, wt in symbol2weights.iteritems() ] |
|||
heapify(heap) |
|||
if tutor: print "ENCODING:", sorted(symbol2weights.iteritems()) |
|||
while len(heap) >1: |
|||
lo = heappop(heap) |
|||
hi = heappop(heap) |
|||
if tutor: print " COMBINING:", lo, '\n AND:', hi |
|||
for i in lo[1:]: i[1] = '0' + i[1] |
|||
for i in hi[1:]: i[1] = '1' + i[1] |
|||
lohi = [ lo[0] + hi[0] ] + lo[1:] + hi[1:] |
|||
if tutor: print " PRODUCING:", lohi, '\n' |
|||
heappush(heap, lohi) |
|||
return sorted(heappop(heap)[1:], key=lambda x: int(x[-1],2)) |
|||
#readin = "B 25 C 2.5 D 12.5 A 5 \n" |
|||
#readin = "a .1 b .15 c .3 d .16 e .29" # Wikipedia sample |
|||
#readin = "a1 .4 a2 .35 a3 .2 a4 .05" # Wikipedia sample |
|||
readin = "A 50 B 25 C 12.5 D 12.5" # RC example |
|||
cleaned = readin.strip().split() |
|||
symbol2weights = dict((symbol, wt) |
|||
for symbol, wt in zip(cleaned[0::2], cleaned[1::2]) ) |
|||
huff = encode(symbol2weights, True) |
|||
print "\nSYMBOL\tWEIGHT\tHUFFMAN CODE" |
|||
for h in huff: |
|||
print "%s\t%s\t%s" % (h[0], symbol2weights[h[0]], h[1]) |
|||
</lang> |
|||
The output, for one of your examples above is: |
|||
<pre>ENCODING: [('A', '50'), ('B', '25'), ('C', '12.5'), ('D', '12.5')] |
|||
COMBINING: [12.5, ['C', '']] |
|||
AND: [12.5, ['D', '']] |
|||
PRODUCING: [25.0, ['C', '0'], ['D', '1']] |
|||
COMBINING: [25.0, ['B', '']] |
|||
AND: [25.0, ['C', '0'], ['D', '1']] |
|||
PRODUCING: [50.0, ['B', '0'], ['C', '10'], ['D', '11']] |
|||
COMBINING: [50.0, ['A', '']] |
|||
AND: [50.0, ['B', '0'], ['C', '10'], ['D', '11']] |
|||
PRODUCING: [100.0, ['A', '0'], ['B', '10'], ['C', '110'], ['D', '111']] |
|||
SYMBOL WEIGHT HUFFMAN CODE |
|||
A 50 0 |
|||
B 25 10 |
|||
C 12.5 110 |
|||
D 12.5 111</pre> |