Talk:Huffman coding: Difference between revisions
: Yep. It is not Huffman coding. (Hmm, I do like the wikipedia description of the two queue method though ...) --[[User:Paddy3118|Paddy3118]] 06:56, 26 March 2009 (UTC)
||
::I did it based on what I learned in class today. If you look at the "Basic technique" section on the WP it shows codes identical to ones I used in the example, so I'm pretty sure it is Huffman coding. There must be a few ways to generate them that give different actual codes with the same idea. --[[User:Mwn3d|Mwn3d]] 13:44, 26 March 2009 (UTC)
||
::: For the example given, the Huffman code indeed looks like this. But as a general algorithm, it's wrong. What you do is "combine" the last two elements in the table, and sort the combined entry back into your list. So starting with the example,
 (A=?): 50%
 (B=?): 25%
 (C=?): 12.5%
 (D=?): 12.5%
::: you first assign a bit to each of the last two items (C and D), and then ''combine'' them, adding the frequencies, and ''sorting the combined entry into the right place'':
 (A=?): 50%
 (B=?): 25%
 (C=?0, D=?1): 25%
::: Then you do the same again:
 (A=?): 50%
 (B=?0, C=?10, D=?11): 50%
::: And finally you get
 (A=0, B=10, C=110, D=111): 100%
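::: The merge-and-resort procedure described above can be sketched in Python (a minimal illustrative sketch, not from the discussion; the function name and representation are my own):

```python
def huffman(freqs):
    """Sketch of the merge-and-resort procedure. freqs maps symbol -> probability."""
    # Each table entry pairs a probability with the partial codes of its symbols,
    # kept sorted from most to least probable.
    table = sorted(((p, {s: ""}) for s, p in freqs.items()), key=lambda e: -e[0])
    while len(table) > 1:
        p1, group1 = table.pop()   # least probable entry
        p0, group0 = table.pop()   # second least probable entry
        # Assign the distinguishing bit to each group ...
        merged = {s: "0" + c for s, c in group0.items()}
        merged.update({s: "1" + c for s, c in group1.items()})
        # ... then combine, adding the frequencies, and sort back into place.
        table.append((p0 + p1, merged))
        table.sort(key=lambda e: -e[0])
    return table[0][1]

print(huffman({"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.125}))
# {'A': '0', 'B': '10', 'C': '110', 'D': '111'}
```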
::: Thus the result is indeed as given in the example. However, assume that you start with
 (A=?): 25%
 (B=?): 25%
 (C=?): 25%
 (D=?): 25%
::: Your algorithm would still give the same result, while it's obvious that just the standard two-bit numbering is optimal for this case. And indeed, the first step of Huffman coding gives:
 (C=?0, D=?1): 50%
 (A=?): 25%
 (B=?): 25%
::: Note how the (C,D) pair moves up, because its probability is larger than the 25% of each of A and B. Therefore the next step combines A and B:
 (A=?0, B=?1): 50%
 (C=?0, D=?1): 50%
::: (I've used the convention that items with the same probability are sorted lexicographically; of course other conventions, e.g. leaving the newly formed pair as low as possible, also work.) Now our final combination gives:
 (A=00, B=01, C=10, D=11): 100%
::: which obviously is an optimal code for this case. --[[User:Ce|Ce]] 14:17, 26 March 2009 (UTC)
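::: The difference between the two distributions can be checked concretely with a heap-based sketch (Python; the helper name and tie-breaking counter are my own, not from the discussion). It computes only the code lengths, since the lengths are what all tie-breaking conventions agree on:

```python
import heapq
import itertools

def huffman_lengths(freqs):
    """Return {symbol: code length}. freqs maps symbol -> probability."""
    tiebreak = itertools.count()  # keeps heap tuples comparable on equal probabilities
    heap = [(p, next(tiebreak), {s: 0}) for s, p in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, g0 = heapq.heappop(heap)  # two least probable groups
        p1, _, g1 = heapq.heappop(heap)
        # Merging pushes every symbol in both groups one level deeper in the tree.
        merged = {s: d + 1 for s, d in {**g0, **g1}.items()}
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]

print(huffman_lengths({"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25}))
# {'A': 2, 'B': 2, 'C': 2, 'D': 2} -- the optimal two-bit code
print(huffman_lengths({"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.125}))
# {'A': 1, 'B': 2, 'C': 3, 'D': 3}
```

::: The crucial point is the re-sorting: the merged (C,D) group re-enters the queue by its combined probability, so with four equal frequencies the next merge pairs A with B rather than stacking everything onto one ever-deeper chain.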