Talk:Entropy


Naïve implementation?

I wonder if there is a smarter way of calculating the entropy. Repetitive patterns should reduce the entropy, for instance. Basically, the entropy should be the number of bits returned by the best possible compression program, or something. Even better: the size of the smallest computer program that outputs the sequence.--Grondilu 21:11, 21 February 2013 (UTC)

The entropy is calculated from the probability of each symbol occurring. Therefore the entropy of aab is the same as that of aba. The higher the probability of a symbol occurring, the less information it carries and the fewer bits are needed to encode it. The entropy of aaab is less than the entropy of aab. The entropy states how many bits per symbol are required. A code whose average code length equals the entropy exactly is considered a perfect code (there can't be any better lossless code). The entropy of a is zero. The entropy of aa is also zero (because a has probability p = 1). -- Mroman 21:43, 21 February 2013 (UTC)
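(For concreteness, here is a minimal sketch in Python of the per-symbol Shannon entropy calculation described above; the function name entropy is just for illustration.)

```python
from collections import Counter
from math import log2

def entropy(s):
    """Shannon entropy of a string, in bits per symbol."""
    n = len(s)
    counts = Counter(s)
    return -sum((c / n) * log2(c / n) for c in counts.values())

print(entropy("aab"))   # 0.918... -- same as "aba": only frequencies matter
print(entropy("aba"))   # 0.918...
print(entropy("aaab"))  # 0.811... -- lower: 'a' is more probable
print(entropy("aa"))    # 0.0     -- a single symbol carries no information
```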
Sure, but the entropy of aaabbb should be the same as that of ab, since we could change the symbols aaa to a and bbb to b. Also, aaabbb should not have the same entropy as, say, aababb.--Grondilu 22:22, 21 February 2013 (UTC)
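(Continuing the sketch above: the naive per-symbol calculation does assign the same value, 1 bit per symbol, to all three of these strings, because it only looks at symbol frequencies and ignores their order.)

```python
for s in ("ab", "aaabbb", "aababb"):
    print(s, entropy(s))  # all 1.0 bit/symbol: equal frequencies, order ignored
```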