Entropy: Difference between revisions

5,629 bytes added ,  3 months ago
m
m (syntax highlighting fixup automation)
imported>Thebeez
 
(21 intermediate revisions by 13 users not shown)
Line 243:
 
=={{header|Arturo}}==
 
<syntaxhighlight lang="rebol">entropy: function [s][
t: #[]
Line 286 ⟶ 285:
 
=={{header|AWK}}==
<syntaxhighlight lang="awk">#!/usr/bin/awk -f
{
    # count how often each character occurs in the input line
    N = length
    for (i = 1; i <= N; ++i)
        ++H[substr($0, i, 1)]
}

END {
    # S = sum of c * ln(c) over the character counts c
    for (i in H)
        S += H[i] * log(H[i])
    print (log(N) - S / N) / log(2)
}</syntaxhighlight>
{{out|Usage}}
<syntaxhighlight lang="sh">echo 1223334444 | ./entropy.awk
1.84644</syntaxhighlight>
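
The revised program accumulates <math>S = \sum_i c_i \ln c_i</math> over the character counts instead of summing <math>-p \ln p</math> term by term. As a short check that both forms give the same entropy (the symbols <math>c_i</math> for the count of the ''i''-th distinct character and <math>p_i = c_i/N</math> are notation assumed here, not taken from the program), using <math>\sum_i c_i = N</math>:

<math>H = -\sum_i \frac{c_i}{N}\log_2\frac{c_i}{N} = -\sum_i \frac{c_i}{N}\left(\log_2 c_i - \log_2 N\right) = \log_2 N - \frac{1}{N}\sum_i c_i \log_2 c_i = \frac{\ln N - S/N}{\ln 2}</math>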
 
=={{header|BASIC}}==
Line 388 ⟶ 384:
{{out}}
<pre>1.8464393</pre>
==={{header|uBasic/4tH}}===
{{Trans|QBasic}}
uBasic/4tH is an integer-only BASIC, so fixed-point arithmetic is required to fulfill this task. Some loss of precision is unavoidable.
<syntaxhighlight lang="basic">If Info("wordsize") < 64 Then Print "This program requires a 64-bit uBasic" : End
 
s := "1223334444"
u := ""
x := FUNC(_Fln(FUNC(_Ntof(2)))) ' calculate LN(2)
 
For i = 0 TO Len(s)-1
k = 0
For j = 0 TO Len(u)-1
If Peek(u, j) = Peek(s, i) Then k = 1
Next
If k = 0 THEN u = Join(u, Char (Peek (s, i)))
Next
 
Dim @r(Len(u)-1)
 
For i = 0 TO Len(u)-1
c = 0
For J = 0 TO Len(s)-1
If Peek(u, i) = Peek (s, j) Then c = c + 1
Next
q = FUNC(_Fdiv(c, Len(s)))
@r(i) = FUNC(_Fmul(q, FUNC(_Fdiv(FUNC(_Fln(q)), x))))
Next
 
e = 0
For i = 0 To Len(u) - 1
e = e - @r(i)
Next
 
Print Using "+?.####"; FUNC(_Ftoi(e))
 
End
 
_Fln Param (1) : Return (FUNC(_Ln(a@*4))/4)
_Fmul Param (2) : Return ((a@*b@)/16384)
_Fdiv Param (2) : Return ((a@*16384)/b@)
_Ntof Param (1) : Return (a@*16384)
_Ftoi Param (1) : Return ((10000*a@)/16384)
 
_Ln
Param (1)
Local (2)
 
c@=681391
If (a@<32768) Then a@=SHL(a@, 16) : c@=c@-726817
If (a@<8388608) Then a@=SHL(a@, 8) : c@=c@-363409
If (a@<134217728) Then a@=SHL(a@, 4) : c@=c@-181704
If (a@<536870912) Then a@=SHL(a@, 2) : c@=c@-90852
If (a@<1073741824) Then a@=SHL(a@, 1) : c@=c@-45426
b@=a@+SHL(a@, -1) : If (AND(b@, 2147483648)) = 0 Then a@=b@ : c@=c@-26573
b@=a@+SHL(a@, -2) : If (AND(b@, 2147483648)) = 0 Then a@=b@ : c@=c@-14624
b@=a@+SHL(a@, -3) : If (AND(b@, 2147483648)) = 0 Then a@=b@ : c@=c@-7719
b@=a@+SHL(a@, -4) : If (AND(b@, 2147483648)) = 0 Then a@=b@ : c@=c@-3973
b@=a@+SHL(a@, -5) : If (AND(b@, 2147483648)) = 0 Then a@=b@ : c@=c@-2017
b@=a@+SHL(a@, -6) : If (AND(b@, 2147483648)) = 0 Then a@=b@ : c@=c@-1016
b@=a@+SHL(a@, -7) : If (AND(b@, 2147483648)) = 0 Then a@=b@ : c@=c@-510
a@=2147483648-a@;
c@=c@-SHL(a@, -15)
Return (c@)</syntaxhighlight>
{{Out}}
<pre>1.8461
 
0 OK, 0:638</pre>
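
The fixed-point idea used above can be sketched in Python with integers only. This is a minimal illustration, not a translation: it assumes the same scale factor of 16384 (14 fractional bits), but the names <code>fmul</code>, <code>fdiv</code>, <code>flog2</code> and <code>entropy_fixed</code> are hypothetical, and the logarithm is a simple repeated-squaring binary logarithm rather than the shift-and-add <code>_Ln</code> routine of the uBasic/4tH program.
<syntaxhighlight lang="python">
SCALE_BITS = 14
ONE = 1 << SCALE_BITS  # 16384 represents 1.0


def fmul(a, b):
    """Multiply two fixed-point numbers."""
    return (a * b) >> SCALE_BITS


def fdiv(a, b):
    """Divide a by b; for two plain integers this yields a fixed-point quotient."""
    return (a << SCALE_BITS) // b


def flog2(x):
    """Base-2 logarithm of a fixed-point x > 0, returned as fixed-point."""
    result = 0
    # Normalise x into [1.0, 2.0), tracking the integer part of the logarithm.
    while x < ONE:
        x <<= 1
        result -= ONE
    while x >= 2 * ONE:
        x >>= 1
        result += ONE
    # Extract the fractional bits by repeated squaring.
    bit = ONE >> 1
    while bit:
        x = (x * x) >> SCALE_BITS
        if x >= 2 * ONE:
            x >>= 1
            result += bit
        bit >>= 1
    return result


def entropy_fixed(s):
    """Shannon entropy of s in bits, as a fixed-point integer."""
    n = len(s)
    total = 0
    for ch in set(s):
        p = fdiv(s.count(ch), n)  # relative frequency in fixed-point
        total -= fmul(p, flog2(p))
    return total


# Convert back to a float only for display.
print(entropy_fixed("1223334444") / ONE)
</syntaxhighlight>
As in the BASIC version, the result only approximates the floating-point value of about 1.84644, because every multiplication and division truncates to 14 fractional bits.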
 
=={{header|BBC BASIC}}==
Line 430 ⟶ 493:
 
=={{header|C}}==
 
<syntaxhighlight lang="c">#include <stdio.h>
#include <stdlib.h>
Line 669 ⟶ 731:
 
=={{header|Common Lisp}}==
 
Not very Common Lisp-y version:
 
<syntaxhighlight lang="lisp">(defun entropy (string)
(let ((table (make-hash-table :test 'equal))
Line 755 ⟶ 815:
{{out}}
<pre>1.84644</pre>
 
=={{header|Delphi}}==
{{libheader| StrUtils}}
Line 821 ⟶ 882:
readln;
end.</syntaxhighlight>
 
=={{header|EasyLang}}==
<syntaxhighlight>
func entropy s$ .
len d[] 255
for c$ in strchars s$
d[strcode c$] += 1
.
for cnt in d[]
if cnt > 0
prop = cnt / len s$
entr -= (prop * log10 prop / log10 2)
.
.
return entr
.
print entropy "1223334444"
</syntaxhighlight>
 
=={{header|EchoLisp}}==
<syntaxhighlight lang="scheme">
Line 858 ⟶ 938:
 
</syntaxhighlight>
 
 
=={{header|Elena}}==
{{trans|C#}}
ELENA 6.x :
<syntaxhighlight lang="elena">import system'math;
import system'collections;
Line 880 ⟶ 959:
var table := Dictionary.new();
input.forEach::(ch)
{
var n := table[ch];
Line 894 ⟶ 973:
var freq := 0;
table.forEach::(letter)
{
freq := letter.toInt().realDiv(input.Length);
Line 996 ⟶ 1,075:
>entropy("1223334444")
1.84643934467</syntaxhighlight>
 
=={{header|Excel}}==
This solution uses the <code>LAMBDA</code>, <code>LET</code>, and <code>MAP</code> functions introduced into the Microsoft 365 version of Excel in 2021. The <code>LET</code> function is able to use functions as first class citizens. Taking advantage of this makes the solution much simpler. The solution below looks for the string in cell <code>A1</code>.
<syntaxhighlight lang="excel">
=LET(
_MainS,A1,
_N,LEN(_MainS),
_Chars,UNIQUE(MID(_MainS,SEQUENCE(LEN(_MainS),1,1,1),1)),
calcH,LAMBDA(_c,(_c/_N)*LOG(_c/_N,2)),
getCount,LAMBDA(_i,LEN(_MainS)-LEN(SUBSTITUTE(_MainS,_i,""))),
_CharMap,MAP(_Chars,LAMBDA(a, calcH(getCount(a)))),
-SUM(_CharMap)
)
</syntaxhighlight>
<code>_Chars</code> uses the <code>SEQUENCE</code> function to split the text into an array of single characters. The <code>UNIQUE</code> function then returns the distinct characters in the string.

<code>calcH</code> computes the term described at the top of the page for one character count; the terms are then summed over all distinct characters.

<code>getCount</code> uses the <code>SUBSTITUTE</code> function to count the occurrences of a character within the string.
 
If you needed to re-use this calculation then you could wrap it in a <code>LAMBDA</code> function within the name manager, changing <code>A1</code> to a variable name (e.g. <code>String</code>):
<syntaxhighlight lang="excel">
ShannonEntropyH2=LAMBDA(String,LET(_MainS,String,_N,LEN(_MainS),_Chars,UNIQUE(MID(_MainS,SEQUENCE(LEN(_MainS),1,1,1),1)),calcH,LAMBDA(_c,(_c/_N)*LOG(_c/_N,2)),getCount,LAMBDA(_i,LEN(_MainS)-LEN(SUBSTITUTE(_MainS,_i,""))),_CharMap,MAP(_Chars,LAMBDA(a, calcH(getCount(a)))),-SUM(_CharMap)))
</syntaxhighlight>
Then you can just use the named lambda. For example, if A1 = 1223334444, then:
<syntaxhighlight lang="excel">
=ShannonEntropyH2(A1)
</syntaxhighlight>
Returns 1.846439345
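
For readers without Excel, the same three steps (distinct characters, count by substitution, sum of the per-character terms) can be sketched in Python. The function and helper names below mirror the formula's names but are otherwise hypothetical, and the Python calls are only rough stand-ins for <code>UNIQUE</code>, <code>SUBSTITUTE</code> and <code>MAP</code>.
<syntaxhighlight lang="python">
from math import log2


def shannon_entropy_h2(text):
    n = len(text)

    # Stand-in for UNIQUE(MID(...)): distinct characters in first-seen order.
    chars = list(dict.fromkeys(text))

    def get_count(ch):
        # Stand-in for LEN(s) - LEN(SUBSTITUTE(s, ch, "")):
        # removing every copy of ch shortens the string by its count.
        return n - len(text.replace(ch, ""))

    def calc_h(c):
        # One term of the sum: (c/N) * log2(c/N).
        return (c / n) * log2(c / n)

    # Stand-in for -SUM(MAP(...)).
    return -sum(calc_h(get_count(ch)) for ch in chars)


print(shannon_entropy_h2("1223334444"))
</syntaxhighlight>
Counting by substitution rescans the whole string once per distinct character, which is fine for short inputs such as the task's test string.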
 
 
 
=={{header|F_Sharp|F#}}==
Line 1,055 ⟶ 1,165:
 
=={{header|Fortran}}==
 
Please find the GNU/Linux compilation instructions, along with a sample run, among the comments at the start of the Fortran 2008 source. This program acquires input from the command-line argument, thereby demonstrating the fairly new get_command_argument intrinsic subroutine. The expression of the algorithm is a rough translation of the J solution.
<syntaxhighlight lang="fortran">
Line 1,191 ⟶ 1,300:
=={{header|Fōrmulæ}}==
 
{{FormulaeEntry|page=https://formulae.org/?script=examples/Entropy}}
Fōrmulæ programs are not textual; they are visualized and edited by showing and manipulating structures rather than text. Moreover, there can be multiple visual representations of the same program. Textual representations (such as XML or JSON) are possible, but they are intended for storage and transfer rather than for visualization and editing.
 
'''Solution'''
Programs in Fōrmulæ are created and edited online on its [https://formulae.org website]. However, they run on execution servers. By default remote servers are used, but they are limited in memory and processing power, since they are intended for demonstration and casual use. A local server can be downloaded and installed; it has no limitations, since it runs on your own computer. Consequently, example programs can be fully visualized and edited, but some of them will not run if they require moderate or heavy computation or memory resources and no local server is being used.
 
[[File:Fōrmulæ - Entropy 01.png]]
On '''[https://formulae.org/?example=Entropy this]''' page you can see the program(s) related to this task and their results.
 
'''Test case'''
 
[[File:Fōrmulæ - Entropy 02.png]]
 
[[File:Fōrmulæ - Entropy 03.png]]
 
[[File:Fōrmulæ - Entropy 04.png]]
 
[[File:Fōrmulæ - Entropy 05.png]]
 
=={{header|Go}}==
Line 1,211 ⟶ 1,330:
}
 
// for ASCII strings
func H(data string) (entropy float64) {
if data == "" {
Line 1,239 ⟶ 1,359:
const s = "1223334444"
 
l := float64(0)
m := map[rune]float64{}
for _, r := range s {
m[r]++
l++
}
var hm float64
Line 1,247 ⟶ 1,369:
hm += c * math.Log2(c)
}
fmt.Println(math.Log2(l) - hm/l)
}</syntaxhighlight>
Line 1,313 ⟶ 1,434:
 
=={{header|Icon}} and {{header|Unicon}}==
 
Hmmm, the 2nd equation sums across the length of the string (for the
example, that would be the sum of 10 terms). However, the answer cited
Line 1,450 ⟶ 1,570:
2
3
4</pre>

=={{header|JavaScript}}==
;Another variant
<syntaxhighlight lang="javascript">const entropy = (s) => {
const split = s.split('');
Line 1,530 ⟶ 1,651:
=={{header|Julia}}==
{{works with|Julia|0.6}}
 
<syntaxhighlight lang="julia">entropy(s) = -sum(x -> x * log(2, x), count(x -> x == c, s) / length(s) for c in unique(s))
@show entropy("1223334444")
Line 1,538 ⟶ 1,658:
<pre>entropy("1223334444") = 1.8464393446710154
entropy([1, 2, 3, 1, 2, 1, 2, 3, 1, 2, 3, 4, 5]) = 2.103909910282364</pre>
 
=={{header|K}}==
{{works with|ngn/k}}
<syntaxhighlight lang="k">entropy: {(`ln[#x]-(+/{x*`ln@x}@+/{x=\:?x}x)%#x)%`ln@2}
 
entropy "1223334444"</syntaxhighlight>
{{out}}
<pre>1.8464393446710161</pre>
 
=={{header|Kotlin}}==
<syntaxhighlight lang="kotlin">// version 1.0.6
 
fun log2(d: Double) = Math.log(d) / Math.log(2.0)
Line 1,573 ⟶ 1,701:
for (sample in samples) println("${sample.padEnd(36)} -> ${"%18.16f".format(shannon(sample))}")
}</syntaxhighlight>
 
{{out}}
<pre>
Line 1,586 ⟶ 1,713:
Rosetta Code -> 3.0849625007211556
</pre>
 
=={{header|Ksh}}==
{{works with|ksh93}}
<syntaxhighlight lang="ksh">function entropy {
typeset -i i len=${#1}
typeset -X13 r=0
typeset -Ai counts
 
for ((i = 0; i < len; ++i))
do
counts[${1:i:1}]+=1
done
for i in "${counts[@]}"
do
r+='i * log2(i)'
done
r='log2(len) - r / len'
print -r -- "$r"
}
 
printf '%g\n' "$(entropy '1223334444')"</syntaxhighlight>
{{out}}
<pre>1.84644</pre>
 
=={{header|Lambdatalk}}==
Line 1,805 ⟶ 1,955:
{{out}}
<pre> 1.8464394E+00</pre>
 
=={{header|NetRexx}}==
{{trans|REXX}}
Line 1,906 ⟶ 2,057:
 
=={{header|Objeck}}==
 
<syntaxhighlight lang="objeck">use Collection;
 
Line 1,968 ⟶ 2,118:
 
=={{header|OCaml}}==
;By using a map, purely functional
<syntaxhighlight lang="ocaml">module CharMap = Map.Make(Char)
 
let entropy s =
let count map c =
CharMap.update c (function Some n -> Some (n +. 1.) | None -> Some 1.) map
and calc _ n sum =
sum +. n *. Float.log2 n
in
let sum = CharMap.fold calc (String.fold_left count CharMap.empty s) 0.
and len = float (String.length s) in
Float.log2 len -. sum /. len
 
let () =
entropy "1223334444" |> string_of_float |> print_endline</syntaxhighlight>
;By using a mutable Hashtbl
<syntaxhighlight lang="ocaml">
(* pre-bake & return an inner-loop function to bin & assemble a character frequency map *)
let get_fproc (m: (char, int) Hashtbl.t) :(char -> unit) =
Line 1,999 ⟶ 2,164:
-1.0 *. List.fold_left (fun b x -> b +. calc x) 0.0 relative_probs
</syntaxhighlight>
{{out}}
<pre>1.84643934467</pre>
 
=={{header|Oforth}}==
 
<syntaxhighlight lang="oforth">: entropy(s) -- f
| freq sz |
Line 2,061 ⟶ 2,223:
 
=={{header|Pascal}}==
 
Free Pascal (http://freepascal.org).
 
<syntaxhighlight lang="pascal">
PROGRAM entropytest;
Line 2,169 ⟶ 2,329:
 
=={{header|PHP}}==
 
<syntaxhighlight lang="php"><?php
 
Line 2,318 ⟶ 2,477:
 
=={{header|Prolog}}==
 
{{works with|Swi-Prolog|7.3.3}}
 
This solution calculates the run-length encoding of the input string to get the relative frequencies of its characters.
 
<syntaxhighlight lang="prolog">:-module(shannon_entropy, [shannon_entropy/2]).
 
Line 2,539 ⟶ 2,695:
 
===Python: More succinct version===
 
The <tt>Counter</tt> class is only available in Python >= 2.7; <tt>math.log2</tt> requires Python >= 3.3.
<syntaxhighlight lang="python">from math import log2
from collections import Counter

def entropy(s):
    # p maps each character to its count; lns is the string length as a float
    p, lns = Counter(s), float(len(s))
    return log2(lns) - sum(count * log2(count) for count in p.values()) / lns

print(entropy("1223334444"))</syntaxhighlight>
{{out}}
<pre>1.8464393446710154</pre>
 
===Uses Python 2===
Line 2,575 ⟶ 2,729:
 
=={{header|R}}==
 
<syntaxhighlight lang="rsplus">
entropy <- function(str) {
Line 2,885 ⟶ 3,038:
Entropy of 1223334444 is 1.84643934 bits.
The result should be around 1.84644 bits.
</pre>
 
=={{header|RPL}}==
{{works with|Halcyon Calc|4.2.7}}
{| class="wikitable"
! Code
! Comments
|-
|
≪ DUP SIZE 2 LN → str len log2
≪ { 255 } 0 CON
1 len '''FOR''' j
str j DUP SUB
NUM DUP2 GET 1 + PUT
'''NEXT'''
0 1 255 '''FOR''' j
'''IF''' OVER j GET
'''THEN''' LAST len / DUP LN log2 / * + '''END'''
'''NEXT'''
NEG SWAP DROP
≫ ≫ '<span style="color:blue">NTROP</span>' STO
|
<span style="color:blue">NTROP</span> ''( "string" -- entropy )''
Initialize local variables
Initialize a vector with 255 counters
For each character in the string...
... increase the counter according to ASCII code
For each non-zero counter
calculate term
Change sign and forget the vector
|}
The following line of code delivers what is required:
"1223334444" <span style="color:blue">NTROP</span>
{{out}}
<pre>
1: 1.84643934467
</pre>
 
=={{header|Ruby}}==
{{works with|Ruby|2.7}}
<syntaxhighlight lang="ruby">def entropy(s)
  counts = s.chars.tally
  leng = s.length.to_f
counts.values.reduce(0) do |entropy, count|
freq = count / leng
Line 2,905 ⟶ 3,098:
1.8464393446710154
</pre>
One-liner, same performance (or better):
<syntaxhighlight lang="ruby">def entropy2(s)
s.each_char.group_by(&:to_s).values.map { |x| x.length / s.length.to_f }.reduce(0) { |e, x| e - x*Math.log2(x) }
end</syntaxhighlight>
 
=={{header|Run BASIC}}==
Line 3,099 ⟶ 3,288:
1.84644
</pre>
 
=={{header|SETL}}==
<syntaxhighlight lang="setl">program shannon_entropy;
print(entropy "1223334444");
 
op entropy(symbols);
hist := {};
loop for symbol in symbols do
hist(symbol) +:= 1;
end loop;
h := 0.0;
loop for count = hist(symbol) do
f := count / #symbols;
h -:= f * log f / log 2;
end loop;
return h;
end op;
end program; </syntaxhighlight>
{{out}}
<pre>1.84643934467102</pre>
 
=={{header|Sidef}}==
Line 3,130 ⟶ 3,339:
Entropy "1223334444" ;
val it = 1.846439345: real
 
=={{header|Swift}}==
<syntaxhighlight lang="swift">import Foundation
Line 3,169 ⟶ 3,379:
</pre>
 
=={{header|V (Vlang)}}==
 
===Vlang: Map version===
<syntaxhighlight lang="v (vlang)">import math
import arrays
 
Line 3,208 ⟶ 3,417:
=={{header|Wren}}==
{{trans|Go}}
<syntaxhighlight lang="wren">var s = "1223334444"
var m = {}
for (c in s) {