Arithmetic coding/As a generalized change of radix: Difference between revisions

J: clean up noise from mis-reading of wikipedia description
m (J: link to the wikipedia text which the task description uses to specify this task)
(J: clean up noise from mis-reading of wikipedia description)
Line 193:
 
<lang J>aek=:3 :0
b=. x:#y
f=. (#/.~{~~.i.]) y
c=. _65+a.i.y
L=. b #. c**/\1,}:f
p=. */f
e=. x:<.10^.p
e,~<.(L+p)%10^e
)</lang>
 
Results are number pairs where the second number is the number of zeros to append to the first.
 
Example use:
 
<lang J> aek 'DABDDB'
251 2
aek 'DABDDBBDDBA'
83677 6
aek 'ABRACADABRA'
4860830 4
aek 'TOBEORNOTTOBEORTOBEORNOT'
1224972022395235072 15</lang>
 
(See talk page.)
 
If, however, we ignore the [[wp:Arithmetic_coding#Arithmetic_coding_as_a_generalized_change_of_radix|wikipedia suggestion]] that letters have predefined values (e.g. C = 2), but instead use cumulative frequency (in order from least common to most common and leaving in order of appearance for letters with the same frequencies, we get this:
 
<lang J>aekc=:3 :0
b=. x:#y
i=. (~.i.])y
n=. #/.~y
f=. i{n
c=. i{(+/\0,}:/:~n)/:/:n
L=. b #. c**/\1,}:f
p=. */f
e=. x:<.10^.p
e,~<.(L+p)%10^e
)</lang>
 
With encoded values:
 
<lang J> aekc 'DABDDB'
251 2
aekc 'DABDDBBDDBA'
167351 6
aekc 'ABRACADABRA'
19022571 4
aekc 'TOBEORNOTTOBEORTOBEORNOT'
807796941974181890 15</lang>
 
If we instead use cumulative frequencies, cumulated strictly in alphabetic order, we get this:
 
<lang J>aeka=:3 :0
b=. x:#y
i=. (~.i.])y
Line 257 ⟶ 205:
)</lang>
 
<lang J> aekaaek 'DABDDB'
251 2
aekaaek 'DABDDBBDDBA'
167351 6
aeka 'ABRACADABRA'
7954170 4
aekaaek 'TOBEORNOTTOBEORTOBEORNOT'
1150764267498783364 15</lang>
 
6,962

edits