Burrows–Wheeler transform: Difference between revisions

Content added Content deleted
(reorder)
Line 804: Line 804:


=={{header|J}}==
=={{header|J}}==
Working in integer space, all ASCII characters are allowed. Less useful for compression this way.
<pre>
<pre>
NB. transform and inverse
NB. transform and inverse
bwt=: {:"1@:(/:~)@:(|."0 1~ -@:i.@:#)@:(256 , ,&257)@:(a.&([: :i.)) :.(a. {~ }.@:}:@:({~ (257 [: :i.~ {:"1))@:((,"0 1 /:~)^:(#@[)&(0$00)))
bwt=: {:"1@:(/:~ :[:)@:(|."0 1~ -@:i.@:#)@:((2{a.) , ,&(3{a.))@:(([ ('STX or ETX invalid in input' (13!:8) 21 + 0:))^:(1 e. (2 3{a.)&e.)) :.(}.@:}:@:({~ ((3{a.) [: :i.~ {:"1))@:((,"0 1 /:~ :[:)^:(#@[)&(0$00)))


NB. demonstrate the transform
NB. demonstrate the transform
> A=: <@bwt ;._2 ] 0 :0

,. A=: <@bwt ;._2 ] 0 :0
tests[0] = "banana";
tests[0] = "banana";
tests[1] = "appellee";
tests[1] = "appellee";
Line 818: Line 816:
tests[4] = "SIX.MIXED.PIXIES.SIFT.SIXTY.PIXIE.DUST.BOXES",
tests[4] = "SIX.MIXED.PIXIES.SIFT.SIXTY.PIXIE.DUST.BOXES",
)
)
�;� =] a [" s0nnb"taate s
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
�;� =] e [" s1"elptlepate s
│256 32 32 61 93 32 97 32 91 34 32 115 48 110 110 98 34 116 97 97 116 101 32 115 257 59 │
�;� =] d [" s2o"toodwte sg
├──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
�;� =]OOORREEETTR ? [" TW BBB ATTT NNOOONOO" s3tte s
│256 32 32 61 93 32 101 32 91 34 32 115 49 34 101 108 112 116 108 101 112 97 116 101 32 115 257 59 │
�,� =] S "TEXYDST[ .E.IXXIIXXSSMPPS.B..EE.".USFXDIIOIIITs4tte s
├──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│256 32 32 61 93 32 100 32 91 34 32 115 50 111 34 116 111 111 100 119 116 101 32 115 103 257 59 │
├──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│256 32 32 61 93 79 79 79 82 82 69 69 69 84 84 82 32 63 32 91 34 32 84 87 32 32 32 66 66 66 32 32 65 84 84 84 32 32 32 78 78 79 79 79 78 79 79 34 32 32 32 115 51 116 116 101 32 115 257 59 │
├──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│256 32 32 61 93 32 83 32 34 84 69 88 89 68 83 84 91 32 46 69 46 73 88 88 73 73 88 88 83 83 77 80 80 83 46 66 46 46 69 69 46 34 46 85 83 70 88 68 73 73 79 73 73 73 84 115 52 116 116 101 32 115 257 44│
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘


NB. show the inverse transform
NB. and the restoring transformation
bwt inv &> A
bwt inv&> A
tests[0] = "banana";
tests[0] = "banana";
tests[1] = "appellee";
tests[1] = "appellee";
Line 840: Line 832:
NB. the final test pattern
NB. the final test pattern
bwt a. {~ 2 , (a. i. 'ABC') , 3
bwt a. {~ 2 , (a. i. 'ABC') , 3
|STX or ETX invalid in input: bwt
256 67 2 65 66 257 3
| bwt a.{~2,(a.i.'ABC'),3

bwt inv 256 67 2 65 66 257 3
�ABC�
</pre>
</pre>


<lang>
<lang>
NB. articulate definition
NB. transform articulately defined
NB. local names (assignment =.) won't pollute the namespace
NB. local names (assignment =.) won't pollute the namespace


3 :0'define Burrows-Wheeler transformations'
3 :0'define Burrows-Wheeler transformations'
Dyad=. [: :
Dyad=. [: :
Monad=. :[:
index=. i. Dyad
index=. i. Dyad
Rank=. "
Rank=. "
sort=. /:~
sort=. /:~ Monad


signal_error=. 13!:8
VALUE_ERROR=. 21
'STX ETX'=. 2 3 { a.
verify=. ([ ('STX or ETX invalid in input' signal_error VALUE_ERROR + 0:))^:(1 e. (STX , ETX)&e.)
rotations=. |."0 1~ -@:i.@:#
rotations=. |."0 1~ -@:i.@:#
tail=. {:
tail=. {:
mark=. 256 , ,&257
mark=. STX , ,&ETX
forward=. tail Rank 1 @: sort @: rotations @: mark @: verify
integer=. a.&index
EXPECT=. ETX , 'ANNB' , STX , 'AA'
forward=. tail Rank 1 @: sort @: rotations @: mark @: integer
assert. 66 78 78 256 65 65 257 65 -: forward 'BANANA'
assert. EXPECT -: forward 'BANANA'


unscramble=. (,Rank 0 1 sort)^:(#@[)&(i.0)
unscramble=. (,Rank 0 1 sort)^:(#@[)&(i.0)
find=. 257 index~ tail Rank 1
find=. ETX index~ tail Rank 1
from=. {
from=. {
behead=. }.
behead=. }.
curtail=. }:
curtail=. }:
erase_mark =. behead @: curtail
erase_mark =. behead @: curtail
inverse=. (a. from~ erase_mark @: (from~ find) @: unscramble)
reverse=. erase_mark @: (from~ find) @: unscramble
assert. 'BANANA' -: inverse 66 78 78 256 65 65 257 65
assert. 'BANANA' -: reverse EXPECT


obverse=. :.
obverse=. :.
fixed=. f.
fixed=. f.
bwt=: forward obverse inverse fixed
bwt=: forward obverse reverse fixed


assert (-: ]&.:bwt)'same under transformation'
assert (-: ]&.:bwt)'same under transformation'