Burrows–Wheeler transform: Difference between revisions

m
syntax highlighting fixup automation
m (→‎{{header|Phix}}: added syntax colouring, marked p2js compatible)
m (syntax highlighting fixup automation)
Line 18:
{{trans|Python}}
 
<langsyntaxhighlight lang=11l>F bwt(String =s)
‘Apply Burrows-Wheeler transform to input string.’
assert("\002" !C s & "\003" !C s, ‘Input string cannot contain STX and ETX characters’)
Line 45:
print(‘After transformation: ’transformed.replace("\2", ‘^’).replace("\3", ‘|’))
print(‘After inverse transformation: ’invTransformed)
print()</langsyntaxhighlight>
 
{{out}}
Line 72:
 
=={{header|BQN}}==
<langsyntaxhighlight lang=BQN>stx←@+2
BWT ← { # Burrows-Wheeler Transform and its inverse as an invertible function
𝕊: "Input contained STX"!⊑stx¬∘∊𝕩 ⋄ (⍋↓𝕩) ⊏ stx∾𝕩;
𝕊⁼: 1↓(⊑˜⍟(↕≠𝕩)⟜⊑ ⍋)⊸⊏ 𝕩
}
</syntaxhighlight>
</lang>
Example use:
<syntaxhighlight lang=text> BWT "banana"
"annb␂aa"
BWT⁼ BWT "banana"
Line 106:
BWT "␂abc"
Error: Input contained STX
</syntaxhighlight>
</lang>
 
=={{header|C}}==
{{trans|Python}}
<langsyntaxhighlight lang=c>#include <string.h>
#include <stdio.h>
#include <stdlib.h>
Line 208:
}
return 0;
}</langsyntaxhighlight>
 
{{out}}
Line 239:
=={{header|C sharp|C#}}==
{{trans|D}}
<langsyntaxhighlight lang=csharp>using System;
using System.Collections.Generic;
using System.Linq;
Line 338:
}
}
}</langsyntaxhighlight>
{{out}}
<pre>banana
Line 366:
=={{header|C++}}==
{{trans|C#}}
<langsyntaxhighlight lang=cpp>#include <algorithm>
#include <iostream>
#include <vector>
Line 464:
 
return 0;
}</langsyntaxhighlight>
{{out}}
<pre>banana
Line 492:
=={{header|D}}==
{{trans|Kotlin}}
<langsyntaxhighlight lang=d>import std.algorithm.iteration;
import std.algorithm.mutation;
import std.algorithm.searching;
Line 565:
writeln;
}
}</langsyntaxhighlight>
{{out}}
<pre>banana
Line 593:
=={{header|Factor}}==
Factor has a Burrows-Wheeler transform implementation in its standard library. In addition to the transformed sequence, the <code>bwt</code> word also outputs an index for use with the inverse <code>ibwt</code> word.
<langsyntaxhighlight lang=factor>USING: formatting io kernel math.transforms.bwt sequences ;
{
"banana" "dogwood" "TO BE OR NOT TO BE OR WANT TO BE OR NOT?"
Line 601:
2dup " bwt-->%3d %u\n" printf
ibwt " ibwt-> %u\n" printf nl
] each</langsyntaxhighlight>
{{out}}
<pre>
Line 623:
=={{header|Go}}==
{{trans|Python}}
<langsyntaxhighlight lang=go>package main
 
import (
Line 697:
fmt.Println(" -->", r, "\n")
}
}</langsyntaxhighlight>
 
{{out}}
Line 728:
=={{header|Groovy}}==
{{trans|Java}}
<langsyntaxhighlight lang=groovy>class BWT {
private static final String STX = "\u0002"
private static final String ETX = "\u0003"
Line 801:
}
}
}</langsyntaxhighlight>
{{out}}
<pre>banana
Line 828:
 
=={{header|Haskell}}==
<langsyntaxhighlight lang=haskell>-- A straightforward, inefficient implementation of the Burrows–Wheeler
-- transform, based on the description in the Wikipedia article.
--
Line 872:
prettyVal (In c) = c
prettyVal Pre = '^'
prettyVal Post = '|'</langsyntaxhighlight>
 
{{out}}
Line 929:
</pre>
 
<syntaxhighlight lang=text>
NB. articulate definition
NB. local names (assignment =.) won't pollute the namespace
Line 968:
EMPTY
)
</syntaxhighlight>
</lang>
 
=={{header|Java}}==
{{trans|Kotlin}}
<langsyntaxhighlight lang=java>import java.util.ArrayList;
import java.util.List;
 
Line 1,048:
}
}
}</langsyntaxhighlight>
{{out}}
<pre>banana
Line 1,078:
{{works with|jq}}
'''Works with gojq, the Go implementation of jq'''
<langsyntaxhighlight lang=jq># substitute ^ for STX and | for ETX
def makePrintable:
if . == null then null
Line 1,103:
| first( .[] | select(endswith("\u0003")))
| .[1:-1] ;
</syntaxhighlight>
</lang>
'''Tests'''
<syntaxhighlight lang=jq>
<lang jq>
def tests:
(
Line 1,121:
(if $t then " --> \($t | ibwt)\n" else "" end) ;
 
tests</langsyntaxhighlight>
{{out}}
<pre>
Line 1,149:
 
=={{header|Julia}}==
<langsyntaxhighlight lang=julia>bwsort(vec) = sort(vec, lt = (a, b) -> string(a) < string(b))
 
function burrowswheeler_encode(s)
Line 1,176:
"\nInverse transformation: ", burrowswheeler_decode(burrowswheeler_encode(s)), "\n")
end
</langsyntaxhighlight>{{out}}
<pre>
Original: BANANA
Line 1,199:
=={{header|Kotlin}}==
{{trans|Python}}
<langsyntaxhighlight lang=scala>// Version 1.2.60
 
const val STX = "\u0002"
Line 1,259:
println(" --> $r\n")
}
}</langsyntaxhighlight>
 
{{output}}
Line 1,289:
 
=={{header|Ksh}}==
<langsyntaxhighlight lang=ksh>
#!/bin/ksh
 
Line 1,396:
echo
done
</syntaxhighlight>
</lang>
{{out}}<pre>
BANANA
Line 1,423:
=={{header|Lua}}==
{{trans|Java}}
<langsyntaxhighlight lang=lua>STX = string.char(tonumber(2,16))
ETX = string.char(tonumber(3,16))
 
Line 1,507:
end
 
main()</langsyntaxhighlight>
{{out}}
<pre>banana
Line 1,534:
 
=={{header|Mathematica}} / {{header|Wolfram Language}}==
<langsyntaxhighlight lang=Mathematica>ClearAll[BurrowWheeler, InverseBurrowWheeler]
BurrowWheeler[sin_String, {bdelim_, edelim_}] := Module[{s},
s = Characters[bdelim <> sin <> edelim];
Line 1,555:
InverseBurrowWheeler[%, {"^", "|"}]
BurrowWheeler["dogwood", {"^", "|"}]
InverseBurrowWheeler[%, {"^", "|"}]</langsyntaxhighlight>
{{out}}
<pre>|ANNB^AA
Line 1,567:
Translation of the Wikipedia Python algorithm. The program uses characters '¹' and '²' to display STX and ETX.
 
<langsyntaxhighlight lang=Nim>import algorithm
from sequtils import repeat
import strutils except repeat
Line 1,633:
echo "Original text: ", text
echo "After transformation: ", transformed.displaybleString()
echo "After inverse transformation: ", invTransformed</langsyntaxhighlight>
 
{{out}}
Line 1,661:
 
The first character in a Pascal string has index 1, but it's more convenient to describe the algorithm in terms of zero-based indices. The constant STR_BASE = 1 is used to make it clear where Pascal usage has been taken into account.
<langsyntaxhighlight lang=pascal>
program BurrowsWheeler;
 
Line 1,845:
Test( 'SIX.MIXED.PIXIES.SIFT.SIXTY.PIXIE.DUST.BOXES');
end.
</syntaxhighlight>
</lang>
{{out}}
<pre>
Line 1,886:
=={{header|Perl}}==
{{trans|Raku}}
<langsyntaxhighlight lang=perl>use utf8;
binmode STDOUT, ":utf8";
 
Line 1,921:
}
 
print join "\n", @res;</langsyntaxhighlight>
{{out}}
<pre>Original: BANANA
Line 1,942:
An efficient implementation, based mainly on http://spencer-carroll.com/an-easy-to-understand-explanation-of-the-burrows-wheeler-transform/ <br>
Takes around about ten seconds to transform and invert a 100K string. Note: requires 0.8.0+
<!--<langsyntaxhighlight lang=Phix>(phixonline)-->
<span style="color: #000080;font-style:italic;">-- demo\rosetta\burrows_wheeler.exw
--/*
Line 2,088:
<span style="color: #000000;">test</span><span style="color: #0000FF;">(</span><span style="color: #008000;">"TO BE OR NOT TO BE OR WANT TO BE OR NOT?"</span><span style="color: #0000FF;">)</span>
<span style="color: #000000;">test</span><span style="color: #0000FF;">(</span><span style="color: #008000;">"SIX.MIXED.PIXIES.SIFT.SIXTY.PIXIE.DUST.BOXES"</span><span style="color: #0000FF;">)</span>
<!--</langsyntaxhighlight>-->
{{out}}
<pre>
Line 2,108:
Ref: [https://www.codespeedy.com/burrows-wheeler-transform-in-python/ Burrows Wheeler Transform in Python]
 
<langsyntaxhighlight lang=Python>
def bwt(s):
"""Apply Burrows-Wheeler transform to input string."""
Line 2,125:
s = [row for row in table if row.endswith("\003")][0] # Find the correct row (ending in ETX)
return s.rstrip("\003").strip("\002") # Get rid of start and end markers
</syntaxhighlight>
</lang>
 
=={{header|Raku}}==
Line 2,131:
{{works with|Rakudo|2018.06}}
 
<syntaxhighlight lang=raku perl6line># STX can be any character that doesn't appear in the text.
# Using a visible character here for ease of viewing.
Line 2,158:
say 'Inverse transformed: ', ɯɹoɟsuɐɹʇ transform $phrase;
say '';
}</langsyntaxhighlight>
{{out}}
<pre>Original: BANANA
Line 2,182:
Programming note: &nbsp; a little bit of code was added to support more (legal) characters in the input string for the '''BWT'''
<br>function. &nbsp; The error recovery and error messages are rudimentary when an illegal character in the input is detected.
<langsyntaxhighlight lang=rexx>/*REXX program performs a Burrows─Wheeler transform (BWT) on a character string(s). */
$.= /*the default text for (all) the inputs*/
parse arg $.1 /*obtain optional arguments from the CL*/
Line 2,239:
_= @.j; @.j= @.k; @.k= _; ok= 0 /*swap two elements; flag as not done.*/
end /*j*/
end /*m*/; return</langsyntaxhighlight>
{{out|output|text=&nbsp; when using the default inputs:}}
<pre>
Line 2,272:
=={{header|Ruby}}==
{{trans|C#}}
<langsyntaxhighlight lang=ruby>STX = "\u0002"
ETX = "\u0003"
 
Line 2,341:
end
 
main()</langsyntaxhighlight>
{{out}}
<pre>banana
Line 2,368:
 
=={{header|Rust}}==
<langsyntaxhighlight lang=rust>
use core::cmp::Ordering;
 
Line 2,466:
}
}
</syntaxhighlight>
</lang>
{{out}}
<pre>
Line 2,482:
=={{header|Scala}}==
{{trans|Kotlin}}
<langsyntaxhighlight lang=scala>import scala.collection.mutable.ArrayBuffer
 
object BWT {
Line 2,547:
})
}
}</langsyntaxhighlight>
{{out}}
<pre>banana
Line 2,574:
=={{header|Sidef}}==
{{trans|Python}}
<langsyntaxhighlight lang=ruby>class BurrowsWheelerTransform (String L = "\002") {
 
method encode(String s) {
Line 2,602:
say "BWT(#{dec.dump}) = #{enc.dump}"
assert_eq(str, dec)
}</langsyntaxhighlight>
{{out}}
<pre>
Line 2,616:
{{trans|Kotlin}}
 
<langsyntaxhighlight lang=swift>import Foundation
 
private let stx = "\u{2}"
Line 2,670:
 
print("\(readableBwt(test)) -> \(readableBwt(b)) -> \(readableBwt(c))")
}</langsyntaxhighlight>
 
{{out}}
Line 2,683:
=={{header|Visual Basic .NET}}==
{{trans|C#}}
<langsyntaxhighlight lang=vbnet>Module Module1
 
ReadOnly STX As Char = Chr(&H2)
Line 2,782:
End Sub
 
End Module</langsyntaxhighlight>
{{out}}
<pre>banana
Line 2,811:
{{trans|Go}}
{{libheader|Wren-sort}}
<langsyntaxhighlight lang=ecmascript>import "/sort" for Sort
 
var stx = "\x02"
Line 2,868:
var r = ibwt.call(t)
System.print(" --> %(r)\n")
}</langsyntaxhighlight>
 
{{out}}
Line 2,898:
 
=={{header|zkl}}==
<langsyntaxhighlight lang=zkl>class BurrowsWheelerTransform{
fcn init(chr="$"){ var special=chr; }
fcn encode(str){
Line 2,914:
table.filter1("%s*".fmt(special).glob)[1,*]; // str[0]==$, often first element
}
}</langsyntaxhighlight>
<langsyntaxhighlight lang=zkl>BWT:=BurrowsWheelerTransform();
//BWT.encode("$"); // --> assert(bbb.zkl:25): String cannot contain char "$"
 
Line 2,924:
enc:=BWT.encode(test);
println("%s\n -->%s\n -->%s".fmt(test,enc,BWT.decode(enc)));
}</langsyntaxhighlight>
{{out}}
<pre>
10,327

edits