Multisplit: Difference between revisions
Thundergnat (talk | contribs) m (syntax highlighting fixup automation) |
Line 28:
{{trans|Python}}
<
V lastmatch = 0
V i = 0
Line 48:
R matches
print(multisplit(‘a!===b=!=c’, [‘==’, ‘!=’, ‘=’]))</
{{out}}
Line 57:
=={{header|Ada}}==
multisplit.adb:
<
with Ada.Text_IO;
Line 137:
Pos := String_Lists.Next (Pos);
end loop;
end Multisplit;</
{{out}}
Line 144:
=={{header|ALGOL 68}}==
<
# MODE to hold the split results #
Line 209:
SPLITINFO token = test tokens[ t ];
print( ( "token: [", text OF token, "] at: ", whole( position OF token, 0 ), " delimiter: (", delimiter OF token, ")", newline ) )
OD</
{{out}}
<pre>
Line 221:
=={{header|Arturo}}==
<
{{out}}
Line 228:
=={{header|AutoHotkey}}==
<
Sep := ["==","!=", "="]
Res := StrSplit(Str, Sep)
Line 236:
for k, v in Sep
N .= (N?"|":"") "\Q" v "\E"
MsgBox % RegExReplace(str, "(.*?)(" N ")", "$1 {$2}")</
{{out}}
<pre>a,,b,,c
Line 242:
=={{header|AWK}}==
<syntaxhighlight lang="awk">
# syntax: GAWK -f MULTISPLIT.AWK
BEGIN {
Line 266:
exit(0)
}
</syntaxhighlight>
{{out}}
<pre>
Line 280:
=={{header|BBC BASIC}}==
<
sep$() = "==", "!=", "="
PRINT "String splits into:"
Line 303:
ENDIF
UNTIL m% = LEN(s$)
= o$ + """" + MID$(s$, p%) + """"</
{{out}}
<pre>
Line 314:
=={{header|Bracmat}}==
This is a surprisingly difficult task to solve in Bracmat, because in a naive solution using an alternating pattern ("=="|"!="|"=") the shorter pattern <code>"="</code> would take precedence over <code>"=="</code>. In the solution below, the function <code>oneOf</code> iterates (by recursion) over the operators, trying to match the start of the current subject string <code>sjt</code> against one operator at a time, until it succeeds or reaches the end of the list of operators, whichever comes first. If no operator is found at the start of the current subject string, the variable <code>nonOp</code> is extended by one byte, thereby shifting the start of the current subject string one byte to the right. Then a new attempt is made to find an operator. This is repeated until either an operator is found, in which case the unparsed string is restricted to the part of the input after the found operator, or no operator is found, in which case the <code>whl</code> loop terminates.
<
= operator
. !arg:%?operator ?arg
Line 331:
& put$!unparsed
& put$\n
);</
{{out}}
<pre>a {!=} {==} b {=} {!=} c</pre>
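The scanning strategy described for the Bracmat solution can be sketched in Python for comparison. This is only an illustration under stated assumptions, not a translation of the Bracmat code: the names <code>multisplit_scan</code> and <code>non_op</code> are hypothetical, and the sketch advances one character at a time where the Bracmat text speaks of one byte.

```python
def multisplit_scan(text, operators):
    """Scan left to right; at each position try the operators in list order."""
    pieces = []
    non_op = ""          # text collected since the last operator (hypothetical name)
    i = 0
    while i < len(text):
        # First operator, in priority order, that matches at position i.
        op = next((o for o in operators if text.startswith(o, i)), None)
        if op is None:
            non_op += text[i]    # no operator here: shift one character right
            i += 1
        else:
            pieces.append(non_op)
            pieces.append("{" + op + "}")
            non_op = ""
            i += len(op)
    pieces.append(non_op)        # whatever remains after the last operator
    return pieces


print(" ".join(multisplit_scan("a!===b=!=c", ["==", "!=", "="])))
```

Because the operators are tried in list order at every position, <code>"=="</code> wins over <code>"="</code> whenever both could match.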
Line 337:
=={{header|C}}==
What kind of silly parsing is this?
<
#include <string.h>
Line 360:
return 0;
}</
{{out}}<syntaxhighlight lang="text">a{!=}{==}b{=}{!=}c</syntaxhighlight>
=={{header|C sharp}}==
Line 367:
'''Extra Credit Solution'''
<
using System.Collections.Generic;
using System.Linq;
Line 423:
}
}
}</
{{out}}
Line 431:
=={{header|C++}}==
Using the Boost library tokenizer:
<
#include <boost/tokenizer.hpp>
#include <string>
Line 449:
std::cout << '\n' ;
return 0 ;
}</
{{out}}
<PRE>a b c</PRE>
=={{header|CoffeeScript}}==
<
multi_split = (text, separators) ->
# Split text up, using separators to break up text and discarding
Line 485:
console.log multi_split 'a!===b=!=c', ['==', '!=', '='] # [ 'a', '', 'b', '', 'c' ]
console.log multi_split '', ['whatever'] # [ '' ]
</syntaxhighlight>
=={{header|D}}==
<
string[] multiSplit(in string s, in string[] divisors) pure nothrow {
Line 525:
.join(" {} ")
.writeln;
}</
{{out}} (separator locations indicated by braces):
<pre>a {} {} b {} {} c</pre>
=={{header|Delphi}}==
{{libheader| System.SysUtils}}
<syntaxhighlight lang="delphi">
program Multisplit;
Line 544:
write(']');
readln;
end.</
{{out}}
<pre>["a" "" "b" "" "c" ]</pre>
=={{header|Elixir}}==
{{trans|Erlang}}
<
["a", "", "", "b", "", "c"]</
=={{header|Erlang}}==
Line 562:
If we ignore the "Extra Credit" requirements and skip the 'ordered separators' condition (i.e. solve an entirely different task), this is exactly what one of the overloads of .NET's <code>String.Split</code> method does. Using F# Interactive:
<
val it : string [] = [|"a"; ""; "b"; ""; "c"|]
> "a!===b=!=c".Split([|"="; "!="; "=="|], System.StringSplitOptions.None);;
val it : string [] = [|"a"; ""; ""; "b"; ""; "c"|]</
<code>System.StringSplitOptions.None</code> specifies that empty strings should be included in the result.
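The effect of separator order on the result can also be illustrated with a regular-expression alternation. The following Python sketch is an assumption-laden comparison, not part of the F# solution: the helper name <code>split_in_order</code> is hypothetical, and it relies on Python's <code>re</code> module trying alternatives from left to right.

```python
import re


def split_in_order(text, separators):
    """Split on an alternation whose branches follow the given separator order."""
    pattern = "|".join(re.escape(s) for s in separators)
    return re.split(pattern, text)


# "==" listed before "=": the pair "==" is consumed as one separator.
print(split_in_order("a!===b=!=c", ["==", "!=", "="]))
# "=" listed first: "==" is consumed as two separate "=" matches.
print(split_in_order("a!===b=!=c", ["=", "!=", "=="]))
```

Empty strings are kept in both results, which corresponds to the behaviour selected by <code>System.StringSplitOptions.None</code>.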
=={{header|Factor}}==
<
IN: rosetta-code.multisplit
Line 585:
length -rot cut-slice [ , ] dip swap tail-slice
] while 2drop ,
] { } make ;</
{{out}}
Line 594:
=={{header|FreeBASIC}}==
FreeBASIC does not have a built-in 'split' function, so we need to write one:
<
Sub Split(s As String, sepList() As String, result() As String, removeEmpty As Boolean = False, showSepInfo As Boolean = False)
Line 667:
Print
Print "Press any key to quit"
Sleep</
{{out}}
Line 688:
=={{header|Go}}==
<
import (
Line 713:
func main() {
fmt.Printf("%q\n", ms("a!===b=!=c", []string{"==", "!=", "="}))
}</
{{out}}
<pre>
Line 720:
=={{header|Haskell}}==
<
( genericLength,
intercalate,
Line 756:
"with [(string, delimiter, offset)]:",
show parsed
]</
{{out}}
<pre>split string:
Line 765:
Or as a fold:
<
import Data.Bool (bool)
Line 782:
in reverse $ (ts, [], length s) : ps
main :: IO ()
main = print $ multiSplit ["==", "!=", "="] "a!===b=!=c"</
{{Out}}
<pre>[("a","!=",1),("","==",3),("b","=",6),("","!=",7),("c","",10)]</pre>
=={{header|Icon}} and {{header|Unicon}}==
<
s := "a!===b=!=c"
# just list the tokens
Line 807:
procedure arb()
suspend .&subject[.&pos:&pos <- &pos to *&subject + 1]
end</
{{out}}
Line 814:
=={{header|J}}==
<
'begin sep'=. |:bs=. _,~/:~;(,.&.>i.@#) y I.@E.L:0 x NB.
len=. #@>y NB.
Line 827:
r=.r,.txt;(s{::y);b
end.
}}</
Explanation:
Line 839:
Example use:
<
┌──┬──┬─┬──┬─┐
│a │ │b│ │c│
Line 862:
├─┼──┼─┤
│1│2 │ │
└─┴──┴─┘</
=={{header|Java}}==
<
public class MultiSplit {
Line 897:
return result;
}
}</
<pre>Regex split:
Line 909:
Based on Ruby example.
{{libheader|Underscore.js}}
<
return text.replace(/[-[\]{}()*+?.,\\^$|#\s]/g, "\\$&");
}
Line 916:
var sep_regex = RegExp(_.map(seps, function(sep) { return RegExp.escape(sep); }).join('|'));
return string.split(sep_regex);
}</
===ES6===
Line 922:
{{Trans|Haskell}} (Multisplit by fold example)
<
/// Delimiter list -> String -> list of parts, delimiters, offsets
Line 1,040:
multiSplit(delims, strTest)
);
})();</
{{Out}}
<pre>[
Line 1,078:
Both helper functions could be made inner functions of the main function, but are kept separate here for clarity.
<
# a single character from the input string.
# The input should be a nonempty string, and delims should be
Line 1,124:
then .[0:length-1] + [ .[length-1] + $x ]
else . + [$x]
end ) ;</
'''Examples'''
("a!===b=!=c",
Line 1,135:
=={{header|Julia}}==
From REPL:
<
julia> split(s, r"==|!=|=")
5-element Array{SubString{String},1}:
Line 1,143:
""
"c"
</syntaxhighlight>
=={{header|Kotlin}}==
<
fun main(args: Array<String>) {
Line 1,176:
println("\nThe delimiters matched and the indices at which they occur are:")
println(matches)
}</
{{out}}
Line 1,189:
=={{header|Lua}}==
The function I've written here is really excessive for this task but it has historically been hard to find example code for a good Lua split function on the Internet. This one behaves the same way as Julia's Base.split and I've included a comment describing its precise operation.
<
Returns a table of substrings by splitting the given string on
occurrences of the given character delimiters, which may be specified
Line 1,241:
for k, v in pairs(multisplit) do
print(k, v)
end</
{{Out}}
<pre>Key Value
Line 1,254:
Code adapted from the BBC BASIC entry, with minor changes to fit M2000.
<syntaxhighlight lang="m2000 interpreter">
Module CheckIt {
DIM sep$()
Line 1,284:
}
CheckIt
</syntaxhighlight>
=={{header|Mathematica}}/{{header|Wolfram Language}}==
Just use the built-in function "StringSplit":
<
{{Out}}
<pre>{a,,b,,c}</pre>
=={{header|MiniScript}}==
<
result = []
startPos = 0
Line 1,311:
end function
print parseSep("a!===b=!=c", ["==", "!=", "="])</
{{Out}}
<pre>["a", "{!=}", "", "{==}", "b", "{=}", "", "{!=}"]</pre>
=={{header|Nim}}==
<
iterator tokenize(text: string; sep: openArray[string]): tuple[token: string, isSep: bool] =
Line 1,334:
if isSep: stdout.write '{',token,'}'
else: stdout.write token
echo ""</
{{out}}
Line 1,341:
=={{header|Perl}}==
<
my ($sep, $string, %opt) = @_ ;
$sep = join '|', map quotemeta($_), @$sep;
Line 1,351:
print "\n";
print "'$_' " for multisplit ['==','!=','='], "a!===b=!=c", keep_separators => 1;
print "\n";</
{{Out}}
Line 1,360:
=={{header|Phix}}==
<!--<
<span style="color: #008080;">with</span> <span style="color: #008080;">javascript_semantics</span>
<span style="color: #008080;">procedure</span> <span style="color: #000000;">multisplit</span><span style="color: #0000FF;">(</span><span style="color: #004080;">string</span> <span style="color: #000000;">text</span><span style="color: #0000FF;">,</span> <span style="color: #004080;">sequence</span> <span style="color: #000000;">delims</span><span style="color: #0000FF;">)</span>
Line 1,384:
<span style="color: #000000;">multisplit</span><span style="color: #0000FF;">(</span><span style="color: #008000;">"a!===b=!=c"</span><span style="color: #0000FF;">,{</span><span style="color: #008000;">"=="</span><span style="color: #0000FF;">,</span><span style="color: #008000;">"!="</span><span style="color: #0000FF;">,</span><span style="color: #008000;">"="</span><span style="color: #0000FF;">})</span>
<!--</
{{out}}
<pre>
Line 1,395:
=={{header|PicoLisp}}==
<
(setq Sep (mapcar chop Sep))
(make
Line 1,413:
(println (multisplit "a!===b=!=c" '("==" "!=" "=")))
(println (multisplit "a!===b=!=c" '("=" "!=" "==")))</
{{out}}
<pre>("a" (1 "!=") NIL (3 "==") "b" (6 "=") NIL (7 "!=") "c")
Line 1,419:
=={{header|Pike}}==
<
array sep = ({"==", "!=", "=" });
Line 1,435:
result;
Result: ({"a", ({"!=", 1}), "", ({"==", 3}), "b", ({"=", 6}), "", ({"!=", 7}), "c"})</
=={{header|PowerShell}}==
<syntaxhighlight lang="powershell">
$string = "a!===b=!=c"
$separators = [regex]"(==|!=|=)"
Line 1,450:
$matchInfo
</syntaxhighlight>
{{Out}}
<pre>
Line 1,462:
=={{header|Prolog}}==
Works with SWI-Prolog.
<
{!},
[].
Line 1,502:
my_sort(<, (N, N1, _), (N, N2, _)) :-
N1 > N2.
</syntaxhighlight>
{{out}}
<pre>?- multisplit(['==', '!=', '='], 'ax!===b=!=c', Lst, []).
Line 1,511:
===Procedural===
Using regular expressions:
<
>>> def ms2(txt="a!===b=!=c", sep=["==", "!=", "="]):
if not txt or not sep:
Line 1,525:
['a', (1, 1), '', (0, 3), 'b', (2, 6), '', (1, 7), 'c']
>>> ms2(txt="a!===b=!=c", sep=["=", "!=", "=="])
['a', (1, 1), '', (0, 3), '', (0, 4), 'b', (0, 6), '', (1, 7), 'c']</
Not using regular expressions:
'''Inspired by C-version'''
<
lastmatch = i = 0
matches = []
Line 1,551:
>>> multisplit('a!===b=!=c', ['!=', '==', '='])
['a', (0, 1), (1, 3), 'b', (2, 6), (0, 7), 'c']
</syntaxhighlight>
'''Alternative version'''
<
return List.index(min(List))
Line 1,620:
S = "a!===b=!=c"
multisplit(S, ["==", "!=", "="]) # output: ['a', [1, 1], '', [0, 3], 'b', [2, 6], '', [1, 7], 'c']
multisplit(S, ["=", "!=", "=="]) # output: ['a', [1, 1], '', [0, 3], '', [0, 4], 'b', [0, 6], '', [1, 7], 'c']</
===Functional===
In terms of a fold (reduce), without use of regular expressions:
{{Works with|Python|3.7}}
<
Line 1,700:
# MAIN ---
if __name__ == '__main__':
main()</
{{Out}}
<pre>[('a', '!=', 1), ('', '==', 3), ('b', '=', 6), ('', '!=', 7), ('c', '', 10)]</pre>
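For readers without the full listing at hand, a much shorter fold-based sketch is given here. It is not the solution above: the names <code>multisplit_fold</code> and <code>step</code> are hypothetical, and the fold runs over character indices rather than over the characters themselves.

```python
from functools import reduce


def multisplit_fold(delims, s):
    """Return (part, delimiter, offset) triples using a fold over indices."""

    def step(state, i):
        start, parts = state     # start: where the current part begins
        if i < start:            # still inside a delimiter already consumed
            return state
        for d in delims:         # delimiters are tried in priority order
            if s.startswith(d, i):
                return (i + len(d), parts + [(s[start:i], d, i)])
        return state

    start, parts = reduce(step, range(len(s)), (0, []))
    # The final part has no delimiter; its offset is the string length.
    return parts + [(s[start:], "", len(s))]


print(multisplit_fold(["==", "!=", "="], "a!===b=!=c"))
```

The accumulator carries only the start of the current part and the list of triples, so no mutable state is shared between steps.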
Line 1,706:
=={{header|Racket}}==
<
#lang racket
(regexp-match* #rx"==|!=|=" "a!===b=!=c" #:gap-select? #t #:match-select values)
;; => '("a" ("!=") "" ("==") "b" ("=") "" ("!=") "c")
</syntaxhighlight>
=={{header|Raku}}==
(formerly Perl 6)
<syntaxhighlight lang="raku"
my @chunks = multisplit 'a!===b=!=c==d', < == != = >;
Line 1,724:
for grep Match, @chunks -> $s {
say "{$s.fmt: '%2s'} from {$s.from.fmt: '%2d'} to {$s.to.fmt: '%2d'}";
}</
{{out}}
<pre>("a", "!=", "", "==", "b", "=", "", "!=", "c", "==", "d")
Line 1,739:
=={{header|REXX}}==
<
parse arg $ /*obtain optional string from the C.L. */
if $='' then $= "a!===b=!=c" /*None specified? Then use the default*/
Line 1,764:
$=changestr(null, $, showNull) /* ··· showing of "null" chars. */
say 'new string:' $ /*now, display the new string to term. */
/*stick a fork in it, we're all done. */</
Some older REXXes don't have a '''changestr''' BIF, so one is included here ──► [[CHANGESTR.REX]].
<br><br>'''output''' when using the default input:
Line 1,773:
=={{header|Ring}}==
<
# Project : Multisplit
Line 1,783:
see "" + n + ": " + substr(str, 1, pos-1) + " Sep By: " + sep[n] + nl
next
</syntaxhighlight>
Output:
<pre>
Line 1,796:
The simple method, using a regular expression to split the text.
<
separators = ['==', '!=', '=']
Line 1,804:
p multisplit_simple(text, separators) # => ["a", "", "b", "", "c"]
</syntaxhighlight>
The version that also returns the information about the separations.
<
sep_regex = Regexp.union(separators)
separator_info = []
Line 1,825:
p multisplit(text, separators)
# => [["a", "", "b", "", "c"], [["!=", 1], ["==", 3], ["=", 6], ["!=", 7]]]</
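For comparison, the same idea (a regex built from the escaped separators, plus a record of each separator and its position) can be sketched in Python. This is an illustrative sketch, not a translation of the Ruby code; the function name <code>multisplit_with_info</code> is an assumption.

```python
import re


def multisplit_with_info(text, separators):
    """Return (pieces, info), where info lists (separator, index) pairs."""
    pattern = re.compile("|".join(re.escape(s) for s in separators))
    pieces, info, start = [], [], 0
    for m in pattern.finditer(text):
        pieces.append(text[start:m.start()])    # text before this separator
        info.append((m.group(), m.start()))     # which separator, and where
        start = m.end()
    pieces.append(text[start:])                 # text after the last separator
    return pieces, info


print(multisplit_with_info("a!===b=!=c", ["==", "!=", "="]))
```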
Also demonstrating a method to rejoin the string given the separator information.
<
str = info[0].zip(info[1])[0..-2].inject("") {|str, (piece, (sep, idx))| str << piece << sep}
str << info[0].last
Line 1,835:
p multisplit_rejoin(multisplit(text, separators)) == text
# => true</
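The rejoin step can be sketched in Python as well. The sketch assumes the input has the shape shown in the Ruby output above (a list of pieces and a list of separator/index pairs, with one more piece than separators); the name <code>multisplit_rejoin</code> is reused only for illustration.

```python
from itertools import zip_longest


def multisplit_rejoin(pieces, info):
    """Interleave pieces with their separators to rebuild the original string."""
    seps = (sep for sep, _index in info)
    # There is one more piece than separators, so pad the separators with "".
    return "".join(p + s for p, s in zip_longest(pieces, seps, fillvalue=""))


pieces = ["a", "", "b", "", "c"]
info = [("!=", 1), ("==", 3), ("=", 6), ("!=", 7)]
print(multisplit_rejoin(pieces, info) == "a!===b=!=c")
```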
=={{header|Run BASIC}}==
<
sep$ = "=== != =! b =!="
Line 1,846:
split$ = word$(str$,1,theSep$)
print i;" ";split$;" Sep By: ";theSep$
wend</
{{out}}
<pre>1 a! Sep By: ===
Line 1,855:
=={{header|Scala}}==
<
def multiSplit(str:String, sep:Seq[String])={
def findSep(index:Int)=sep find (str startsWith (_, index))
Line 1,870:
}
println(multiSplit("a!===b=!=c", Seq("!=", "==", "=")))</
{{out}}
<pre>List(a, , b, , c)</pre>
Line 1,876:
=={{header|Scheme}}==
{{works with|Gauche Scheme}}
<
(use srfi-42)
Line 1,890:
(define (glean shards)
(list-ec (: x (index i) shards)
(if (even? i)) x))</
<b>Testing:</b>
<pre>
Line 1,903:
First approach, using line delimiters. Lines are delimited by an array of separator strings, normally [CRLF, LF, CR, lineSeparator(0x2028), paragraphSeparator(0x2029)]. Supplying an alternate set of delimiters lets us split a string by a different (ordered) set of strings:
<
set separators to ["==", "!=", "="]
put each line delimited by separators of source</
Output:
<syntaxhighlight lang
Second approach, using a pattern. SenseTalk's pattern language lets us define a pattern (a regex) which can then be used to split the string and also to display the actual separators that were found.
<
set separatorPattern to <"==" or "!=" or "=">
Line 1,917:
put each occurrence of separatorPattern in source
</syntaxhighlight>
Output:
<
(!=,==,=,!=)</
=={{header|Sidef}}==
<
sep = sep.map{.escape}.join('|');
var re = Regex.new(keep_sep ? "(#{sep})" : sep);
Line 1,931:
[false, true].each { |bool|
say multisplit(%w(== != =), 'a!===b=!=c', keep_sep: bool);
}</
{{out}}
<pre>
Line 1,944:
{{trans|Python}}
<
func multiSplit(on seps: [String]) -> ([Substring], [(String, (start: String.Index, end: String.Index))]) {
var matches = [Substring]()
Line 1,979:
let (matches, matchedSeps) = "a!===b=!=c".multiSplit(on: ["==", "!=", "="])
print(matches, matchedSeps.map({ $0.0 }))</
Line 1,988:
=={{header|Tcl}}==
This simple version does not retain information about what the separators were:
<
set map {}; foreach s $sep {lappend map $s "\uffff"}
return [split [string map $map $text] "\uffff"]
}
puts [simplemultisplit "a!===b=!=c" {"==" "!=" "="}]</
{{out}}
<pre>a {} b {} c</pre>
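The sentinel technique used above (map every separator to one character that cannot occur in the text, then split on that character) can be sketched in Python. Two assumptions are made loudly here: the text never contains <code>U+FFFF</code>, and a single left-to-right regex substitution is used instead of chained <code>str.replace</code> calls. Chained replacements would rewrite the whole string once per separator and could consume the <code>=</code> of a <code>!=</code> while replacing <code>==</code>.

```python
import re

SENTINEL = "\uffff"   # assumed never to occur in the input text


def simple_multisplit(text, separators):
    """Replace each separator with the sentinel in one pass, then split on it."""
    pattern = "|".join(re.escape(s) for s in separators)
    return re.sub(pattern, SENTINEL, text).split(SENTINEL)


print(simple_multisplit("a!===b=!=c", ["==", "!=", "="]))
```

As with the Tcl version, this keeps no record of which separator matched where.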
Line 1,999:
to the match information (because the two collections of information
are of different lengths).
<
foreach s $sep {lappend sr [regsub -all {\W} $s {\\&}]}
set sepRE [join $sr "|"]
Line 2,012:
}
return [list [lappend pieces [string range $text $start end]] $match]
}</
Demonstration code:
<
set matchers {"==" "!=" "="}
lassign [multisplit $input $matchers] substrings matchinfo
puts $substrings
puts $matchinfo</
{{out}}
<pre>
Line 2,035:
The <code>:gap 0</code> makes the horizontal collect repetitions strictly adjacent. This means that <code>coll</code> will quit when faced with a nonmatching suffix portion of the data rather than scan forward (no gap allowed!). This creates an opportunity for the <code>tail</code> variable to grab the suffix which remains, which may be an empty string.
<
@(coll :gap 0)@(choose :shortest tok)@\
@tok@{sep /==/}@\
Line 2,045:
@(output)
@(rep)"@tok" {@sep} @(end)"@tail"
@(end)</
Runs:
Line 2,070:
{{trans|Racket}}
<
("a" "!=" "" "==" "b" "=" "" "!=" "c")</
Here the third boolean argument means "keep the material between the tokens", which in the Racket version seems to be requested by the argument <code>#:gap-select? #t</code>.
Line 2,077:
=={{header|UNIX Shell}}==
{{works with|bash}}
<
local str=$1
shift
Line 2,105:
if [[ $original == $recreated ]]; then
echo "successfully able to recreate original string"
fi</
{{out}}
Line 2,116:
=={{header|VBScript}}==
<syntaxhighlight lang="vb">
Function multisplit(s,sep)
arr_sep = Split(sep,"|")
Line 2,143:
WScript.StdOut.WriteLine
WScript.StdOut.Write "Extra Credit: " & multisplit_extra("a!===b=!=c","!=|==|=")
WScript.StdOut.WriteLine</
{{out}}
<pre>
Line 2,152:
=={{header|Vlang}}==
Without using additional libraries or regular expressions:
<
str := "a!===b=!=c"
sep := ["==","!=","="]
Line 2,192:
println('Extra: $extra')
return place, ans, extra
}</
{{out}}
<pre>
Line 2,204:
{{libheader|Wren-pattern}}
{{libheader|Wren-fmt}}
<
import "/fmt" for Fmt
Line 2,216:
var parts = p.splitAll(input)
System.print("\nThe substrings between the separators are:")
System.print(parts.map { |p| (p != "") ? Fmt.q(p) : "empty string" }.toList)</
{{out}}
Line 2,231:
=={{header|Yabasic}}==
<
s$ = "==,!=,="
Line 2,249:
print left$(t$, l - 1), " with separator ", n$(j)
t$ = right$(t$, len(t$) - (l + len(n$(j))) + 1)
loop</
=={{header|zkl}}==
{{trans|Python}}
<
lastmatch := i := 0; matches := List();
while(i < text.len()){
Line 2,269:
if(i > lastmatch) matches.append(text[lastmatch,i-lastmatch]);
return(matches);
}</
<
multisplit("a!===b=!=c", T("!=", "==", "=")).println();</
{{out}}
<pre>