
Tokenize a string: Difference between revisions

(→‎{{header|Java}}: Updated the for-loop to use the proper variable name (words - not word).)
Line 875:
;;</lang>
 
But both of these make an extraneous call to String.sub (and so one string allocation) to generate the "rest of the string" each time, to pass to the next call. For N tokens there will be (N - 2) unneeded allocations. To resolve this, here is a version which keeps track of the index in the string where we will look next:
 
<lang ocaml>let split_char sep str =
  (* position of the next separator at or after index i, if any *)
  let string_index_from i =
    try Some (String.index_from str i sep)
    with Not_found -> None
  in
  (* i is the index in str where the current token starts *)
  let rec aux i acc = match string_index_from i with
    | Some i' ->
        let w = String.sub str i (i' - i) in
        aux (succ i') (w::acc)
    | None ->
        let w = String.sub str i (String.length str - i) in
        List.rev (w::acc)
  in
  aux 0 []</lang>
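To make the index bookkeeping visible, here is a minimal sketch that records where each token starts and how long it is, instead of extracting it right away. The name <code>token_bounds</code> is a hypothetical helper introduced only for this illustration, and the sketch assumes OCaml 4.05 or later for <code>String.index_from_opt</code>.

```ocaml
(* Hypothetical helper, not part of the solution above: return the
   (start, length) pair of every token instead of the token itself. *)
let token_bounds sep str =
  let rec aux i acc =
    match String.index_from_opt str i sep with
    | Some j -> aux (j + 1) ((i, j - i) :: acc)
    | None -> List.rev ((i, String.length str - i) :: acc)
  in
  aux 0 []

let () =
  let str = "Hello,How,Are,You,Today" in
  let bounds = token_bounds ',' str in
  (* One String.sub per token, and none for the "rest of the string". *)
  let words = List.map (fun (start, len) -> String.sub str start len) bounds in
  print_endline (String.concat "." words)
```

Each token costs exactly one call to String.sub here, because only an integer index is passed from one step to the next.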
 
Splitting on a string separator using the regular expressions library: