Tokenize a string: Difference between revisions
(→{{header|Java}}: Updated the for-loop to use the proper variable name (words - not word).) |
|||
Line 875: | Line 875: | ||
;;</lang> |
;;</lang> |
||
But both of these will process extraneous String.sub (so one string alloc). For N tokens there will be (N - 2) unneeded allocs. To resolve this here is a version which |
However, both of these versions make an extra String.sub call (one string allocation) at each step to build the "rest of the string" that is passed to the next call. For N tokens there will be (N - 2) unneeded allocs. To avoid this, here is a version that keeps track of the index in the string where we will look next: |
||
<lang ocaml>let split_char sep str = |
<lang ocaml>let split_char sep str = |
||
let |
let string_index_from i = |
||
⚫ | |||
try |
|||
⚫ | |||
⚫ | |||
⚫ | |||
⚫ | |||
⚫ | |||
in |
in |
||
let |
let rec aux i acc = match string_index_from i with |
||
| Some i' -> |
|||
let w = String.sub str i (i' - i) in |
|||
| last::start::tl -> |
|||
aux (succ i') (w::acc) |
|||
⚫ | |||
aux (w::acc) (start::tl) |
|||
⚫ | |||
⚫ | |||
⚫ | |||
in |
in |
||
aux [] |
aux 0 []</lang> |
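Since the diff view above elides the unchanged lines, the new revision is hard to read as a whole. The following is a minimal sketch of the index-tracking approach it describes, not the revision text itself: the body of `string_index_from` and the `None` branch are assumptions filled in from the surrounding description, using the standard `String.index_from` and `String.sub` functions.

```ocaml
(* Split [str] on the character [sep], scanning by index so that the only
   substrings allocated are the tokens themselves. *)
let split_char sep str =
  (* Assumed helper body: position of the next [sep] at or after index [i]. *)
  let string_index_from i =
    try Some (String.index_from str i sep)
    with Not_found -> None
  in
  let rec aux i acc =
    match string_index_from i with
    | Some i' ->
        (* Token runs from [i] up to, but not including, the separator. *)
        let w = String.sub str i (i' - i) in
        aux (succ i') (w :: acc)
    | None ->
        (* Assumed final branch: the last token is whatever remains. *)
        let w = String.sub str i (String.length str - i) in
        List.rev (w :: acc)
  in
  aux 0 []

let () =
  split_char ',' "Hello,How,Are,You,Today"
  |> String.concat "."
  |> print_endline
(* prints "Hello.How.Are.You.Today" *)
```

In this sketch a trailing separator yields a final empty token, and an empty input yields a single empty token; whether that matches the intended behaviour of the revision is not stated in the diff.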
||
Splitting on a string separator using the regular expressions library: |
Splitting on a string separator using the regular expressions library: |