Tokenize a string

From Rosetta Code
Revision as of 04:52, 8 February 2007 by rosettacode>Planestraveler (→‎[[Java]]: added a "." to fully comply with the specification)

Separate the string "Hello,How,Are,You,Today" by commas into an array so that each index of the array stores a different word. Display the words to the 'user', in the simplest manner possible, separated by a period. To simplify, you may display a trailing period.


Java

Compiler: JDK 1.0 and up for StringTokenizer; JDK 1.4 and up for String.split

There are multiple ways to tokenize a string in Java. The first splits the String into an array of Strings; the second uses a StringTokenizer, which hands out the tokens one at a time (it implements Enumeration). The second way skips empty tokens: if two commas appear in a row between words, the array returned by split contains an empty string at that position, but the StringTokenizer object yields no empty string.

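import java.util.StringTokenizer;

public class Tokenize {
    public static void main(String[] args) {
        String toTokenize = "Hello,How,Are,You,Today";

        // First way: split the string into an array of words
        String[] words = toTokenize.split(",");
        for (int i = 0; i < words.length; i++) {
            System.out.print(words[i] + ".");
        }

        // Second way: pull the tokens one by one from a StringTokenizer
        StringTokenizer tokenizer = new StringTokenizer(toTokenize, ",");
        while (tokenizer.hasMoreTokens()) {
            System.out.print(tokenizer.nextToken() + ".");
        }
    }
}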
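
The difference in empty-token handling described above can be checked with a small sketch. The class name EmptyTokenDemo, the helper methods viaSplit and viaTokenizer, and the input "Hello,,How" are illustrative assumptions rather than part of the task, and the code uses generics, so it assumes Java 5 or later instead of the JDK versions listed above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class: compares how the two approaches treat an empty token.
public class EmptyTokenDemo {

    // Tokens as produced by String.split (keeps empty strings between delimiters).
    static List<String> viaSplit(String input) {
        return Arrays.asList(input.split(","));
    }

    // Tokens as produced by StringTokenizer (never returns an empty token).
    static List<String> viaTokenizer(String input) {
        List<String> tokens = new ArrayList<String>();
        StringTokenizer tokenizer = new StringTokenizer(input, ",");
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String input = "Hello,,How"; // two commas in a row

        System.out.println(viaSplit(input).size());     // 3: "Hello", "", "How"
        System.out.println(viaTokenizer(input).size()); // 2: "Hello", "How"
    }
}
```

Which behaviour is preferable depends on whether an empty field carries meaning in the input; for the task's string, which has no consecutive commas, both approaches give the same five words.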

JavaScript

Interpreter: Firefox 2.0

var str = "Hello,How,Are,You,Today";
var tokens = str.split(",");
alert( tokens.join(".") );

Perl

Interpreter: any Perl 5.x

As a one-liner without a trailing period. This is the most compact form, since no array variable has to be declared:

print join('.', split(/,/, "Hello,How,Are,You,Today"));

If you need to keep the array for later use, again without a trailing period:

my @words = split(/,/, "Hello,How,Are,You,Today");
print join('.', @words);

If you really want a trailing period, print each word followed by one:

my @words = split(/,/, "Hello,How,Are,You,Today");
print $_.'.' for (@words);

Python

Interpreter: Python 2.5

 words = "Hello,How,Are,You,Today".split(',')
 for word in words:
     print word

This prints each word on its own line. To follow the task specification strictly, join the list elements with a period and print the resulting string:

 print '.'.join("Hello,How,Are,You,Today".split(','))

Ruby

    words = "Hello,How,Are,You,Today".split(',')
    words.each do |w|
        print "#{w}."
    end

Tcl

Generating a list from a string by splitting on a comma:

 split string ,

Joining the elements of a list by a period:

 join list .

Thus the whole thing would look like this:

 puts [join [split "Hello,How,Are,You,Today" ,] .]

If you'd like to retain the list in a variable with the name "words", it would only be marginally more complex:

 puts [join [set words [split "Hello,How,Are,You,Today" ,]] .]