The {@code StringTokenizer} class allows an application to break a string
into tokens by performing code point comparison. The {@code StringTokenizer}
methods do not distinguish among identifiers, numbers, and quoted strings,
nor do they recognize and skip comments.
The set of delimiters (the code points that separate tokens) may be specified
either at creation time or on a per-token basis.
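A minimal sketch of changing the delimiter set on a per-token basis with {@code nextToken(String)}. The class name {@code PerTokenDelimiters}, the method {@code parse}, and the sample input are hypothetical and chosen only for illustration. Note that the new set passed here still contains {@code =}, because the current position rests on the delimiter that ended the previous token.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper class, for illustration only.
class PerTokenDelimiters {
    // Reads a key up to '=', then switches delimiters to split the values on ','.
    static List<String> parse(String input) {
        List<String> out = new ArrayList<>();
        // Delimiter set given at creation time: only '='.
        StringTokenizer st = new StringTokenizer(input, "=");
        if (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        if (st.hasMoreTokens()) {
            // Delimiter set changed for this call; it stays in effect afterwards.
            // '=' is kept in the set so the pending '=' is skipped, not returned.
            out.add(st.nextToken("=,"));
        }
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(parse("key=a,b,c")); // prints [key, a, b, c]
    }
}
```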
An instance of {@code StringTokenizer} behaves in one of two ways,
depending on whether it was created with the {@code returnDelims} flag
having the value {@code true} or {@code false}:
- If {@code returnDelims} is {@code false}, delimiter code points serve to separate
tokens. A token is a maximal sequence of consecutive code points that are not
delimiters.
- If {@code returnDelims} is {@code true}, delimiter code points are themselves
considered to be tokens. In this case, each delimiter code point is returned
as a separate token.
A token is thus either one delimiter code point, or a maximal sequence of
consecutive code points that are not delimiters.
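The two behaviors can be compared side by side with the three-argument constructor. This is a small sketch; the class name {@code ReturnDelimsDemo}, the helper {@code tokens}, and the sample input are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper class, for illustration only.
class ReturnDelimsDemo {
    // Collects every token produced for the given input, delimiters, and flag.
    static List<String> tokens(String input, String delims, boolean returnDelims) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(input, delims, returnDelims);
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        // Delimiters only separate tokens.
        System.out.println(tokens("a+b--c", "+-", false)); // prints [a, b, c]
        // Each delimiter is returned as its own token.
        System.out.println(tokens("a+b--c", "+-", true));  // prints [a, +, b, -, -, c]
    }
}
```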
A {@code StringTokenizer} object internally maintains a current position
within the string to be tokenized. Some operations advance this current
position past the code point processed.
A token is returned by taking a substring of the string that was used to
create the {@code StringTokenizer} object.
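One way to observe the current position indirectly is {@code countTokens()}, which reports how many more times {@code nextToken()} can be called without advancing the position itself. The sketch below is illustrative; the class name {@code PositionDemo} and the method {@code remainingCounts} are hypothetical.

```java
import java.util.Arrays;
import java.util.StringTokenizer;

// Hypothetical helper class, for illustration only.
class PositionDemo {
    // Records countTokens() before each nextToken() call and once more at the end.
    static int[] remainingCounts(String input) {
        StringTokenizer st = new StringTokenizer(input);
        int[] counts = new int[st.countTokens() + 1];
        int i = 0;
        while (st.hasMoreTokens()) {
            counts[i++] = st.countTokens(); // does not advance the position
            st.nextToken();                 // advances past the returned token
        }
        counts[i] = st.countTokens();
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(remainingCounts("one two three"))); // prints [3, 2, 1, 0]
    }
}
```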
Here's an example of a {@code StringTokenizer} that uses the default
delimiters:
StringTokenizer st = new StringTokenizer("this is a test");
while (st.hasMoreTokens()) {
    System.out.println(st.nextToken());
}
This prints the following output:
this
is
a
test
Here's an example of how to use a {@code StringTokenizer} with
user-specified delimiters:
StringTokenizer st = new StringTokenizer(
        "this is a test with supplementary characters \ud800\ud800\udc00\udc00",
        " \ud800\udc00");
while (st.hasMoreTokens()) {
    System.out.println(st.nextToken());
}
This prints the following output:
this
is
a
test
with
supplementary
characters
\ud800
\udc00