Methods Summary |
---|
public java.util.regex.Matcher | appendReplacement(java.lang.StringBuffer sb, java.lang.String replacement) Implements a non-terminal append-and-replace step.
This method performs the following actions:
It reads characters from the input sequence, starting at the
append position, and appends them to the given string buffer. It
stops after reading the last character preceding the previous match,
that is, the character at index {@link
#start()} - 1.
It appends the given replacement string to the string buffer.
It sets the append position of this matcher to the index of
the last character matched, plus one, that is, to {@link #end()}.
The replacement string may contain references to subsequences
captured during the previous match: Each occurrence of
$g will be replaced by the result of
evaluating {@link #group(int) group}(g).
The first number after the $ is always treated as part of
the group reference. Subsequent numbers are incorporated into g if
they would form a legal group reference. Only the numerals '0'
through '9' are considered as potential components of the group
reference. If the second group matched the string "foo", for
example, then passing the replacement string "$2bar" would
cause "foobar" to be appended to the string buffer. A dollar
sign ($) may be included as a literal in the replacement
string by preceding it with a backslash (\$).
Note that backslashes (\) and dollar signs ($) in
the replacement string may cause the results to be different than if it
were being treated as a literal replacement string. Dollar signs may be
treated as references to captured subsequences as described above, and
backslashes are used to escape literal characters in the replacement
string.
This method is intended to be used in a loop together with the
{@link #appendTail appendTail} and {@link #find find} methods. The
following code, for example, writes one dog two dogs in the
yard to the standard-output stream:
Pattern p = Pattern.compile("cat");
Matcher m = p.matcher("one cat two cats in the yard");
StringBuffer sb = new StringBuffer();
while (m.find()) {
m.appendReplacement(sb, "dog");
}
m.appendTail(sb);
System.out.println(sb.toString());
// If no match, return error
if (first < 0)
throw new IllegalStateException("No match available");
// Process substitution string to replace group references with groups
int cursor = 0;
String s = replacement;
StringBuffer result = new StringBuffer();
while (cursor < replacement.length()) {
char nextChar = replacement.charAt(cursor);
if (nextChar == '\\') {
cursor++;
nextChar = replacement.charAt(cursor);
result.append(nextChar);
cursor++;
} else if (nextChar == '$') {
// Skip past $
cursor++;
// The first number is always a group
int refNum = (int)replacement.charAt(cursor) - '0';
if ((refNum < 0)||(refNum > 9))
throw new IllegalArgumentException(
"Illegal group reference");
cursor++;
// Capture the largest legal group string
boolean done = false;
while (!done) {
if (cursor >= replacement.length()) {
break;
}
int nextDigit = replacement.charAt(cursor) - '0';
if ((nextDigit < 0)||(nextDigit > 9)) { // not a number
break;
}
int newRefNum = (refNum * 10) + nextDigit;
if (groupCount() < newRefNum) {
done = true;
} else {
refNum = newRefNum;
cursor++;
}
}
// Append group
if (group(refNum) != null)
result.append(group(refNum));
} else {
result.append(nextChar);
cursor++;
}
}
// Append the intervening text
sb.append(getSubSequence(lastAppendPosition, first));
// Append the match substitution
sb.append(result.toString());
lastAppendPosition = last;
return this;
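The group-reference and escaping rules described above can be illustrated with a short sketch. The class name `AppendReplacementSketch` and the helpers `swapPairs` and `addDollar` are hypothetical names chosen for this example; only the `Matcher` and `Pattern` calls come from the API documented here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AppendReplacementSketch {
    // Hypothetical helper: rewrites each "key=value" pair as "value=key".
    static String swapPairs(String input) {
        Matcher m = Pattern.compile("(\\w+)=(\\w+)").matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // $2 and $1 refer to the groups captured by the current match.
            m.appendReplacement(sb, "$2=$1");
        }
        m.appendTail(sb); // copy whatever follows the last match
        return sb.toString();
    }

    // Hypothetical helper: prefixes each run of digits with a literal '$'.
    static String addDollar(String input) {
        Matcher m = Pattern.compile("\\d+").matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // "\\$" in Java source is the two characters \$, a literal dollar sign;
            // the following $0 is a reference to the whole match.
            m.appendReplacement(sb, "\\$$0");
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(swapPairs("a=1, b=2;"));     // 1=a, 2=b;
        System.out.println(addDollar("cost 15 or 20")); // cost $15 or $20
    }
}
```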
|
public java.lang.StringBuffer | appendTail(java.lang.StringBuffer sb) Implements a terminal append-and-replace step.
This method reads characters from the input sequence, starting at
the append position, and appends them to the given string buffer. It is
intended to be invoked after one or more invocations of the {@link
#appendReplacement appendReplacement} method in order to copy the
remainder of the input sequence.
sb.append(getSubSequence(lastAppendPosition, getTextLength()).toString());
return sb;
|
char | charAt(int i) Returns this Matcher's input character at index i.
return text.charAt(i);
|
public int | end() Returns the offset after the last character matched.
if (first < 0)
throw new IllegalStateException("No match available");
return last;
|
public int | end(int group) Returns the offset after the last character of the subsequence
captured by the given group during the previous match operation.
Capturing groups are indexed from left
to right, starting at one. Group zero denotes the entire pattern, so
the expression m.end(0) is equivalent to
m.end().
if (first < 0)
throw new IllegalStateException("No match available");
if (group > groupCount())
throw new IndexOutOfBoundsException("No group " + group);
return groups[group * 2 + 1];
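As a small illustration of how group offsets relate to the input, the sketch below returns the `start(g)` and `end(g)` pair for one group. `GroupOffsetsSketch` and `span` are hypothetical names used only for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupOffsetsSketch {
    // Hypothetical helper: returns {start(g), end(g)} for the first match of regex in input.
    static int[] span(String regex, String input, int g) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            throw new IllegalArgumentException("no match");
        }
        return new int[] { m.start(g), m.end(g) };
    }

    public static void main(String[] args) {
        String input = "mail bob@example now";
        String regex = "(\\w+)@(\\w+)";
        for (int g = 0; g <= 2; g++) {
            int[] s = span(regex, input, g);
            // end(g) is exclusive, so substring(start, end) is exactly the group text.
            System.out.println(g + ": " + input.substring(s[0], s[1]));
        }
    }
}
```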
|
public boolean | find() Attempts to find the next subsequence of the input sequence that matches
the pattern.
This method starts at the beginning of this matcher's region, or, if
a previous invocation of the method was successful and the matcher has
not since been reset, at the first character not matched by the previous
match.
If the match succeeds then more information can be obtained via the
start, end, and group methods.
int nextSearchIndex = last;
if (nextSearchIndex == first)
nextSearchIndex++;
// If next search starts before region, start it at region
if (nextSearchIndex < from)
nextSearchIndex = from;
// If next search starts beyond region then it fails
if (nextSearchIndex > to) {
for (int i = 0; i < groups.length; i++)
groups[i] = -1;
return false;
}
return search(nextSearchIndex);
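The usual way to call this method is in a loop. The sketch below (with the hypothetical names `FindLoopSketch` and `spans`) records each match as `start-end:text`. The second call uses a pattern that can match the empty string, which exercises the step in the listing above where the search index is advanced by one after an empty match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindLoopSketch {
    // Hypothetical helper: collects "start-end:text" for every match.
    static List<String> spans(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> out = new ArrayList<>();
        while (m.find()) {
            out.add(m.start() + "-" + m.end() + ":" + m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(spans("\\d+", "a12b345")); // [1-3:12, 4-7:345]
        // a* also matches the empty string before each 'b' and at the end of input.
        System.out.println(spans("a*", "baab"));      // [0-0:, 1-3:aa, 3-3:, 4-4:]
    }
}
```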
|
public boolean | find(int start) Resets this matcher and then attempts to find the next subsequence of
the input sequence that matches the pattern, starting at the specified
index.
If the match succeeds then more information can be obtained via the
start, end, and group methods, and subsequent
invocations of the {@link #find()} method will start at the first
character not matched by this match.
int limit = getTextLength();
if ((start < 0) || (start > limit))
throw new IndexOutOfBoundsException("Illegal start index");
reset();
return search(start);
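A minimal sketch of the behavior described above, using the hypothetical names `FindFromSketch` and `matchesFrom`: the search begins at the given index, later `find()` calls continue after that match, and the reset performed by `find(int)` discards a region set earlier.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindFromSketch {
    // Hypothetical helper: lists the match found by find(start) and those found by later find() calls.
    static List<String> matchesFrom(String regex, String input, int start) {
        Matcher m = Pattern.compile(regex).matcher(input);
        m.region(0, 2); // find(int) resets the matcher, so this region is discarded
        List<String> out = new ArrayList<>();
        if (m.find(start)) {
            out.add(m.group());
            while (m.find()) { // continues after the previous match
                out.add(m.group());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(matchesFrom("\\d+", "12 34 56", 3)); // [34, 56]
        System.out.println(matchesFrom("\\d+", "12 34 56", 4)); // [4, 56]
    }
}
```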
|
java.lang.CharSequence | getSubSequence(int beginIndex, int endIndex) Returns the subsequence of this Matcher's input in the specified range as a CharSequence.
return text.subSequence(beginIndex, endIndex);
|
int | getTextLength() Returns the end index of the text.
return text.length();
|
public java.lang.String | group() Returns the input subsequence matched by the previous match.
For a matcher m with input sequence s,
the expressions m.group() and
s.substring(m.start(), m.end())
are equivalent.
Note that some patterns, for example a*, match the empty
string. This method will return the empty string when the pattern
successfully matches the empty string in the input.
return group(0);
|
public java.lang.String | group(int group) Returns the input subsequence captured by the given group during the
previous match operation.
For a matcher m, input sequence s, and group index
g, the expressions m.group(g) and
s.substring(m.start(g), m.end(g))
are equivalent.
Capturing groups are indexed from left
to right, starting at one. Group zero denotes the entire pattern, so
the expression m.group(0) is equivalent to m.group().
If the match was successful but the group specified failed to match
any part of the input sequence, then null is returned. Note
that some groups, for example (a*), match the empty string.
This method will return the empty string when such a group successfully
matches the empty string in the input.
if (first < 0)
throw new IllegalStateException("No match found");
if (group < 0 || group > groupCount())
throw new IndexOutOfBoundsException("No group " + group);
if ((groups[group*2] == -1) || (groups[group*2+1] == -1))
return null;
return getSubSequence(groups[group * 2], groups[group * 2 + 1]).toString();
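The difference between a group that did not participate in the match (null) and a group that matched the empty string can be seen in a short sketch. `GroupNullSketch` and `groups` are hypothetical names for this example.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupNullSketch {
    // Hypothetical helper: returns group(0)..group(n) for a full match, or null if there is none.
    static String[] groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.matches()) {
            return null;
        }
        String[] out = new String[m.groupCount() + 1];
        for (int g = 0; g <= m.groupCount(); g++) {
            out[g] = m.group(g);
        }
        return out;
    }

    public static void main(String[] args) {
        // (a)? does not participate, so group 1 is null; (b*) matches "", so group 2 is empty.
        System.out.println(Arrays.toString(groups("(a)?(b*)c", "c")));    // [c, null, ]
        System.out.println(Arrays.toString(groups("(a)?(b*)c", "abbc"))); // [abbc, a, bb]
    }
}
```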
|
public int | groupCount() Returns the number of capturing groups in this matcher's pattern.
Group zero denotes the entire pattern by convention. It is not
included in this count.
Any non-negative integer smaller than or equal to the value
returned by this method is guaranteed to be a valid group index for
this matcher.
return parentPattern.capturingGroupCount - 1;
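To make the counting rule concrete, here is a brief sketch (the names `GroupCountSketch` and `count` are hypothetical). Nested groups are counted, while non-capturing groups and group zero are not.

```java
import java.util.regex.Pattern;

class GroupCountSketch {
    // Hypothetical helper: number of capturing groups in a regular expression.
    static int count(String regex) {
        // The input is irrelevant here; groupCount() depends only on the pattern.
        return Pattern.compile(regex).matcher("").groupCount();
    }

    public static void main(String[] args) {
        System.out.println(count("(a)(b(c))")); // 3
        System.out.println(count("(?:a)b"));    // 0, non-capturing groups are not counted
        System.out.println(count("abc"));       // 0, group zero is not included
    }
}
```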
|
public boolean | hasAnchoringBounds() Queries the anchoring of region bounds for this matcher.
This method returns true if this matcher uses
anchoring bounds, false otherwise.
See {@link #useAnchoringBounds useAnchoringBounds} for a
description of anchoring bounds.
By default, a matcher uses anchoring region boundaries.
return anchoringBounds;
|
public boolean | hasTransparentBounds() Queries the transparency of region bounds for this matcher.
This method returns true if this matcher uses
transparent bounds, false if it uses opaque
bounds.
See {@link #useTransparentBounds useTransparentBounds} for a
description of transparent and opaque bounds.
By default, a matcher uses opaque region boundaries.
return transparentBounds;
|
public boolean | hitEnd() Returns true if the end of input was hit by the search engine in
the last match operation performed by this matcher.
When this method returns true, then it is possible that more input
would have changed the result of the last search.
return hitEnd;
|
public boolean | lookingAt() Attempts to match the input sequence, starting at the beginning of the
region, against the pattern.
Like the {@link #matches matches} method, this method always starts
at the beginning of the region; unlike that method, it does not
require that the entire region be matched.
If the match succeeds then more information can be obtained via the
start, end, and group methods.
return match(from, NOANCHOR);
|
boolean | match(int from, int anchor) Initiates a search for an anchored match to a Pattern within the given
bounds. The groups are filled with default values and the match of the
root of the state machine is called. The state machine will hold the
state of the match as it proceeds in this matcher.
this.hitEnd = false;
this.requireEnd = false;
from = from < 0 ? 0 : from;
this.first = from;
this.oldLast = oldLast < 0 ? from : oldLast;
for (int i = 0; i < groups.length; i++)
groups[i] = -1;
acceptMode = anchor;
boolean result = parentPattern.matchRoot.match(this, from, text);
if (!result)
this.first = -1;
this.oldLast = this.last;
return result;
|
public boolean | matches() Attempts to match the entire region against the pattern.
If the match succeeds then more information can be obtained via the
start, end, and group methods.
return match(from, ENDANCHOR);
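The three matching entry points differ in where a match may start and whether it must cover the whole region. The following sketch, with the hypothetical names `MatchModesSketch` and `modes`, runs all three on the same input.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchModesSketch {
    // Hypothetical helper: returns {matches(), lookingAt(), find()} for the same input.
    static boolean[] modes(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        boolean whole = m.matches();    // the entire region must match
        boolean prefix = m.lookingAt(); // must match at the region start, may stop early
        m.reset();                      // so that find() searches from the beginning
        boolean anywhere = m.find();    // may match anywhere in the region
        return new boolean[] { whole, prefix, anywhere };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(modes("\\d+", "123abc"))); // [false, true, true]
        System.out.println(Arrays.toString(modes("\\d+", "abc123"))); // [false, false, true]
        System.out.println(Arrays.toString(modes("\\d+", "123")));    // [true, true, true]
    }
}
```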
|
public java.util.regex.Pattern | pattern() Returns the pattern that is interpreted by this matcher.
return parentPattern;
|
public static java.lang.String | quoteReplacement(java.lang.String s) Returns a literal replacement String for the specified
String.
This method produces a String that will work
as a literal replacement s in the
appendReplacement method of the {@link Matcher} class.
The String produced will match the sequence of characters
in s treated as a literal sequence. Backslashes ('\') and
dollar signs ('$') will be given no special meaning.
if ((s.indexOf('\\') == -1) && (s.indexOf('$') == -1))
return s;
StringBuffer sb = new StringBuffer();
for (int i=0; i<s.length(); i++) {
char c = s.charAt(i);
if (c == '\\') {
sb.append('\\'); sb.append('\\');
} else if (c == '$') {
sb.append('\\'); sb.append('$');
} else {
sb.append(c);
}
}
return sb.toString();
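A typical reason to call this method is a replacement string that comes from user data and may contain a dollar sign. The sketch below uses the hypothetical names `QuoteReplacementSketch`, `fill`, and `fillUnquoted`, and a made-up `PRICE` placeholder. It shows the quoted form working and the unquoted form failing because `$5` is read as a reference to a group that does not exist.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuoteReplacementSketch {
    private static final Pattern PLACEHOLDER = Pattern.compile("PRICE");

    // Hypothetical helper: substitutes value literally for every PRICE placeholder.
    static String fill(String template, String value) {
        return PLACEHOLDER.matcher(template).replaceAll(Matcher.quoteReplacement(value));
    }

    // Same substitution without quoting; returns null if the replacement string is rejected.
    static String fillUnquoted(String template, String value) {
        try {
            return PLACEHOLDER.matcher(template).replaceAll(value);
        } catch (IndexOutOfBoundsException e) {
            // "$5" was treated as a reference to group 5, which this pattern does not have.
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(fill("cost: PRICE", "$5.00"));         // cost: $5.00
        System.out.println(fillUnquoted("cost: PRICE", "$5.00")); // null
        System.out.println(Matcher.quoteReplacement("a\\b$c"));   // a\\b\$c
    }
}
```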
|
public java.util.regex.Matcher | region(int start, int end) Sets the limits of this matcher's region. The region is the part of the
input sequence that will be searched to find a match. Invoking this
method resets the matcher, and then sets the region to start at the
index specified by the start parameter and end at the
index specified by the end parameter.
Depending on the transparency and anchoring being used (see
{@link #useTransparentBounds useTransparentBounds} and
{@link #useAnchoringBounds useAnchoringBounds}), certain constructs such
as anchors may behave differently at or around the boundaries of the
region.
if ((start < 0) || (start > getTextLength()))
throw new IndexOutOfBoundsException("start");
if ((end < 0) || (end > getTextLength()))
throw new IndexOutOfBoundsException("end");
if (start > end)
throw new IndexOutOfBoundsException("start > end");
reset();
from = start;
to = end;
return this;
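A short sketch of restricting a search to a region follows. `RegionSketch` and `matchesIn` are hypothetical names; the region boundaries are passed straight to `region(start, end)`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegionSketch {
    // Hypothetical helper: finds all matches of regex inside input[start, end).
    static List<String> matchesIn(String regex, String input, int start, int end) {
        Matcher m = Pattern.compile(regex).matcher(input).region(start, end);
        List<String> out = new ArrayList<>();
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // Only the characters at indices 3 and 4 are searched.
        System.out.println(matchesIn("\\d+", "12 34 56", 3, 5)); // [34]
        // Digits outside the region are cut off from the matches at its edges.
        System.out.println(matchesIn("\\d+", "12 34 56", 1, 7)); // [2, 34, 5]
    }
}
```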
|
public int | regionEnd() Reports the end index (exclusive) of this matcher's region.
The searches this matcher conducts are limited to finding matches
within {@link #regionStart regionStart} (inclusive) and
{@link #regionEnd regionEnd} (exclusive).
return to;
|
public int | regionStart() Reports the start index of this matcher's region. The
searches this matcher conducts are limited to finding matches
within {@link #regionStart regionStart} (inclusive) and
{@link #regionEnd regionEnd} (exclusive).
return from;
|
public java.lang.String | replaceAll(java.lang.String replacement) Replaces every subsequence of the input sequence that matches the
pattern with the given replacement string.
This method first resets this matcher. It then scans the input
sequence looking for matches of the pattern. Characters that are not
part of any match are appended directly to the result string; each match
is replaced in the result by the replacement string. The replacement
string may contain references to captured subsequences as in the {@link
#appendReplacement appendReplacement} method.
Note that backslashes (\) and dollar signs ($) in
the replacement string may cause the results to be different than if it
were being treated as a literal replacement string. Dollar signs may be
treated as references to captured subsequences as described above, and
backslashes are used to escape literal characters in the replacement
string.
Given the regular expression a*b, the input
"aabfooaabfooabfoob", and the replacement string
"-", an invocation of this method on a matcher for that
expression would yield the string "-foo-foo-foo-".
Invoking this method changes this matcher's state. If the matcher
is to be used in further matching operations then it should first be
reset.
reset();
boolean result = find();
if (result) {
StringBuffer sb = new StringBuffer();
do {
appendReplacement(sb, replacement);
result = find();
} while (result);
appendTail(sb);
return sb.toString();
}
return text.toString();
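The documented example, together with a replacement that uses group references, can be written as the following sketch. `ReplaceAllSketch` and `replace` are hypothetical names for this example.

```java
import java.util.regex.Pattern;

class ReplaceAllSketch {
    // Hypothetical helper: applies Matcher.replaceAll for the given regex and replacement.
    static String replace(String regex, String input, String replacement) {
        return Pattern.compile(regex).matcher(input).replaceAll(replacement);
    }

    public static void main(String[] args) {
        // The example from the description above.
        System.out.println(replace("a*b", "aabfooaabfooabfoob", "-")); // -foo-foo-foo-
        // $1 and $2 are resolved separately for every match.
        System.out.println(replace("(\\d+)-(\\d+)", "10-20, 30-40", "$2-$1")); // 20-10, 40-30
        // With no match, the input comes back unchanged.
        System.out.println(replace("xyz", "abc", "-")); // abc
    }
}
```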
|
public java.lang.String | replaceFirst(java.lang.String replacement) Replaces the first subsequence of the input sequence that matches the
pattern with the given replacement string.
This method first resets this matcher. It then scans the input
sequence looking for a match of the pattern. Characters that are not
part of the match are appended directly to the result string; the match
is replaced in the result by the replacement string. The replacement
string may contain references to captured subsequences as in the {@link
#appendReplacement appendReplacement} method.
Note that backslashes (\) and dollar signs ($) in
the replacement string may cause the results to be different than if it
were being treated as a literal replacement string. Dollar signs may be
treated as references to captured subsequences as described above, and
backslashes are used to escape literal characters in the replacement
string.
Given the regular expression dog, the input
"zzzdogzzzdogzzz", and the replacement string
"cat", an invocation of this method on a matcher for that
expression would yield the string "zzzcatzzzdogzzz".
Invoking this method changes this matcher's state. If the matcher
is to be used in further matching operations then it should first be
reset.
if (replacement == null)
throw new NullPointerException("replacement");
StringBuffer sb = new StringBuffer();
reset();
if (find())
appendReplacement(sb, replacement);
appendTail(sb);
return sb.toString();
|
public boolean | requireEnd() Returns true if more input could change a positive match into a
negative one.
If this method returns true, and a match was found, then more
input could cause the match to be lost. If this method returns false
and a match was found, then more input might change the match but the
match won't be lost. If a match was not found, then requireEnd has no
meaning.
return requireEnd;
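This flag is usually read together with hitEnd() when input arrives in pieces. The sketch below (hypothetical names `EndStateSketch` and `probe`) reports both flags after a single find(); the commented results are what these particular patterns are expected to produce.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EndStateSketch {
    // Hypothetical helper: returns {found, hitEnd(), requireEnd()} after a single find().
    static boolean[] probe(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        boolean found = m.find();
        return new boolean[] { found, m.hitEnd(), m.requireEnd() };
    }

    public static void main(String[] args) {
        // No match yet, but the end of input was hit: "ca" might still grow into "cat".
        System.out.println(Arrays.toString(probe("cat", "ca")));      // [false, true, false]
        // Matched only because $ sits at the end of input: more input could undo the match.
        System.out.println(Arrays.toString(probe("cat$", "cat")));    // [true, true, true]
        // Matched well before the end: more input cannot affect this match.
        System.out.println(Arrays.toString(probe("cat", "cat dog"))); // [true, false, false]
    }
}
```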
|
public java.util.regex.Matcher | reset() Resets this matcher.
Resetting a matcher discards all of its explicit state information
and sets its append position to zero. The matcher's region is set to the
default region, which is its entire character sequence. The anchoring
and transparency of this matcher's region boundaries are unaffected.
first = -1;
last = 0;
oldLast = -1;
for(int i=0; i<groups.length; i++)
groups[i] = -1;
for(int i=0; i<locals.length; i++)
locals[i] = -1;
lastAppendPosition = 0;
from = 0;
to = getTextLength();
return this;
|
public java.util.regex.Matcher | reset(java.lang.CharSequence input) Resets this matcher with a new input sequence.
Resetting a matcher discards all of its explicit state information
and sets its append position to zero. The matcher's region is set to
the default region, which is its entire character sequence. The
anchoring and transparency of this matcher's region boundaries are
unaffected.
text = input;
return reset();
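One use of this method is to reuse a single matcher across many inputs instead of creating a new one each time. A minimal sketch, with the hypothetical names `ResetSketch` and `countMatches`:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ResetSketch {
    // Hypothetical helper: counts the matches in input, reusing an existing matcher.
    static int countMatches(Matcher m, CharSequence input) {
        m.reset(input); // new input, match state and region start from scratch
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        Matcher m = Pattern.compile("\\d+").matcher("");
        System.out.println(countMatches(m, "1 22 333")); // 3
        System.out.println(countMatches(m, "none"));     // 0
        System.out.println(countMatches(m, "7"));        // 1
    }
}
```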
|
boolean | search(int from) Initiates a search to find a Pattern within the given bounds.
The groups are filled with default values and the match of the root
of the state machine is called. The state machine will hold the state
of the match as it proceeds in this matcher.
Matcher.from is not set here, because it is the "hard" boundary
of the start of the search, which is where anchors match. The from param
is the "soft" boundary of the start of the search, meaning that the
regex tries to match at that index but ^ won't match there. Subsequent
calls to the search methods start at a new "soft" boundary which is
the end of the previous match.
this.hitEnd = false;
this.requireEnd = false;
from = from < 0 ? 0 : from;
this.first = from;
this.oldLast = oldLast < 0 ? from : oldLast;
for (int i = 0; i < groups.length; i++)
groups[i] = -1;
acceptMode = NOANCHOR;
boolean result = parentPattern.root.match(this, from, text);
if (!result)
this.first = -1;
this.oldLast = this.last;
return result;
|
public int | start() Returns the start index of the previous match.
if (first < 0)
throw new IllegalStateException("No match available");
return first;
|
public int | start(int group) Returns the start index of the subsequence captured by the given group
during the previous match operation.
Capturing groups are indexed from left
to right, starting at one. Group zero denotes the entire pattern, so
the expression m.start(0) is equivalent to
m.start().
if (first < 0)
throw new IllegalStateException("No match available");
if (group > groupCount())
throw new IndexOutOfBoundsException("No group " + group);
return groups[group * 2];
|
public java.util.regex.MatchResult | toMatchResult() Returns the match state of this matcher as a {@link MatchResult}.
The result is unaffected by subsequent operations performed upon this
matcher.
Matcher result = new Matcher(this.parentPattern, text.toString());
result.first = this.first;
result.last = this.last;
result.groups = (int[])(this.groups.clone());
return result;
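Because the returned result is a snapshot, it can be stored while the matcher moves on. The sketch below, with the hypothetical names `MatchResultSketch` and `allResults`, collects one snapshot per match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchResultSketch {
    // Hypothetical helper: takes a snapshot of every match in input.
    static List<MatchResult> allResults(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<MatchResult> out = new ArrayList<>();
        while (m.find()) {
            // Each snapshot keeps its own offsets even though m keeps searching.
            out.add(m.toMatchResult());
        }
        return out;
    }

    public static void main(String[] args) {
        for (MatchResult r : allResults("\\d+", "1 22 333")) {
            System.out.println(r.start() + "-" + r.end() + ":" + r.group());
        }
    }
}
```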
|
public java.lang.String | toString() Returns the string representation of this matcher. The
string representation of a Matcher contains information
that may be useful for debugging. The exact format is unspecified.
StringBuffer sb = new StringBuffer();
sb.append("java.util.regex.Matcher");
sb.append("[pattern=" + pattern());
sb.append(" region=");
sb.append(regionStart() + "," + regionEnd());
sb.append(" lastmatch=");
if ((first >= 0) && (group() != null)) {
sb.append(group());
}
sb.append("]");
return sb.toString();
|
public java.util.regex.Matcher | useAnchoringBounds(boolean b) Sets the anchoring of region bounds for this matcher.
Invoking this method with an argument of true will set this
matcher to use anchoring bounds. If the boolean
argument is false, then non-anchoring bounds will be
used.
Using anchoring bounds, the boundaries of this
matcher's region match anchors such as ^ and $.
Without anchoring bounds, the boundaries of this
matcher's region will not match anchors such as ^ and $.
By default, a matcher uses anchoring region boundaries.
anchoringBounds = b;
return this;
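The effect of anchoring bounds is easiest to see with a region that starts in the middle of the input. In the sketch below, `AnchoringBoundsSketch` and `findsAtRegionStart` are hypothetical names. The pattern `^foo` is tried against the region covering `foo` in `xxfoo`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AnchoringBoundsSketch {
    // Hypothetical helper: does ^foo match at the start of the region [2, 5) of "xxfoo"?
    static boolean findsAtRegionStart(boolean anchoring) {
        Matcher m = Pattern.compile("^foo").matcher("xxfoo");
        m.region(2, 5).useAnchoringBounds(anchoring);
        return m.find();
    }

    public static void main(String[] args) {
        // Anchoring bounds (the default): ^ matches at the region start, index 2.
        System.out.println(findsAtRegionStart(true));  // true
        // Non-anchoring bounds: ^ only matches at the real start of the input, index 0.
        System.out.println(findsAtRegionStart(false)); // false
    }
}
```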
|
public java.util.regex.Matcher | usePattern(java.util.regex.Pattern newPattern) Changes the Pattern that this Matcher uses to
find matches with.
This method causes this matcher to lose information
about the groups of the last match that occurred. The
matcher's position in the input is maintained and its
last append position is unaffected.
if (newPattern == null)
throw new IllegalArgumentException("Pattern cannot be null");
parentPattern = newPattern;
// Reallocate state storage
int parentGroupCount = Math.max(newPattern.capturingGroupCount, 10);
groups = new int[parentGroupCount * 2];
locals = new int[newPattern.localCount];
for (int i = 0; i < groups.length; i++)
groups[i] = -1;
for (int i = 0; i < locals.length; i++)
locals[i] = -1;
return this;
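Since the matcher's position in the input is kept, this method can be used to switch patterns part-way through a scan. A small sketch under that reading, with the hypothetical names `UsePatternSketch` and `numberThenWord`:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UsePatternSketch {
    // Hypothetical helper: reads a number, then a word that follows it, with one matcher.
    static String[] numberThenWord(String input) {
        Matcher m = Pattern.compile("\\d+").matcher(input);
        if (!m.find()) {
            return null;
        }
        String number = m.group();
        // Switch patterns; the next find() continues after the number just matched.
        m.usePattern(Pattern.compile("[a-z]+"));
        String word = m.find() ? m.group() : null;
        return new String[] { number, word };
    }

    public static void main(String[] args) {
        String[] parts = numberThenWord("ab 123 cd");
        // "ab" precedes the number, so the second pattern finds "cd" rather than "ab".
        System.out.println(parts[0] + " " + parts[1]); // 123 cd
    }
}
```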
|
public java.util.regex.Matcher | useTransparentBounds(boolean b) Sets the transparency of region bounds for this matcher.
Invoking this method with an argument of true will set this
matcher to use transparent bounds. If the boolean
argument is false, then opaque bounds will be used.
Using transparent bounds, the boundaries of this
matcher's region are transparent to lookahead, lookbehind,
and boundary matching constructs. Those constructs can see beyond the
boundaries of the region to see if a match is appropriate.
Using opaque bounds, the boundaries of this matcher's
region are opaque to lookahead, lookbehind, and boundary matching
constructs that may try to see beyond them. Those constructs cannot
look past the boundaries so they will fail to match anything outside
of the region.
By default, a matcher uses opaque bounds.
transparentBounds = b;
return this;
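A word-boundary check at the edge of a region shows the difference between the two modes. In this sketch, `TransparentBoundsSketch` and `findsWholeWord` are hypothetical names, and the region covers only `car` inside `Madagascar`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TransparentBoundsSketch {
    // Hypothetical helper: is "car" found as a whole word in the region [7, 10) of "Madagascar"?
    static boolean findsWholeWord(boolean transparent) {
        Matcher m = Pattern.compile("\\bcar\\b").matcher("Madagascar");
        m.region(7, 10).useTransparentBounds(transparent);
        return m.find();
    }

    public static void main(String[] args) {
        // Opaque bounds (the default): \b cannot see the 's' before the region, so "car" looks like a word.
        System.out.println(findsWholeWord(false)); // true
        // Transparent bounds: \b sees the preceding 's', so there is no word boundary at index 7.
        System.out.println(findsWholeWord(true));  // false
    }
}
```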
|