Pattern
public final class Pattern extends Object implements Serializable
Represents a pattern used for matching, searching, or replacing strings.
{@code Pattern}s are specified in terms of regular expressions and compiled
using an instance of this class. They are then used in conjunction with a
{@link Matcher} to perform the actual search.
A typical use case looks like this:
Pattern p = Pattern.compile("Hello, A[a-z]*!");
Matcher m = p.matcher("Hello, Android!");
boolean b1 = m.matches(); // true
m.reset("Hello, Robot!");
boolean b2 = m.matches(); // false
The above code could also be written in a more compact fashion, though this
variant is less efficient, since {@code Pattern} and {@code Matcher} objects
are created on the fly instead of being reused:
boolean b1 = Pattern.matches("Hello, A[a-z]*!", "Hello, Android!"); // true
boolean b2 = Pattern.matches("Hello, A[a-z]*!", "Hello, Robot!"); // false
Please consult the package documentation for an
overview of the regular expression syntax used in this class as well as
Android-specific implementation details. |
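The reuse recommended above can be sketched as follows. This is a minimal illustration, not part of the class itself: the class name `PatternReuseExample` and the helper `matchAll` are hypothetical, and the sketch assumes the standard `Matcher.reset(CharSequence)` call for supplying new input to an existing matcher.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example class; only Pattern and Matcher come from the library.
class PatternReuseExample {
    // Compile the regular expression once and keep the Pattern around.
    private static final Pattern GREETING = Pattern.compile("Hello, A[a-z]*!");

    // Matches every input against the same Pattern, reusing a single Matcher.
    static boolean[] matchAll(String... inputs) {
        boolean[] results = new boolean[inputs.length];
        Matcher m = GREETING.matcher("");
        for (int i = 0; i < inputs.length; i++) {
            m.reset(inputs[i]);       // point the existing Matcher at new input
            results[i] = m.matches(); // true only if the whole input matches
        }
        return results;
    }

    public static void main(String[] args) {
        boolean[] r = matchAll("Hello, Android!", "Hello, Robot!");
        System.out.println(r[0] + " " + r[1]); // true false
    }
}
```

Keeping the compiled pattern in a constant avoids recompiling the expression on every call, which is the cost the compact `Pattern.matches(...)` form pays each time.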
Fields Summary |
---|
private static final long | serialVersionUID |
public static final int | UNIX_LINES
This constant specifies that only Unix line endings ('\n') are recognized as
line terminators by the '.', '^', and '$' meta characters. |
public static final int | CASE_INSENSITIVE
This constant specifies that a {@code Pattern} is matched
case-insensitively. That is, the patterns "a+" and "A+" would both match
the string "aAaAaA".
Note: For Android, the {@code CASE_INSENSITIVE} constant
(currently) always includes the meaning of the {@link #UNICODE_CASE}
constant. So if case insensitivity is enabled, this automatically extends
to all Unicode characters. The {@code UNICODE_CASE} constant itself has
no special consequences. |
public static final int | COMMENTS
This constant specifies that a {@code Pattern} may contain whitespace or
comments. Otherwise comments and whitespace are taken as literal
characters. |
public static final int | MULTILINE
This constant specifies that the meta characters '^' and '$' match at the
beginning and end of each input line, respectively. Normally, they match
only the beginning and the end of the complete input. |
public static final int | LITERAL
This constant specifies that the whole {@code Pattern} is to be taken
literally, that is, all meta characters lose their meanings. |
public static final int | DOTALL
This constant specifies that the '.' meta character matches arbitrary
characters, including line endings, which is normally not the case. |
public static final int | UNICODE_CASE
This constant specifies that a {@code Pattern} is matched
case-insensitively with regard to all Unicode characters. It is used in
conjunction with the {@link #CASE_INSENSITIVE} constant to extend its
meaning to all Unicode characters.
Note: For Android, the {@code CASE_INSENSITIVE} constant
(currently) always includes the meaning of the {@code UNICODE_CASE}
constant. So if case insensitivity is enabled, this automatically extends
to all Unicode characters. The {@code UNICODE_CASE} constant then has no
special consequences. |
public static final int | CANON_EQ
This constant specifies that a character in a {@code Pattern} and a
character in the input string only match if they are canonically
equivalent. It is (currently) not supported in Android. |
private String | pattern
Holds the regular expression. |
private int | flags
Holds the flags used when compiling this pattern. |
transient int | mNativePattern
Holds a handle (a pointer, actually) for the native ICU pattern. |
transient int | mGroupCount
Holds the number of groups in the pattern. |
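The flag constants are bit masks, so several can be combined with a bitwise OR and passed to `compile(String, int)`. The sketch below illustrates the effect of `DOTALL`, `MULTILINE`, and `CASE_INSENSITIVE` on ASCII input. The class name `PatternFlagsExample` and its helper methods are hypothetical, and `CANON_EQ` is deliberately left out because the text above says it is not supported here.

```java
import java.util.regex.Pattern;

// Hypothetical example class; the flag constants are the ones listed above.
class PatternFlagsExample {
    // True if the whole input matches the regex compiled with the given flags.
    static boolean matchesWith(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    // True if the regex compiled with the given flags occurs anywhere in the input.
    static boolean findsWith(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).find();
    }

    public static void main(String[] args) {
        // Without DOTALL, '.' does not match a line ending.
        System.out.println(matchesWith("a.b", 0, "a\nb"));              // false
        System.out.println(matchesWith("a.b", Pattern.DOTALL, "a\nb")); // true

        // Flags are combined with a bitwise OR.
        int flags = Pattern.CASE_INSENSITIVE | Pattern.MULTILINE;
        String text = "first\nHELLO\nlast";
        System.out.println(findsWith("^hello$", 0, text));     // false
        System.out.println(findsWith("^hello$", flags, text)); // true

        // flags() reports what was set; test individual bits with a mask.
        Pattern p = Pattern.compile("^hello$", flags);
        System.out.println((p.flags() & Pattern.MULTILINE) != 0); // true
    }
}
```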
Constructors Summary |
---|
private Pattern(String pattern, int flags)
Creates a new {@code Pattern} instance from a given regular expression
and flags.
if ((flags & CANON_EQ) != 0) {
throw new UnsupportedOperationException("CANON_EQ flag not supported");
}
this.pattern = pattern;
this.flags = flags;
compileImpl(pattern, flags);
|
Methods Summary |
---|
public static java.util.regex.Pattern | compile(java.lang.String pattern)
Compiles a regular expression, creating a new {@code Pattern} instance in the
process. This is a convenience method that calls {@link
#compile(String, int)} with a {@code flags} value of zero.
return new Pattern(pattern, 0);
| public static java.util.regex.Pattern | compile(java.lang.String pattern, int flags)
Compiles a regular expression, creating a new {@code Pattern} instance in
the process. Allows setting flags that modify the behavior of the
{@code Pattern}.
return new Pattern(pattern, flags);
| private void | compileImpl(java.lang.String pattern, int flags)
Compiles the given regular expression using the given flags. Used
internally only.
if (pattern == null) {
throw new NullPointerException();
}
if ((flags & LITERAL) != 0) {
pattern = quote(pattern);
}
// These are the flags natively supported by ICU.
// They even have the same value in native code.
flags = flags & (CASE_INSENSITIVE | COMMENTS | MULTILINE | DOTALL | UNIX_LINES);
mNativePattern = NativeRegEx.open(pattern, flags);
mGroupCount = NativeRegEx.groupCount(mNativePattern);
| protected void | finalize()
try {
if (mNativePattern != 0) {
NativeRegEx.close(mNativePattern);
}
}
finally {
super.finalize();
}
| public int | flags()
Returns the flags that have been set for this {@code Pattern}.
return flags;
| public java.util.regex.Matcher | matcher(java.lang.CharSequence input)
Returns a {@link Matcher} for the {@code Pattern} and a given input. The
{@code Matcher} can be used to match the {@code Pattern} against the
whole input, find occurrences of the {@code Pattern} in the input, or
replace parts of the input.
return new Matcher(this, input);
| public static boolean | matches(java.lang.String regex, java.lang.CharSequence input)
Tries to match a given regular expression against a given input. This is
a convenience method that compiles the regular
expression into a {@code Pattern}, builds a {@link Matcher} for it, and
then does the match. If the same regular expression is used for multiple
operations, it is recommended to compile it into a {@code Pattern}
explicitly and request a reusable {@code Matcher}.
return new Matcher(new Pattern(regex, 0), input).matches();
| public java.lang.String | pattern()
Returns the regular expression that was compiled into this
{@code Pattern}.
return pattern;
| public static java.lang.String | quote(java.lang.String s)
Quotes a given string using "\Q" and "\E", so that all other
meta-characters lose their special meaning. If the string is used for a
{@code Pattern} afterwards, it can only be matched literally.
StringBuffer sb = new StringBuffer().append("\\Q");
int apos = 0;
int k;
while ((k = s.indexOf("\\E", apos)) >= 0) {
sb.append(s.substring(apos, k + 2)).append("\\\\E\\Q");
apos = k + 2;
}
return sb.append(s.substring(apos)).append("\\E").toString();
| private void | readObject(java.io.ObjectInputStream s)
Provides serialization support.
s.defaultReadObject();
compileImpl(pattern, flags);
| public java.lang.String[] | split(java.lang.CharSequence inputSeq, int limit)
Splits the given input sequence around occurrences of the {@code Pattern}.
The function first determines all occurrences of the {@code Pattern}
inside the input sequence. It then builds an array of the
"remaining" strings before, in-between, and after these
occurrences. The {@code limit} parameter determines the maximum number of
entries in the resulting array and the handling of trailing empty
strings: a positive limit caps the number of entries, a limit of zero
removes trailing empty strings, and a negative limit keeps them.
int maxLength = limit <= 0 ? Integer.MAX_VALUE : limit;
String input = inputSeq.toString();
ArrayList<String> list = new ArrayList<String>();
Matcher matcher = new Matcher(this, inputSeq);
int savedPos = 0;
// Add text preceding each occurrence, if enough space. Only do this for
// non-empty input sequences, because otherwise we'd add the "trailing
// empty string" twice.
if (inputSeq.length() != 0) {
while(matcher.find() && list.size() + 1 < maxLength) {
list.add(input.substring(savedPos, matcher.start()));
savedPos = matcher.end();
}
}
// Add trailing text if enough space.
if (list.size() < maxLength) {
if (savedPos < input.length()) {
list.add(input.substring(savedPos));
} else {
list.add("");
}
}
// Remove trailing empty strings, if limit == 0 is requested.
if (limit == 0) {
int i = list.size() - 1;
// Don't remove 1st element, since array must not be empty.
while(i > 0 && "".equals(list.get(i))) {
list.remove(i);
i--;
}
}
return list.toArray(new String[list.size()]);
| public java.lang.String[] | split(java.lang.CharSequence input)
Splits a given input around occurrences of a regular expression. This is
a convenience method that is equivalent to calling the method
{@link #split(java.lang.CharSequence, int)} with a limit of 0.
return split(input, 0);
| public java.lang.String | toString()
return pattern;
|
|
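The interaction between `quote` and the `limit` argument of `split` may be easier to see in a short example. The following is a minimal sketch under stated assumptions: the class name `SplitLimitExample` and the helper `splitOnDot` are hypothetical, and the delimiter is quoted so that "." is matched literally rather than as a meta character.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical example class; only Pattern.quote, compile and split come from the library.
class SplitLimitExample {
    // quote(".") yields \Q.\E, so the dot is matched literally.
    private static final Pattern DOT = Pattern.compile(Pattern.quote("."));

    // Splits the input around literal dots, honoring the given limit.
    static String[] splitOnDot(String input, int limit) {
        return DOT.split(input, limit);
    }

    public static void main(String[] args) {
        String input = "a.b..c..";
        // limit == 0: no cap on entries, trailing empty strings are removed.
        System.out.println(Arrays.toString(splitOnDot(input, 0)));  // [a, b, , c]
        // limit < 0: no cap on entries, trailing empty strings are kept.
        System.out.println(Arrays.toString(splitOnDot(input, -1))); // [a, b, , c, , ]
        // limit > 0: at most that many entries; the last one holds the rest.
        System.out.println(Arrays.toString(splitOnDot(input, 2)));  // [a, b..c..]
    }
}
```

Empty strings in the middle of the result are always kept; only trailing ones are affected by a limit of zero.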