Fields Summary |
---|
public static final String | HTTP_SEPARATORS
The HTTP separator characters. Defined in RFC 2616, section 2.2. |
protected final HeaderIterator | headerIt
The iterator from which to obtain the next header. |
protected String | currentHeader
The value of the current header.
This is the header value that includes {@link #currentToken}.
Undefined if the iteration is over. |
protected String | currentToken
The token to be returned by the next call to {@link #nextToken}.
null if the iteration is over. |
protected int | searchPos
The position after {@link #currentToken} in {@link #currentHeader}.
Undefined if the iteration is over. |
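The value of HTTP_SEPARATORS is not shown in this summary. As a rough illustration, the sketch below transcribes the separator list from RFC 2616, section 2.2 into a constant and checks characters against it with indexOf. The class name SeparatorSketch and the constant RFC2616_SEPARATORS are hypothetical, and the character order in the real field may differ.

```java
// Sketch only: the separator list is transcribed from RFC 2616, section 2.2.
// The actual value of HTTP_SEPARATORS is not shown in this summary.
final class SeparatorSketch {

    // ( ) < > @ , ; : \ " / [ ] ? = { } plus space and horizontal tab
    static final String RFC2616_SEPARATORS = "()<>@,;:\\\"/[]?={} \t";

    // Returns true if ch is one of the RFC 2616 separator characters.
    static boolean isSeparator(char ch) {
        return RFC2616_SEPARATORS.indexOf(ch) >= 0;
    }

    public static void main(String[] args) {
        System.out.println("comma is a separator: " + isSeparator(','));
        System.out.println("hyphen is a separator: " + isSeparator('-'));
    }
}
```

A plain string lookup is sufficient here because the separator set is small and fixed.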
Methods Summary |
---|
protected java.lang.String | createToken(java.lang.String value, int start, int end)
Creates a new token to be returned.
Called from {@link #findNext findNext} after the token is identified.
The default implementation simply calls
{@link java.lang.String#substring String.substring}.
If header values are significantly longer than tokens, and some
tokens are permanently referenced by the application, there can
be problems with garbage collection. A substring will hold a
reference to the full characters of the original string and
therefore occupies more memory than might be expected.
To avoid this, override this method and create a new string
instead of a substring.
return value.substring(start, end);
|
protected int | findNext(int from)
Determines the next token.
If found, the token is stored in {@link #currentToken}.
The return value indicates the position after the token
in {@link #currentHeader}. If necessary, the next header
will be obtained from {@link #headerIt}.
If not found, {@link #currentToken} is set to null.
if (from < 0) {
// called from the constructor, initialize the first header
if (!this.headerIt.hasNext()) {
return -1;
}
this.currentHeader = this.headerIt.nextHeader().getValue();
from = 0;
} else {
// called after a token, make sure there is a separator
from = findTokenSeparator(from);
}
int start = findTokenStart(from);
if (start < 0) {
this.currentToken = null;
return -1; // nothing found
}
int end = findTokenEnd(start);
this.currentToken = createToken(this.currentHeader, start, end);
return end;
|
protected int | findTokenEnd(int from)
Determines the ending position of the current token.
This method will not leave the current header value,
since the end of the header value is a token boundary.
if (from < 0) {
throw new IllegalArgumentException
("Token start position must not be negative: " + from);
}
final int to = this.currentHeader.length();
int end = from+1;
while ((end < to) && isTokenChar(this.currentHeader.charAt(end))) {
end++;
}
return end;
|
protected int | findTokenSeparator(int from)
Determines the position of the next token separator.
Because of multi-header joining rules, the end of a
header value is a token separator. This method therefore
does not need to iterate over headers.
if (from < 0) {
throw new IllegalArgumentException
("Search position must not be negative: " + from);
}
boolean found = false;
final int to = this.currentHeader.length();
while (!found && (from < to)) {
final char ch = this.currentHeader.charAt(from);
if (isTokenSeparator(ch)) {
found = true;
} else if (isWhitespace(ch)) {
from++;
} else if (isTokenChar(ch)) {
throw new ParseException
("Tokens without separator (pos " + from +
"): " + this.currentHeader);
} else {
throw new ParseException
("Invalid character after token (pos " + from +
"): " + this.currentHeader);
}
}
return from;
|
protected int | findTokenStart(int from)
Determines the starting position of the next token.
This method will iterate over headers if necessary.
if (from < 0) {
throw new IllegalArgumentException
("Search position must not be negative: " + from);
}
boolean found = false;
while (!found && (this.currentHeader != null)) {
final int to = this.currentHeader.length();
while (!found && (from < to)) {
final char ch = this.currentHeader.charAt(from);
if (isTokenSeparator(ch) || isWhitespace(ch)) {
// whitespace and token separators are skipped
from++;
} else if (isTokenChar(ch)) {
// found the start of a token
found = true;
} else {
throw new ParseException
("Invalid character before token (pos " + from +
"): " + this.currentHeader);
}
}
if (!found) {
if (this.headerIt.hasNext()) {
this.currentHeader = this.headerIt.nextHeader().getValue();
from = 0;
} else {
this.currentHeader = null;
}
}
} // while headers
return found ? from : -1;
|
public boolean | hasNext()
return (this.currentToken != null);
|
protected boolean | isHttpSeparator(char ch)
Checks whether a character is an HTTP separator.
The implementation in this class checks only for the HTTP separators
defined in RFC 2616, section 2.2. If you need to detect other
separators beyond the US-ASCII character set, override this method.
return (HTTP_SEPARATORS.indexOf(ch) >= 0);
|
protected boolean | isTokenChar(char ch)
Checks whether a character is a valid token character.
Whitespace, control characters, and HTTP separators are not
valid token characters. The HTTP specification (RFC 2616, section 2.2)
defines tokens only for the US-ASCII character set; this
method extends the definition to other character sets.
// common sense extension of ALPHA + DIGIT
if (Character.isLetterOrDigit(ch))
return true;
// common sense extension of CTL
if (Character.isISOControl(ch))
return false;
// no common sense extension for this
if (isHttpSeparator(ch))
return false;
// RFC 2616, section 2.2 defines a token character as
// "any CHAR except CTLs or separators". The controls
// and separators are included in the checks above.
// This will yield unexpected results for Unicode format characters.
// If that is a problem, override isHttpSeparator(char) to filter
// out the false positives.
return true;
|
protected boolean | isTokenSeparator(char ch)
Checks whether a character is a token separator.
RFC 2616, section 2.1 defines comma as the separator for
#token sequences. The end of a header value will
also separate tokens, but that is not a character check.
return (ch == ',');
|
protected boolean | isWhitespace(char ch)
Checks whether a character is a whitespace character.
RFC 2616, section 2.2 defines space and horizontal tab as whitespace.
The optional preceding line break is irrelevant, since header
continuation is handled transparently when parsing messages.
// we do not use Character.isWhitespace(ch) here, since that allows
// many control characters which are not whitespace as per RFC 2616
return ((ch == '\t') || Character.isSpaceChar(ch));
|
public final java.lang.Object | next()
Returns the next token.
Same as {@link #nextToken}, but with generic return type.
return nextToken();
|
public java.lang.String | nextToken()
Obtains the next token from this iteration.
if (this.currentToken == null) {
throw new NoSuchElementException("Iteration already finished.");
}
final String result = this.currentToken;
// updates currentToken, may trigger ParseException:
this.searchPos = findNext(this.searchPos);
return result;
|
public final void | remove()
Removing tokens is not supported.
throw new UnsupportedOperationException
("Removing tokens is not supported.");
|
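To see how the scanning steps above fit together (skip separators and whitespace, read a token, then require a separator or the end of the value), here is a minimal standalone sketch that works on plain header value strings instead of a HeaderIterator. It is not the class itself: the name TokenSketch, the SEPARATORS constant (transcribed from RFC 2616, section 2.2), the simplified whitespace check (space and tab only), and the use of IllegalArgumentException in place of ParseException are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical standalone sketch; it mirrors the documented scanning logic
// but does not depend on HeaderIterator or ParseException.
final class TokenSketch {

    // Separators per RFC 2616, section 2.2 (assumed value).
    static final String SEPARATORS = "()<>@,;:\\\"/[]?={} \t";

    // Simplified: only space and horizontal tab count as whitespace here.
    static boolean isWhitespace(char ch) {
        return ch == ' ' || ch == '\t';
    }

    static boolean isTokenChar(char ch) {
        if (Character.isLetterOrDigit(ch)) return true;
        if (Character.isISOControl(ch)) return false;
        return SEPARATORS.indexOf(ch) < 0;
    }

    // Collects all tokens from the given header values, in order.
    static List<String> tokens(List<String> headerValues) {
        List<String> result = new ArrayList<>();
        for (String value : headerValues) {
            final int to = value.length();
            int pos = 0;
            while (pos < to) {
                char ch = value.charAt(pos);
                if (ch == ',' || isWhitespace(ch)) {
                    pos++; // skip separators and whitespace before a token
                    continue;
                }
                if (!isTokenChar(ch)) {
                    throw new IllegalArgumentException(
                        "Invalid character before token (pos " + pos + "): " + value);
                }
                int end = pos + 1;
                while (end < to && isTokenChar(value.charAt(end))) {
                    end++; // the end of the value is also a token boundary
                }
                result.add(value.substring(pos, end));
                pos = end;
                // after a token, only whitespace may precede the next comma
                while (pos < to && isWhitespace(value.charAt(pos))) {
                    pos++;
                }
                if (pos < to && value.charAt(pos) != ',') {
                    throw new IllegalArgumentException(
                        "Tokens without separator (pos " + pos + "): " + value);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens(Arrays.asList("close, keep-alive", " TE ,, trailers")));
    }
}
```

Unlike the iterator, which finds one token per call and keeps its position in searchPos, this sketch collects every token eagerly. That keeps the example short but gives up the lazy, header-by-header behaviour described above.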