This class, whose name is an acronym for "Trivial API for XML", is a
container for a simple Parser class for parsing XML and its related Token,
TokenType and ParseException classes and constants.
TAX.Parser is a simple, lightweight pull-parser that is useful for a variety
of simple XML parsing tasks. Note, however, that it is more of a tokenizer
than a true parser and that the grammar it parses is not actually XML, but a
simplified subset of XML. The parser has (at least) these limitations:
It does not enforce well-formedness. For example, it does not require
tags to be properly nested.
It is not a validating parser, and does not read external DTDs.
It does not parse the internal subset of the DOCTYPE tag, and cannot
recognize any entities defined there.
It is not namespace-aware.
It does not handle entity or character references in attribute values,
not even pre-defined entities such as &quot;.
It strips all whitespace from the start and end of document text, which,
while useful for many documents, is not generally correct.
It makes no attempt at error recovery. The results of calling next()
after a ParseException is thrown are undefined.
It does not provide enough detail to reconstruct the source document.
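To make "more of a tokenizer than a true parser" concrete, here is a minimal
sketch of a pull-style tokenizer. It is not the TAX.Parser source and does not
use its API: the class PullSketch, its Kind and Tok types, and the tokenize()
helper are hypothetical names invented for this illustration. The sketch only
assumes the behavior described above: tokens are pulled one at a time with
next(), leading and trailing whitespace is stripped from text, and tag nesting
is never checked.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is NOT the TAX.Parser source or API.
class PullSketch {
    enum Kind { TAG, ENDTAG, TEXT }

    static final class Tok {
        final Kind kind;
        final String text;
        Tok(Kind kind, String text) { this.kind = kind; this.text = text; }
        @Override public String toString() { return kind + "(" + text + ")"; }
    }

    private final String src;
    private int pos = 0;

    PullSketch(String src) { this.src = src; }

    // Returns the next token, or null at end of input.
    // No stack of open tags is kept, so mis-nested input is accepted.
    Tok next() {
        while (pos < src.length()) {
            if (src.charAt(pos) == '<') {
                int close = src.indexOf('>', pos);
                if (close < 0) {
                    throw new IllegalStateException("unterminated tag at " + pos);
                }
                String body = src.substring(pos + 1, close).trim();
                pos = close + 1;
                if (body.startsWith("/")) {
                    return new Tok(Kind.ENDTAG, body.substring(1).trim());
                }
                return new Tok(Kind.TAG, body);
            }
            // Text runs up to the next '<' (or end of input).
            int lt = src.indexOf('<', pos);
            if (lt < 0) lt = src.length();
            String text = src.substring(pos, lt).trim(); // strips surrounding whitespace
            pos = lt;
            if (!text.isEmpty()) return new Tok(Kind.TEXT, text);
            // Whitespace-only run: produce no token and keep scanning.
        }
        return null;
    }

    // Convenience helper: drain the tokenizer into a list of printable tokens.
    static List<String> tokenize(String xml) {
        PullSketch p = new PullSketch(xml);
        List<String> out = new ArrayList<>();
        for (Tok t = p.next(); t != null; t = p.next()) out.add(t.toString());
        return out;
    }

    public static void main(String[] args) {
        // The tags are mis-nested, yet every token is returned without complaint.
        System.out.println(tokenize("<a><b> hi </a></b>"));
        // prints [TAG(a), TAG(b), TEXT(hi), ENDTAG(a), ENDTAG(b)]
    }
}
```

A caller that needs well-formedness would have to keep its own stack of open
tag names and compare each ENDTAG against the top of that stack.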
TAX.Parser always replaces entity references with their values, or throws
a TAX.ParseException if no replacement value is known. The parser coalesces
adjacent text and entities into a single TEXT token. CDATA sections are
also returned as TEXT tokens, but are not coalesced.
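The replace-or-throw rule for entity references can be sketched as follows.
This is an illustration under stated assumptions, not the TAX implementation:
EntitySketch and expand() are hypothetical names, IllegalArgumentException
stands in for TAX.ParseException, only the five entities predefined by XML are
known, and support for numeric character references in text is assumed rather
than confirmed by the description above. Because the expansion appends to one
buffer, adjacent text and references naturally end up in a single string, which
is the effect of coalescing them into one TEXT token.

```java
import java.util.Map;

// Hypothetical illustration only; this is NOT the TAX.Parser source or API.
class EntitySketch {
    // The five entities predefined by the XML specification.
    static final Map<String, String> PREDEFINED = Map.of(
        "lt", "<", "gt", ">", "amp", "&", "quot", "\"", "apos", "'");

    // Replaces every reference in raw with its value, or throws if the
    // reference is malformed or names an unknown entity.
    static String expand(String raw) {
        StringBuilder out = new StringBuilder();
        int pos = 0;
        while (pos < raw.length()) {
            char c = raw.charAt(pos);
            if (c != '&') {          // ordinary character: copy it through
                out.append(c);
                pos++;
                continue;
            }
            int semi = raw.indexOf(';', pos);
            if (semi < 0) {
                throw new IllegalArgumentException("unterminated reference at " + pos);
            }
            String name = raw.substring(pos + 1, semi);
            if (name.startsWith("#x")) {         // hexadecimal character reference
                out.appendCodePoint(Integer.parseInt(name.substring(2), 16));
            } else if (name.startsWith("#")) {   // decimal character reference
                out.appendCodePoint(Integer.parseInt(name.substring(1)));
            } else {                             // named entity
                String value = PREDEFINED.get(name);
                if (value == null) {
                    throw new IllegalArgumentException("unknown entity: " + name);
                }
                out.append(value);
            }
            pos = semi + 1;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(expand("1 &lt; 2 &amp;&amp; 3 &gt; 2"));
        // prints 1 < 2 && 3 > 2
    }
}
```

An entity declared only in the internal DTD subset, such as a document-specific
&nbsp;, would fall into the "unknown entity" branch here, which matches the
limitation noted above for the DOCTYPE internal subset.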