ContentSpec exists to help the parser classes implement
access to the grammar.
This class is used by the DTD scanner and the validator classes,
allowing them to be used separately or together. This "struct"
class is used to build content models for validation, where it
is more efficient to fetch all of the information for each of
these content model "fragments" than to fetch each field one at
a time. Since a configuration may have a validator without a DTD
scanner (e.g. a schema validator) or a DTD scanner without a
validator (e.g. a non-validating processor), this class can be
used by each without requiring the presence of the other.
When processing element declarations, the DTD scanner will build
up a representation of the content model using the node types that
are defined here. Since a non-validating processor only needs to
remember the type of content model declared (i.e. ANY, EMPTY, MIXED,
or CHILDREN), it is free to discard the specific details of the
MIXED and CHILDREN content models described using this class.
In the typical case of a validating processor reading the grammar
of the document from a DTD, the information about the content model
declared will be preserved and later "compiled" into an efficient
form for use during element validation. Each content spec node
that is saved is assigned a unique index, which other content spec
nodes store in their "value" or "otherValue" fields as a handle to it.
A leaf node has a "value" that is either an index in the string
pool of the element type of that leaf, or a value of -1 to indicate
the special "#PCDATA" leaf type used in a mixed content model.
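The indexing scheme described above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the class name ContentSpecLeafSketch, its methods, the LEAF type code, the UNUSED filler and the toy string pool are all assumptions made for the example. Only the use of -1 as the "#PCDATA" leaf value comes from the description.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; class, method, and constant names are illustrative only.
final class ContentSpecLeafSketch {
    static final int LEAF = 0;     // assumed type code for a leaf node
    static final int PCDATA = -1;  // leaf "value" marking the "#PCDATA" leaf
    static final int UNUSED = -1;  // this sketch's filler for a field a node does not use

    private final List<String> stringPool = new ArrayList<>();
    private final List<int[]> nodes = new ArrayList<>();

    /** Returns the string pool index for a name, adding the name if needed. */
    int intern(String name) {
        int index = stringPool.indexOf(name);
        if (index < 0) {
            stringPool.add(name);
            index = stringPool.size() - 1;
        }
        return index;
    }

    /** Saves a node and returns its unique index, usable as a handle. */
    int addNode(int type, int value, int otherValue) {
        nodes.add(new int[] {type, value, otherValue});
        return nodes.size() - 1;
    }

    /** Fetches all fields of a node in one call as {type, value, otherValue}. */
    int[] getNode(int index) {
        return nodes.get(index).clone();
    }

    /** Resolves a leaf node to "#PCDATA" or its element type name. */
    String leafName(int index) {
        int[] node = getNode(index);
        if (node[0] != LEAF) {
            throw new IllegalArgumentException("node " + index + " is not a leaf");
        }
        return node[1] == PCDATA ? "#PCDATA" : stringPool.get(node[1]);
    }

    public static void main(String[] args) {
        ContentSpecLeafSketch store = new ContentSpecLeafSketch();
        int pcdata = store.addNode(LEAF, PCDATA, UNUSED);
        int foo = store.addNode(LEAF, store.intern("foo"), UNUSED);
        System.out.println(pcdata + " -> " + store.leafName(pcdata));
        System.out.println(foo + " -> " + store.leafName(foo));
    }
}
```

Returning all three fields from getNode in one call mirrors the "struct" idea: a caller building a content model reads a whole fragment at once instead of one field at a time.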
For a mixed content model, the content spec will be made up of
leaf and choice content spec nodes, with an optional "zero or more"
node. For example, the mixed content declaration "(#PCDATA)" would
contain a single leaf node with a node value of -1. A mixed content
declaration of "(#PCDATA|foo)*" would have a content spec consisting
of two leaf nodes, for the "#PCDATA" and "foo" choices, a choice node
with the "value" set to the index of the "#PCDATA" leaf node and the
"otherValue" set to the index of the "foo" leaf node, and a "zero or
more" node with the "value" set to the index of the choice node. If
the content model has more choices, for example "(#PCDATA|a|b)*", then
there will be more corresponding choice and leaf nodes. The choice
nodes are chained together through the "value" field, with each
leaf node referenced by the "otherValue" field.
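The mixed content layout described above could be built along these lines. This is a sketch under stated assumptions: the class MixedContentSketch, its type codes, the order in which nodes are saved, and the use of -1 for an unused "otherValue" are all hypothetical. The chaining of choice nodes through "value" and the -1 value for the "#PCDATA" leaf follow the description.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names and type codes are illustrative only.
final class MixedContentSketch {
    static final int LEAF = 0, ZERO_OR_MORE = 1, CHOICE = 2; // assumed type codes
    static final int PCDATA = -1;  // leaf "value" for "#PCDATA"
    static final int UNUSED = -1;  // this sketch's filler for an unused field

    private final List<String> stringPool = new ArrayList<>();
    private final List<int[]> nodes = new ArrayList<>();

    /** Saves a node as {type, value, otherValue} and returns its index. */
    int add(int type, int value, int otherValue) {
        nodes.add(new int[] {type, value, otherValue});
        return nodes.size() - 1;
    }

    /** Returns a copy of the node stored at the given index. */
    int[] node(int index) {
        return nodes.get(index).clone();
    }

    /** Builds "(#PCDATA)" for no names, else "(#PCDATA|n1|n2...)*"; returns the root index. */
    int buildMixed(String... names) {
        int current = add(LEAF, PCDATA, UNUSED);
        if (names.length == 0) {
            return current;
        }
        for (String name : names) {
            int poolIndex = stringPool.indexOf(name);
            if (poolIndex < 0) {
                stringPool.add(name);
                poolIndex = stringPool.size() - 1;
            }
            int leaf = add(LEAF, poolIndex, UNUSED);
            // "value" points at the chain built so far, "otherValue" at the new leaf.
            current = add(CHOICE, current, leaf);
        }
        return add(ZERO_OR_MORE, current, UNUSED);
    }

    /** Renders the leaf or choice chain rooted at the given index. */
    private String render(int index) {
        int[] n = nodes.get(index);
        if (n[0] == LEAF) {
            return n[1] == PCDATA ? "#PCDATA" : stringPool.get(n[1]);
        }
        if (n[0] == CHOICE) {
            return render(n[1]) + "|" + render(n[2]);
        }
        throw new IllegalStateException("unexpected node type " + n[0]);
    }

    /** Reconstructs the declaration text from the root node. */
    String declaration(int root) {
        int[] n = nodes.get(root);
        return n[0] == ZERO_OR_MORE ? "(" + render(n[1]) + ")*" : "(" + render(root) + ")";
    }

    public static void main(String[] args) {
        MixedContentSketch spec = new MixedContentSketch();
        int root = spec.buildMixed("a", "b");
        System.out.println(spec.declaration(root)); // prints (#PCDATA|a|b)*
    }
}
```

With this node order, "(#PCDATA|a|b)*" occupies six nodes: the "#PCDATA" leaf at index 0, the "a" leaf at 1, a choice at 2, the "b" leaf at 3, a second choice at 4 whose "value" is 2, and the "zero or more" node at 5.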
For element content models, there are also sequence nodes and "zero or
one" and "one or more" nodes. The leaf nodes always have a valid
string pool index, as the "#PCDATA" leaf is not used in the declarations
for element content models.
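As an illustration of an element content model, the declaration "(title,para+,note?)" might be represented as below. This is a sketch only: the class ElementContentSketch, the type codes, and the helper methods are hypothetical, and the assumption that sequence nodes are binary and chained through "value" in the same way as choice nodes is not stated in the description.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names and type codes are illustrative only.
final class ElementContentSketch {
    static final int LEAF = 0, ZERO_OR_ONE = 1, ZERO_OR_MORE = 2,
                     ONE_OR_MORE = 3, CHOICE = 4, SEQ = 5; // assumed type codes
    static final int UNUSED = -1; // this sketch's filler for an unused field

    private final List<String> stringPool = new ArrayList<>();
    private final List<int[]> nodes = new ArrayList<>();

    /** Saves a node as {type, value, otherValue} and returns its index. */
    int add(int type, int value, int otherValue) {
        nodes.add(new int[] {type, value, otherValue});
        return nodes.size() - 1;
    }

    /** Adds a leaf whose "value" is always a valid string pool index. */
    int leaf(String name) {
        int poolIndex = stringPool.indexOf(name);
        if (poolIndex < 0) {
            stringPool.add(name);
            poolIndex = stringPool.size() - 1;
        }
        return add(LEAF, poolIndex, UNUSED);
    }

    /** Returns a copy of the node stored at the given index. */
    int[] node(int index) {
        return nodes.get(index).clone();
    }

    /** Renders a node, parenthesizing every binary node to expose the structure. */
    String render(int index) {
        int[] n = nodes.get(index);
        switch (n[0]) {
            case LEAF:         return stringPool.get(n[1]);
            case ZERO_OR_ONE:  return render(n[1]) + "?";
            case ZERO_OR_MORE: return render(n[1]) + "*";
            case ONE_OR_MORE:  return render(n[1]) + "+";
            case CHOICE:       return "(" + render(n[1]) + "|" + render(n[2]) + ")";
            case SEQ:          return "(" + render(n[1]) + "," + render(n[2]) + ")";
            default: throw new IllegalStateException("unknown node type " + n[0]);
        }
    }

    /** Builds "(title,para+,note?)" and returns the root index. */
    static int buildExample(ElementContentSketch spec) {
        int title = spec.leaf("title");
        int paraPlus = spec.add(ONE_OR_MORE, spec.leaf("para"), UNUSED);
        int firstSeq = spec.add(SEQ, title, paraPlus);
        int noteOpt = spec.add(ZERO_OR_ONE, spec.leaf("note"), UNUSED);
        return spec.add(SEQ, firstSeq, noteOpt);
    }

    public static void main(String[] args) {
        ElementContentSketch spec = new ElementContentSketch();
        int root = buildExample(spec);
        System.out.println(spec.render(root)); // prints ((title,para+),note?)
    }
}
```

The rendered form keeps the inner parentheses on purpose, so the binary nesting of the two sequence nodes is visible rather than flattened back into the original declaration text.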