This class implements the DOM's NodeList behavior for
Element.getElementsByTagName()
The DOM describes NodeList as follows:
1) It may represent EITHER nodes scattered through a subtree (when
returned by Element.getElementsByTagName), or just the immediate
children (when returned by Node.getChildNodes). The latter is easy,
but the former (which this class addresses) is more challenging.
2) Its behavior is "live" -- that is, it always reflects the
current state of the document tree. To put it another way, the
NodeLists obtained before and after a series of insertions and
deletions are effectively identical (as far as the user is
concerned, the former has been dynamically updated as the changes
have been made).
3) Its API accesses individual nodes via an integer index, with the
listed nodes numbered sequentially in the order that they were
found during a preorder depth-first left-to-right search of the tree.
(Of course in the case of getChildNodes, depth is not involved.) As
nodes are inserted or deleted in the tree, and hence the NodeList,
the numbering of nodes that follow them in the NodeList will
change.
It is rather painful to support the latter two in the
getElementsByTagName case. The current solution is for Nodes to
maintain a change count (eventually that may be a Digest instead),
which the NodeList tracks and uses to invalidate itself.
Unfortunately, this does _not_ respond efficiently in the case that
the dynamic behavior was supposed to address: scanning a tree while
it is being extended. That requires knowing which subtrees have
changed, which can become an arbitrarily complex problem.
We save some work by filling the vector only as we access the
item()s... but I suspect the same users who demanded index-based
access will also start by doing a getLength() to control their loop,
blowing this optimization out of the water.
NOTE: Level 2 of the DOM will probably _not_ use NodeList for its
extended search mechanisms, partly for the reasons just discussed. |