As we've seen, the finite state machines used to evaluate a sequence of elements against the grammar rules for a complex type are constructed by the schema compiler and embedded in the SCM file that is used as input to the validator.
A simplified validator for a simple finite state machine could be written like this:
<xsl:iterate select="$node/*">
  <xsl:param name="state" select="$initial-state" as="element(scm:state)"/>
  <xsl:on-completion>
    <xsl:if test="not($state/@final = 'true')">
      <xsl:sequence select="map{'errors': scm:error($node, 'Element content is incomplete')}"/>
    </xsl:if>
  </xsl:on-completion>
  <xsl:variable name="matching-edge" as="element(scm:edge)?"
    select="$state/scm:edge[scm:get(@term)[@name = local-name(current()) and string(@targetNamespace) = namespace-uri(current())]]"/>
  <xsl:variable name="matching-wildcard-edge" as="element(scm:edge)?"
    select="$state/scm:edge[scm:get(@term)[self::scm:wildcard[scm:wildcard-matches($containing-type, ., current())]]]"/>
  <xsl:choose>
    <xsl:when test="empty($matching-edge) and empty($matching-wildcard-edge)">
      <xsl:break select="map{'errors': scm:error(., 'Element ' || name() || ' is not allowed here')}"/>
    </xsl:when>
    <xsl:when test="empty($matching-edge)">
      <xsl:variable name="wildcard" select="scm:get($matching-wildcard-edge/@term)" as="element(scm:wildcard)?"/>
      <xsl:sequence select="scm:check-wildcard-match($containing-type, $wildcard, .)"/>
      <xsl:next-iteration>
        <xsl:with-param name="state" select="$states[@nr = $matching-wildcard-edge/@to]"/>
      </xsl:next-iteration>
    </xsl:when>
    <xsl:otherwise>
      <xsl:variable name="decl" select="scm:get($matching-edge/@term)" as="element(scm:element)"/>
      <xsl:apply-templates select="." mode="explicit-decl">
        <xsl:with-param name="decl" select="$decl"/>
      </xsl:apply-templates>
      <xsl:next-iteration>
        <xsl:with-param name="state" select="$states[@nr = $matching-edge/@to]"/>
      </xsl:next-iteration>
    </xsl:otherwise>
  </xsl:choose>
</xsl:iterate>
The way this code works is as follows:
The xsl:iterate instruction is new in XSLT 3.0. It is rather like xsl:for-each, except that it processes the selected items strictly in sequence; the code for processing one item can set parameters for processing the next item; and it is possible to break out of the loop early. The same effect could be achieved with a recursive template, but xsl:iterate is often easier to understand.
In this case we are iterating over the children of the element being validated.
There is a single parameter, the current state, which is initially set (by the calling code) to the state numbered 0.
The xsl:on-completion instruction is executed when we reach the end of the sequence of child elements. If the current state is a final state, we return nothing (meaning all is well, the input is valid). Otherwise we return a map containing an error value.
There are two kinds of transition possible in a given state: named element transitions, and wildcard transitions. We first find all the matching named element transitions (the schema compiler will have ensured there can be at most one) and all the matching wildcard transitions.
If both sets are empty, there is no legal transition for the current child element in this state, so we return an error value.
If there is a wildcard transition possible, but no named-element transition, then we check that the wildcard transition is really allowed and that the element is valid against the wildcard (this will take account of its processContents attribute), and then proceed to process the next child element in the state reached by this transition.
If there is a named-element transition possible, then we call xsl:apply-templates to check that the child element is valid against the required type for the named element, and then proceed to process the next child element in the state reached by this transition.
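The stepping pattern described in the list above can be tried out in isolation. The following is a minimal sketch, not the validator's actual code: the fsm, state and edge elements, the sample book input, and the plain-string results are all hypothetical stand-ins for the SCM structures and error maps used by the real validator, and namespaces and wildcards are ignored. It assumes an XSLT 3.0 processor that is invoked with the default initial template.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical machine for the content model (title, author+):
       state 0 goes to 1 on title, 1 goes to 2 on author,
       2 loops on author. Only state 2 is final. -->
  <xsl:variable name="fsm" as="element(fsm)">
    <fsm>
      <state nr="0"><edge term="title" to="1"/></state>
      <state nr="1"><edge term="author" to="2"/></state>
      <state nr="2" final="true"><edge term="author" to="2"/></state>
    </fsm>
  </xsl:variable>

  <!-- Hypothetical element to be validated -->
  <xsl:variable name="input" as="element(book)">
    <book><title/><author/><author/></book>
  </xsl:variable>

  <xsl:template name="xsl:initial-template">
    <xsl:iterate select="$input/*">
      <!-- The only loop parameter is the current state, starting at state 0 -->
      <xsl:param name="state" as="element(state)" select="$fsm/state[@nr = '0']"/>
      <!-- Runs once all children are consumed: the last state must be final -->
      <xsl:on-completion>
        <xsl:sequence select="if ($state/@final = 'true')
                              then 'valid'
                              else 'element content is incomplete'"/>
      </xsl:on-completion>
      <!-- Find the edge (if any) whose term matches the current child -->
      <xsl:variable name="edge" as="element(edge)?"
                    select="$state/edge[@term = local-name(current())]"/>
      <xsl:choose>
        <xsl:when test="empty($edge)">
          <!-- No legal transition: leave the loop early with an error -->
          <xsl:break select="'element ' || name() || ' is not allowed here'"/>
        </xsl:when>
        <xsl:otherwise>
          <!-- Follow the edge and process the next child in the new state -->
          <xsl:next-iteration>
            <xsl:with-param name="state" select="$fsm/state[@nr = $edge/@to]"/>
          </xsl:next-iteration>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:iterate>
  </xsl:template>

</xsl:stylesheet>
```

With the sample input the machine passes through states 0, 1, 2 and 2 again, so the stylesheet should report the content as valid. Removing both author children leaves the machine in the non-final state 1 and triggers the on-completion error, while adding a second title takes the xsl:break path.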
The actual logic is more complex than this. Firstly, we use a finite state machine with counters, to reduce the size of the finite state machine needed for a grammar such as <element name="book" minOccurs="100" maxOccurs="200"/>.
Secondly, XSD 1.1 introduces "open content", which allows elements matching a given wildcard to appear either (a) anywhere (interleaved content), or (b) at the end of the sequence (suffix content). The possibility of open content is not integrated into the finite state machine, but is instead handled by the validator as it arises. However, the basic principle is retained of stepping through the children using xsl:iterate to maintain the current state.
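To illustrate why counters keep the machine small, the sketch below checks an occurrence range by carrying a count as a second kind of xsl:iterate parameter, instead of expanding the range into one state per occurrence. This is only an assumption about the general idea, not the representation actually used in the SCM file or the validator: the min and max variables, the library input, and the string results are hypothetical, and the bounds are reduced to 2 and 3 to keep the example short.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <!-- Hypothetical occurrence bounds for the book particle -->
  <xsl:variable name="min" as="xs:integer" select="2"/>
  <xsl:variable name="max" as="xs:integer" select="3"/>

  <!-- Hypothetical input with one book too many -->
  <xsl:variable name="input" as="element(library)">
    <library><book/><book/><book/><book/></library>
  </xsl:variable>

  <xsl:template name="xsl:initial-template">
    <xsl:iterate select="$input/*">
      <!-- The counter replaces a chain of numbered states -->
      <xsl:param name="count" as="xs:integer" select="0"/>
      <!-- At the end of the children, the lower bound must have been reached -->
      <xsl:on-completion>
        <xsl:sequence select="if ($count ge $min)
                              then 'valid'
                              else 'too few book elements'"/>
      </xsl:on-completion>
      <xsl:choose>
        <xsl:when test="not(self::book)">
          <xsl:break select="'element ' || name() || ' is not allowed here'"/>
        </xsl:when>
        <xsl:when test="$count ge $max">
          <!-- The upper bound is already used up, so this book is one too many -->
          <xsl:break select="'too many book elements'"/>
        </xsl:when>
        <xsl:otherwise>
          <!-- Stay in the same logical state and increment the counter -->
          <xsl:next-iteration>
            <xsl:with-param name="count" select="$count + 1"/>
          </xsl:next-iteration>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:iterate>
  </xsl:template>

</xsl:stylesheet>
```

With four book children and an upper bound of 3, the loop breaks on the fourth child and reports too many book elements. The same loop handles bounds of 100 and 200 without any change in the number of states, which is the saving the counter approach is meant to provide.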