Assessment against Complex Types using Finite State Machines

As we've seen, the finite state machines used to evaluate a sequence of elements against the grammar rules for a complex type are constructed by the schema compiler and embedded in the SCM file that is used as input to the validator.

A simplified validator for a simple finite state machine could be written like this:

<xsl:iterate select="$node/*">
    <xsl:param name="state" select="$initial-state" as="element(scm:state)"/>
    <xsl:on-completion>
        <xsl:if test="not($state/@final = 'true')">
            <xsl:sequence select="map{'errors':
                                      scm:error($node, 'Element content is incomplete')}"/>
        </xsl:if>
    </xsl:on-completion>
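    <!-- Edge leaving the current state whose term is an element declaration
         with the same local name and namespace as the current child -->
    <xsl:variable name="matching-edge" as="element(scm:edge)?"
        select="$state/scm:edge[scm:get(@term)[@name = local-name(current())
                   and string(@targetNamespace) = namespace-uri(current())]]"/>
    <!-- Edge leaving the current state whose term is a wildcard that the
         current child matches -->
    <xsl:variable name="matching-wildcard-edge" as="element(scm:edge)?"
        select="$state/scm:edge[scm:get(@term)[self::scm:wildcard[
                                  scm:wildcard-matches($containing-type, .,
                                  current())]]]"/>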
    <xsl:choose>
        <!-- No edge accepts this child: report the error and stop iterating -->
        <xsl:when test="empty($matching-edge) and empty($matching-wildcard-edge)">
            <xsl:break select="map{'errors': scm:error(., 'Element ' || name()
                                                  || ' is not allowed here')}"/>
        </xsl:when>
        <xsl:when test="empty($matching-edge)">
            <xsl:variable name="wildcard" 
                          select="scm:get($matching-wildcard-edge/@term)" 
                          as="element(scm:wildcard)?"/>
            <xsl:sequence select="scm:check-wildcard-match($containing-type, 
                                  $wildcard, .)"/>
            <xsl:next-iteration>
                <xsl:with-param name="state" 
                                select="$states[@nr = 
                                        $matching-wildcard-edge/@to]"/>
            </xsl:next-iteration>
        </xsl:when>
        <xsl:otherwise>
            <xsl:variable name="decl" 
                          select="scm:get($matching-edge/@term)" 
                          as="element(scm:element)"/>
            <xsl:apply-templates select="." mode="explicit-decl">
                <xsl:with-param name="decl" select="$decl"/>
            </xsl:apply-templates>
            <xsl:next-iteration>
                <xsl:with-param name="state" 
                                select="$states[@nr = 
                                        $matching-edge/@to]"/>
            </xsl:next-iteration>
        </xsl:otherwise>
    </xsl:choose>
</xsl:iterate>

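The way this code works is as follows. The xsl:iterate instruction processes the element children of $node in document order, carrying the current state in the parameter $state, which starts at $initial-state. For each child, the code looks for an edge leaving the current state whose term is an element declaration with the same local name and namespace as the child, and separately for an edge whose term is a wildcard that the child matches. If neither exists, the child is not allowed at this point, so xsl:break reports the error and ends the iteration. Otherwise the child is validated, against the wildcard if only a wildcard edge matched or against the element declaration if an explicit edge matched (so an explicit declaration takes precedence over a wildcard), and xsl:next-iteration moves to the state identified by the edge's to attribute. When the children are exhausted, xsl:on-completion checks that the machine has reached a final state and reports incomplete content if it has not.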
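To make the path expressions in the stylesheet easier to follow, here is a sketch of the kind of state-machine data they appear to navigate, for the content model (title, author+). It is illustrative only: the names scm:state, scm:edge, scm:element, nr, final, term, to and name are taken from the code above, but the wrapper element, the namespace URI, the id attribute and the way scm:get resolves a term are assumptions made for this example.

```xml
<!-- Hypothetical fragment: wrapper element, namespace URI and @id are assumptions -->
<scm:fsm xmlns:scm="urn:example:scm">
  <!-- Content model: (title, author+) -->
  <scm:state nr="0">
    <scm:edge term="C1" to="1"/>   <!-- title expected first -->
  </scm:state>
  <scm:state nr="1">
    <scm:edge term="C2" to="2"/>   <!-- at least one author -->
  </scm:state>
  <scm:state nr="2" final="true">
    <scm:edge term="C2" to="2"/>   <!-- further authors loop on the final state -->
  </scm:state>
  <!-- Components that scm:get(@term) is assumed to return; with no
       targetNamespace attribute, string(@targetNamespace) is '' and so
       matches elements in no namespace -->
  <scm:element id="C1" name="title"/>
  <scm:element id="C2" name="author"/>
</scm:fsm>
```

With this data, the children title, author, author take the machine through states 0, 1, 2, 2 and finish in a final state. A lone title stops in state 1, which is not final, so the on-completion check reports incomplete content. An author appearing first finds no matching edge out of state 0 and triggers the "not allowed here" error.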

The actual logic is more complex than this. Firstly, we use a finite state machine with counters, which reduces the size of the machine needed for a grammar such as <element name="book" minOccurs="100" maxOccurs="200"/>. Secondly, XSD 1.1 introduces "open content", which permits elements matching a given wildcard to appear either (a) anywhere (interleaved content), or (b) at the end of the sequence (suffix content). Open content is not integrated into the finite state machine; instead, the validator handles it as it arises. However, the basic principle is retained: the validator steps through the children using xsl:iterate, which maintains the current state.
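As a rough illustration of why counters help, the following self-contained XSLT 3.0 stylesheet checks a single repeated particle by carrying an occurrence count through xsl:iterate, rather than using one state per permitted occurrence. It is a minimal sketch and not the validator's actual code: the parameters min and max, the small bounds used as defaults, and the plain-text result strings are all assumptions made for the example.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xsl:output method="text"/>

  <!-- Hypothetical occurrence bounds for the repeated element "book" -->
  <xsl:param name="min" as="xs:integer" select="2"/>
  <xsl:param name="max" as="xs:integer" select="3"/>

  <!-- Check the element children of the outermost element -->
  <xsl:template match="/*">
    <xsl:iterate select="*">
      <!-- The counter stands in for a long chain of states -->
      <xsl:param name="count" as="xs:integer" select="0"/>
      <xsl:on-completion>
        <!-- Reaching the end is acceptable only once the minimum is met -->
        <xsl:sequence select="if ($count lt $min) then 'incomplete' else 'valid'"/>
      </xsl:on-completion>
      <xsl:choose>
        <!-- Wrong element, or one occurrence too many -->
        <xsl:when test="not(self::book) or $count ge $max">
          <xsl:break select="'not allowed: ' || name()"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:next-iteration>
            <xsl:with-param name="count" select="$count + 1"/>
          </xsl:next-iteration>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:iterate>
  </xsl:template>

</xsl:stylesheet>
```

With the default bounds, an input whose outermost element has two or three book children yields valid, one book yields incomplete, and a fourth book yields not allowed: book. Raising the bounds to 100 and 200 changes only the two parameter values, whereas a machine without counters would need extra states for the additional occurrences.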