Metadata item model

The following table describes the most important metadata attributes in terms of name and semantics. It also indicates the scope of each attribute, which is the set of location nodes where the attribute may appear. The scope is defined by constraints imposed on the location properties. The @src item, for example, appears only on locations with simple content, and @for-each, @group-by and @sort-by appear only on locations with a maximum cardinality greater than 1. By default, all attribute values are interpreted as XQuery expressions. Exceptions are enabled by special syntax rules not described here.

Table 4. The metadata model of a SNAT (Source Navigation Annotated Target tree). Each kind of metadata is modeled in terms of an attribute name, value semantics and attribute scope, which restricts the use of this attribute to locations with certain properties. The column Gen? indicates whether the attribute is generated as part of the initial SNAT tree generated from the target XSD (Y) or added to the generated SNAT tree by hand if required (N).

| Attribute name | Location properties | Meaning | Example | Gen? |
|---|---|---|---|---|
| alt | Location of an element or attribute which is optional | XPath expression, evaluated if @src or @ctxt yields the empty sequence; the value of @alt is used as if yielded by @src or @ctxt, respectively | "#UNKNOWN" | Y |
| atom | Location of a simple content element or attribute | XPath expression, evaluated in a context binding variable $v to the value of an accompanying @src; if specified, the value of @atom, rather than the concatenated string values of @src, is used as element content or attribute value | substring($v, 1, 3) | N |
| case | Child of a choice group descriptor (z:_choice_) | XPath expression; the selected choice branch is the first child of z:_choice_ whose @case has a true effective Boolean value | @Success eq "false" | Y |
| ctxt | Location of a complex element with maximum cardinality equal to 1 | XPath expression; its value is used as the propagated source context | PubInfo/Address | Y |
| dflt | Location of a simple content element or attribute which is mandatory | XPath expression, evaluated if @src yields the empty sequence; the value of @dflt is used as if yielded by @src | "?" | Y |
| for-each | Location of an element with maximum cardinality > 1 | XPath expression; its value is used as the propagated source context; if not accompanied by @group-by, one target instance per value item is constructed, otherwise one target instance per group of value items | books/book | Y |
| group-by | Location of an element with maximum cardinality > 1 | XPath expression, evaluated in the context of each value item yielded by @for-each; for each group of items with equal @group-by, one target instance is instantiated | ../author/surName | N |
| if | Location of a complex element which is optional | XPath expression, evaluated in the context of the value of an accompanying @ctxt; if specified, the element is only instantiated if the effective Boolean value of @if is true | .//(au, ed, py) | N |
| sort-by, sort-by2, sort-by3 | Location of a complex element with maximum cardinality > 1 | XPath expressions, optionally followed by the string “DESCENDING”, used to control the order of value items received from @for-each | author/surName | N |
| src | Location of a simple content element or attribute | XPath expression; selects the nodes providing the information used as element content or attribute value | author/surName | Y |
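
The scope column of Table 4 can be read as a set of predicates over location properties. The following Python sketch transcribes those predicates to make the constraints explicit. It is an illustration only: the `Location` record, its field names, and the `allowed_attributes` helper are assumptions introduced here, not part of the SNAT tooling, and the sample locations are merely modeled on elements of the kind used in the examples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    """Hypothetical record of the location properties referred to in Table 4."""
    is_attribute: bool       # True for an attribute location, False for an element
    simple_content: bool     # attributes and simple content elements
    optional: bool           # minimum cardinality 0
    max_card: int            # maximum cardinality; any value > 1 means repeatable
    in_choice: bool = False  # child of a z:_choice_ descriptor


def _complex(loc):
    """A complex element: an element location without simple content."""
    return not loc.is_attribute and not loc.simple_content


# One predicate per attribute name, transcribed from the scope column.
SCOPE = {
    "alt": lambda loc: loc.optional,
    "atom": lambda loc: loc.simple_content,
    "case": lambda loc: loc.in_choice,
    "ctxt": lambda loc: _complex(loc) and loc.max_card == 1,
    "dflt": lambda loc: loc.simple_content and not loc.optional,
    "for-each": lambda loc: not loc.is_attribute and loc.max_card > 1,
    "group-by": lambda loc: not loc.is_attribute and loc.max_card > 1,
    "if": lambda loc: _complex(loc) and loc.optional,
    "sort-by": lambda loc: _complex(loc) and loc.max_card > 1,
    "src": lambda loc: loc.simple_content,
}


def allowed_attributes(loc):
    """Return the metadata attribute names whose scope admits this location."""
    return sorted(name for name, admits in SCOPE.items() if admits(loc))


# A simple content element, assumed mandatory and non-repeatable.
simple_mandatory = Location(is_attribute=False, simple_content=True,
                            optional=False, max_card=1)
# A complex element, assumed mandatory and repeatable.
complex_repeatable = Location(is_attribute=False, simple_content=False,
                              optional=False, max_card=2)

print(allowed_attributes(simple_mandatory))    # ['atom', 'dflt', 'src']
print(allowed_attributes(complex_repeatable))  # ['for-each', 'group-by', 'sort-by']
```

Keeping the scope rules in a single lookup table mirrors the way Table 4 states them, so a change to one row affects exactly one predicate.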

An example SNAT and the transformer code generated from it are shown in the following two listings.

Example 2. SNAT document, defining the transformation described as an introductory example.

<z:snats xmlns:z="http://www.xsdplus.org/ns/structure">
  <z:prolog/>    
  <z:snat>
    <publications ctxt="books">
      <z:_attributes_>
        <updatedAt src="@lastUpdate" alt=""/>
      </z:_attributes_>
      <publication for-each="book">
        <z:_attributes_>
          <publicationYear src="py" alt=""/>
        </z:_attributes_>
        <isbn src="@isbn" dflt="'#MISSING'"/>
        <title src="title" dflt=""/>
        <creator for-each="author">
          <creatorRole src="'Author'" dflt=""/>
          <creatorName src="." dflt=""/>
        </creator>
      </publication>
    </publications>
  </z:snat>
</z:snats>                    
                

Example 3. XQuery code, generated from the SNAT tree shown in the preceding listing.

let $c := *
let $c := $c/books
return
<publications>{
   let $v := $c/@lastUpdate
   return
      if (empty($v)) then () else
      attribute updatedAt {$v},
   for $c in $c/book
   return
      <publication>{
         let $v := $c/py
         return
            if (empty($v)) then () else
            attribute publicationYear {$v/string()},
         <isbn>{
            let $v := $c/@isbn
            return
               if (exists($v)) then $v/string() else '#MISSING'
         }</isbn>,
         <title>{$c/title/string()}</title>,
         for $c in $c/author
         return
            <creator>{
               <creatorRole>{'Author'}</creatorRole>,
               <creatorName>{$c/string()}</creatorName>
            }</creator>
      }</publication>
}</publications>                    
                

The code may be viewed as assembled from primitive operations that reflect the metadata values. All data values, for example, are obtained by evaluating the expressions found in attributes @src, @alt and @dflt. Similarly, the propagation and distribution of the source context is guided by attributes @ctxt and @for-each. The following table summarizes the rules for deriving the implementations of the primitive operations from metadata item values.

Table 5. The primitive operations context-propagator, context-distributor and context-atomizer as implied by metadata values. Notation: v@foo is the value of the expression supplied by attribute @foo; "if v@foo" means "if the value is not the empty sequence"; "if @foo" means "if attribute @foo exists". The context-distributor is described informally by equating subsets of the propagated context (given by v@for-each or v@ctxt) with the source context (SC) of a distinct target item (TI).

Context-propagator
    if @for-each: v@for-each
    if (@ctxt and @alt): if (v@ctxt) then v@ctxt else v@alt
    if @ctxt: v@ctxt
    if (@src and @alt): if (v@src) then v@src else v@alt
    else: v@src

Context-distributor
    if (@for-each and @group-by): each group of items in v@for-each: SC of one distinct TI
    if @for-each: each item in v@for-each: SC of one distinct TI
    else: all items in v@ctxt: SC of the only TI

Context-atomizer
    if (@atom): v@atom
    else: string-join($source-context, " ")
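
The rules of Table 5 can be sketched as three small functions. The Python below is a minimal model under stated assumptions, not the generated XQuery: metadata expressions are represented as Python callables that take a list of items and return a list (a hypothetical stand-in for XPath evaluation), the empty sequence is the empty list, and groups are emitted in first-seen order, which is a choice made here for determinism. @dflt is left out because Table 5 does not list it.

```python
def propagate(meta, ctx):
    """Context-propagator: the first matching rule of Table 5 wins."""
    if "for-each" in meta:
        return meta["for-each"](ctx)
    # Otherwise the primary expression is @ctxt if present, else @src;
    # a location carrying neither raises KeyError in this sketch.
    primary = "ctxt" if "ctxt" in meta else "src"
    value = meta[primary](ctx)
    if not value and "alt" in meta:
        # Empty sequence: fall back to the value of @alt.
        return meta["alt"](ctx)
    return value


def distribute(meta, propagated):
    """Context-distributor: one source context (SC) per target item (TI)."""
    if "for-each" in meta and "group-by" in meta:
        groups = {}  # dicts preserve insertion order
        for item in propagated:
            # @group-by is evaluated in the context of each single item.
            key = tuple(meta["group-by"]([item]))
            groups.setdefault(key, []).append(item)
        return list(groups.values())
    if "for-each" in meta:
        return [[item] for item in propagated]
    return [propagated]  # all items form the SC of the only TI


def atomize(meta, source_context):
    """Context-atomizer: @atom if present, else space-joined string values."""
    if "atom" in meta:
        return meta["atom"](source_context)
    return " ".join(str(item) for item in source_context)


# Illustrative data; the dictionaries stand in for source nodes.
books = [
    {"title": "A", "surName": "Smith"},
    {"title": "B", "surName": "Jones"},
    {"title": "C", "surName": "Smith"},
]
meta = {
    "for-each": lambda ctx: ctx,
    "group-by": lambda ctx: [b["surName"] for b in ctx],
}
groups = distribute(meta, propagate(meta, books))
print([[b["title"] for b in g] for g in groups])         # [['A', 'C'], ['B']]
print(atomize({"atom": lambda v: v[0][:3]}, ["Smith"]))  # Smi
print(atomize({}, ["Smith", "Jones"]))                   # Smith Jones
```

The `atom` callable in the usage lines plays the role of the example expression substring($v, 1, 3) from Table 4, with the argument standing in for $v.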

We have now decomposed document transformation into three primitive operations, and we have set up a model for deriving their implementation from a small set of metadata. In principle, the expressiveness of this metadata language is sufficient for describing arbitrary transformations. However, the benefits of the approach are quickly lost if the expressions supplied as metadata become very complex, or if the decomposition into independent expressions entails blunt repetition of non-trivial expressions. Therefore we extend the model by a few advanced features addressing these issues.