The following table describes the most important metadata attributes in terms of name and semantics. It also indicates the scope of each attribute, that is, the set of location nodes where the attribute may appear. The scope is defined by constraints on the location properties. The @src attribute, for example, appears only on locations with simple content, and @for-each, @group-by and @sort-by appear only on locations with a maximum cardinality greater than 1. By default, all attribute values are interpreted as XQuery expressions. Exceptions are enabled by special syntax rules not described here.
Table 4. The metadata model of a SNAT (Source Navigation Annotated Target tree). Each kind of metadata is modeled in terms of an attribute name, value semantics and attribute scope, which restricts the use of this attribute to locations with certain properties. The column Gen? indicates whether the attribute is generated as part of the initial SNAT tree generated from the target XSD (Y) or added to the generated SNAT tree by hand if required (N).
Attribute name | Location properties | Meaning | Example | Gen? |
---|---|---|---|---|
alt | Location of an element or attribute which is optional | XPath expression, evaluated if @src or @ctxt yields the empty sequence; the value of @alt is used as if yielded by @src or @ctxt, respectively | "#UNKNOWN" | Y |
atom | Location of a simple content element or attribute | XPath expression, evaluated in a context binding variable $v to the value of an accompanying @src; if specified, the value of @atom, rather than the concatenated string values of @src, is used as element content or attribute value | substring($v, 1, 3) | N |
case | Child of a choice group descriptor (z:_choice_) | XPath expression; the selected choice branch is the first child of z:_choice_ whose @case has a true effective Boolean value | @Success eq "false" | Y |
ctxt | Location of a complex element with maximum cardinality equal to 1 | XPath expression; its value is used as the propagated source context | PubInfo/Address | Y |
dflt | Location of a simple content element or attribute which is mandatory | XPath expression, evaluated if @src yields the empty sequence; the value of @dflt is used as if yielded by @src | "?" | Y |
for-each | Location of an element with maximum cardinality > 1 | XPath expression; its value is used as the propagated source context; if not accompanied by @group-by, one target instance per value item is constructed, otherwise one target instance per group of value items | books/book | Y |
group-by | Location of an element with maximum cardinality > 1 | XPath expression, evaluated in the context of each value item yielded by @for-each; for each group of items with equal @group-by, one target instance is instantiated | ../author/surName | N |
if | Location of a complex element which is optional | XPath expression, evaluated in the context of the value of an accompanying @ctxt; if specified, the element is only instantiated if the effective Boolean value of @if is true | .//(au, ed, py) | N |
sort-by, sort-by2, sort-by3 | Location of a complex element with maximum cardinality > 1 | XPath expressions, optionally followed by the string "DESCENDING", used to control the order of value items received from @for-each | author/surName | N |
src | Location of a simple content element or attribute | XPath expression; selects the nodes providing the information used as element content or attribute value | author/surName | Y |
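The scope constraints in the "Location properties" column can be read as predicates over a location. The following Python sketch transcribes them under stated assumptions: the `Location` class, its field names, and the `out_of_scope` helper are hypothetical and not part of the SNAT tooling, and "complex element" is approximated as "not simple content".

```python
from dataclasses import dataclass


@dataclass
class Location:
    """Hypothetical summary of the location properties used in Table 4."""
    simple_content: bool           # simple content element or attribute
    optional: bool                 # minimum cardinality is 0
    max_card: float                # maximum cardinality; float("inf") if unbounded
    is_choice_child: bool = False  # child of a z:_choice_ descriptor


def _complex(loc):
    # Assumption: every location that is not simple content counts as complex.
    return not loc.simple_content


# One predicate per attribute, transcribed from the "Location properties" column.
SCOPE = {
    "alt": lambda loc: loc.optional,
    "atom": lambda loc: loc.simple_content,
    "case": lambda loc: loc.is_choice_child,
    "ctxt": lambda loc: _complex(loc) and loc.max_card == 1,
    "dflt": lambda loc: loc.simple_content and not loc.optional,
    "for-each": lambda loc: loc.max_card > 1,
    "group-by": lambda loc: loc.max_card > 1,
    "if": lambda loc: _complex(loc) and loc.optional,
    "sort-by": lambda loc: _complex(loc) and loc.max_card > 1,
    "sort-by2": lambda loc: _complex(loc) and loc.max_card > 1,
    "sort-by3": lambda loc: _complex(loc) and loc.max_card > 1,
    "src": lambda loc: loc.simple_content,
}


def out_of_scope(attribute_names, loc):
    """Return the listed attributes that may not appear on this location."""
    return [name for name in attribute_names
            if name in SCOPE and not SCOPE[name](loc)]
```

For a mandatory simple content element with maximum cardinality 1, for instance, @src and @dflt are in scope while @for-each is not.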
An example SNAT and the transformer code generated from it are shown by the following two listings.
Example 2. SNAT document, defining the transformation described as an introductory example.
```xml
<z:snats xmlns:z="http://www.xsdplus.org/ns/structure">
  <z:prolog/>
  <z:snat>
    <publications ctxt="books">
      <z:_attributes_>
        <updatedAt src="@lastUpdate" alt=""/>
      </z:_attributes_>
      <publication for-each="book">
        <z:_attributes_>
          <publicationYear src="py" alt=""/>
        </z:_attributes_>
        <isbn src="@isbn" dflt="'#MISSING'"/>
        <title src="title" dflt=""/>
        <creator for-each="author">
          <creatorRole src="'Author'" dflt=""/>
          <creatorName src="." dflt=""/>
        </creator>
      </publication>
    </publications>
  </z:snat>
</z:snats>
```
Example 3. XQuery code, generated from the SNAT tree shown in the preceding listing.
```xquery
let $c := *
let $c := $c/books
return
  <publications>{
    let $v := $c/@lastUpdate
    return
      if (empty($v)) then () else attribute updatedAt {$v},
    for $c in $c/book
    return
      <publication>{
        let $v := $c/py
        return
          if (empty($v)) then () else attribute publicationYear {$v/string()},
        <isbn>{
          let $v := $c/@isbn
          return
            if (exists($v)) then $v/string() else '#MISSING'
        }</isbn>,
        <title>{$c/title/string()}</title>,
        for $c in $c/author
        return
          <creator>{
            <creatorRole>{'Author'}</creatorRole>,
            <creatorName>{$c/string()}</creatorName>
          }</creator>
      }</publication>
  }</publications>
```
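To make the effect of the generated query concrete, the following Python sketch mirrors its logic with the standard library's `xml.etree.ElementTree` and applies it to a small input. The sample document and the `transform` function are invented for illustration and are not produced by the SNAT tooling. The sketch starts at the `books` element rather than at its parent and assumes at most one `title` and one `py` per book.

```python
import xml.etree.ElementTree as ET

# Hypothetical source data; the second book has neither @isbn nor py.
SAMPLE = """<books lastUpdate="2016-01-01">
  <book isbn="123">
    <title>T1</title><py>2001</py>
    <author>A. One</author><author>B. Two</author>
  </book>
  <book>
    <title>T2</title>
    <author>C. Three</author>
  </book>
</books>"""


def transform(books):
    """Build the target tree from a `books` element, step by step."""
    pubs = ET.Element("publications")
    # src="@lastUpdate" alt="": the attribute is omitted if the source is empty.
    if "lastUpdate" in books.attrib:
        pubs.set("updatedAt", books.get("lastUpdate"))
    # for-each="book": one target instance per source item.
    for book in books.findall("book"):
        pub = ET.SubElement(pubs, "publication")
        py = book.find("py")
        if py is not None:
            pub.set("publicationYear", py.text or "")
        # dflt="'#MISSING'": fallback value when @isbn is absent.
        ET.SubElement(pub, "isbn").text = book.get("isbn", "#MISSING")
        ET.SubElement(pub, "title").text = book.findtext("title", default="")
        # for-each="author": nested distribution of the source context.
        for author in book.findall("author"):
            creator = ET.SubElement(pub, "creator")
            ET.SubElement(creator, "creatorRole").text = "Author"
            ET.SubElement(creator, "creatorName").text = "".join(author.itertext())
    return pubs


result = transform(ET.fromstring(SAMPLE))
print(ET.tostring(result, encoding="unicode"))
```

In this sample the second publication receives no publicationYear attribute, because @alt is empty, and its isbn element contains the @dflt value.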
The code may be viewed as assembled from the primitive operations reflecting the metadata values. All data values, for example, are obtained by evaluating the expressions found in attributes @src, @alt and @dflt. Similarly, the propagation and distribution of the source context is guided by attributes @ctxt and @for-each. The following table summarizes the rules for deriving the implementations of the primitive operations from metadata item values.
Table 5. The primitive operations context-propagator, context-distributor and context-atomizer as implied by metadata values. Notation: v@foo is the value of the expression supplied by attribute @foo; "if v@foo" means "if the value is not the empty sequence"; "if @foo" means "if attribute @foo exists". The context-distributor is described informally by equating subsets of the propagated context (given by v@for-each or v@ctxt) with the source context (SC) of a distinct target item (TI).
Condition | Value | Role |
---|---|---|
Context-propagator | | |
if @for-each: | v@for-each | |
if (@ctxt and @alt): | if (v@ctxt) then v@ctxt else v@alt | |
if @ctxt: | v@ctxt | |
if (@src and @alt): | if (v@src) then v@src else v@alt | |
else: | v@src | |
Context-distributor | | |
if (@for-each and @group-by): | each group of items in v@for-each: | SC of one distinct TI |
if @for-each: | each item in v@for-each: | SC of one distinct TI |
else: | all items in v@ctxt: | SC of the only TI |
Context-atomizer | | |
if (@atom): | v@atom | |
else: | string-join($source-context, " ") | |
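The rules of Table 5 can be modeled as three small functions. The Python sketch below is an illustration under stated assumptions, not the generator's implementation: metadata is represented as a dictionary mapping attribute names to Python callables that stand in for XPath evaluation, each callable returns a list standing in for an XPath sequence, and the @group-by callable returns a hashable key. The names `propagate`, `distribute` and `atomize` are hypothetical.

```python
def propagate(meta, ctx):
    """Context-propagator: apply the first matching rule of Table 5."""
    if "for-each" in meta:
        return meta["for-each"](ctx)
    name = "ctxt" if "ctxt" in meta else "src"
    value = meta[name](ctx)
    # @alt takes over only if the primary expression yields the empty sequence.
    if not value and "alt" in meta:
        return meta["alt"](ctx)
    return value


def distribute(meta, propagated):
    """Context-distributor: split the propagated context into one source
    context per target item."""
    if "for-each" in meta and "group-by" in meta:
        groups = {}  # dicts keep insertion order, so groups appear in first-seen order
        for item in propagated:
            groups.setdefault(meta["group-by"](item), []).append(item)
        return list(groups.values())
    if "for-each" in meta:
        return [[item] for item in propagated]
    return [propagated]  # all items form the source context of the only target item


def atomize(meta, source_context):
    """Context-atomizer: @atom if present, else the space-joined string values."""
    if "atom" in meta:
        return meta["atom"](source_context)
    return " ".join(str(item) for item in source_context)


# Usage with hypothetical data: one target item per distinct author.
books = [
    {"title": "A", "author": "Smith"},
    {"title": "B", "author": "Jones"},
    {"title": "C", "author": "Smith"},
]
meta = {"for-each": lambda ctx: ctx["books"], "group-by": lambda b: b["author"]}
groups = distribute(meta, propagate(meta, {"books": books}))
print([[b["title"] for b in g] for g in groups])
```

The sketch follows the table literally, so @dflt, @if and the @sort-by attributes are deliberately left out.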
By now we have decomposed document transformation into three primitive operations, and we have set up a model for deriving their implementation from a small set of metadata. In principle, the expressiveness of this metadata language is sufficient for describing arbitrary transformations. However, the benefits of the approach are quickly lost if the expressions supplied as metadata become very complex, or if the decomposition into independent expressions entails blunt repetition of non-trivial expressions. Therefore we extend the model by a few advanced features addressing these issues.