As discussed in the previous section, document construction can be regarded as a sequence of basic decisions answering two questions: how many instances to construct, and which character sequence to use (as simple element content or attribute value). Consider, for instance, the <creator> element. As its minimum and maximum cardinality differ (0 and unbounded, respectively), we must decide how many elements to instantiate. However, as there may be more than one <publication> to be constructed, the answer clearly depends on which <publication> is considered – the question “how many instances?” cannot be answered globally, but only locally, for the <creator> children of a particular <publication> element. In fact, we must ask and answer the question anew for each <publication> appearing in the target document. Similarly, the string value to be used for @publicationYear must be determined repeatedly, once for each <publication> element in the transformation result.
To generalize, the basic decisions concerning the instantiation of a model node are taken in the context of an individual parent node. To put it differently: once the parent node has been identified, the decisions can be made, which means that one subset of the instances of the model node can be constructed, comprising all instances found under the given parent. It is important to remember that the complete set of instances of a given model node is simply the union of the subsets obtained for the individual instances of the parent model node. This enables a decomposition of the highly complex operation of document construction into primitive operations – small building blocks of processing which implement the basic decisions. Defining a local instantiation of a model node as the subset of its instances sharing a particular parent node, we obtain the following picture:
Document construction = instantiation of a document model
Document model = a set of model nodes
Instantiation of a model node = a set of local instantiations of a model node
Local instantiation of a model node = result of a composite operation “local construction”
Local construction = composition of primitive operations
Primitive operation = implementation of a basic decision
This picture amounts to a decomposition of document construction into primitive operations. However, it gives no hint as to how those operations can be implemented. Its practical value depends on an extension which enables their implementation. In particular, if the implementation can be derived from model and configuration data, code generation becomes feasible.
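The decomposition can be illustrated by a minimal Python sketch. It is not an implementation derived from model and configuration data, only a hand-written illustration. The source data layout, the root element name `publications` and all function names are assumptions made for this example. Each `decide_*` function stands for one primitive operation, and `construct_creators_locally` is the local construction of <creator> under one particular <publication>.

```python
import xml.etree.ElementTree as ET

# Hypothetical source data: one entry per <publication> to be constructed.
SOURCE = [
    {"year": "2009", "authors": ["Smith", "Jones"]},
    {"year": "2011", "authors": []},
]

def decide_creator_count(pub_source):
    """Primitive operation: how many <creator> instances under this parent?"""
    return len(pub_source["authors"])

def decide_creator_text(pub_source, index):
    """Primitive operation: which character sequence for the index-th <creator>?"""
    return pub_source["authors"][index]

def decide_publication_year(pub_source):
    """Primitive operation: which string value for @publicationYear?"""
    return pub_source["year"]

def construct_creators_locally(parent, pub_source):
    """Local construction: a composition of primitive operations.

    Returns the local instantiation of <creator>, that is, the instances
    sharing the given parent node.
    """
    local_instantiation = []
    for index in range(decide_creator_count(pub_source)):
        creator = ET.SubElement(parent, "creator")
        creator.text = decide_creator_text(pub_source, index)
        local_instantiation.append(creator)
    return local_instantiation

def construct_document(source):
    """Builds the target document and collects all <creator> instances."""
    root = ET.Element("publications")  # hypothetical root element name
    all_creators = []
    for pub_source in source:
        publication = ET.SubElement(
            root, "publication",
            publicationYear=decide_publication_year(pub_source))
        # The complete instantiation is the union of the local instantiations.
        all_creators.extend(construct_creators_locally(publication, pub_source))
    return root, all_creators

root, all_creators = construct_document(SOURCE)
```

Note that the decisions for <creator> and @publicationYear are repeated once per <publication>, and that the second publication legitimately receives an empty local instantiation.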
Focusing on document construction by transformation of a source document, we can translate the patterns observed in the previous section into formal concepts. Table 1 showed how the basic decisions are driven by a navigation of the source document. This navigation was described as the mapping of a context of source nodes to another set of source nodes which enables the decision. A formal definition of primitive operations is prepared by defining input, output and semantics of these mappings. Besides, having realized the importance of parent node identity, we introduce the notion of a node path as a means for expressing node identity (see Table 2, “Some concepts facilitating the decomposition of document to document transformation into primitive operations.”).
Table 2. Some concepts facilitating the decomposition of document to document transformation into primitive operations.
Name | Abbreviation | Property of … | Semantics |
---|---|---|---|
Node path | NP | Instance node | Expresses the identity of the node |
Source context | SC | Node path | Given a node path, a set of source nodes enabling the construction of the target node identified by the node path |
Propagated source context | PSCparent-instance | Location | Given a location and the node path of an instance of the parent location: a set of source nodes enabling the construction of the location instances which are children of the node identified by the node path |
Source context array | SCAparent-instance | Location | Given a location and the node path of an instance of the parent location: an array with members being sets of source nodes; for each member, one location instance is constructed, with a source context equal to the value of the member |
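
To make the concepts of Table 2 more tangible, the following Python sketch models them as plain data types. All names and representations are assumptions chosen for illustration: a node path is modelled as a tuple of (name, position) steps, source nodes are represented by strings, and a set of source nodes is kept as an ordered tuple. The rule that turns a propagated source context into a source context array (here: one member per source node) is also only one hypothetical possibility.

```python
from typing import Dict, List, Tuple

SourceNode = str                          # stand-in for a real source node
SourceContext = Tuple[SourceNode, ...]    # a set of source nodes (ordered here)
NodePath = Tuple[Tuple[str, int], ...]    # steps of (location name, position)

def child_path(parent: NodePath, location: str, position: int) -> NodePath:
    """Node path (NP) of the position-th instance of a location under a parent."""
    return parent + ((location, position),)

def split_one_per_node(psc: SourceContext) -> List[SourceContext]:
    """Hypothetical rule deriving a source context array (SCA) from a
    propagated source context (PSC): one array member per source node."""
    return [(node,) for node in psc]

def instantiate_location(
    location: str, parent: NodePath, sca: List[SourceContext]
) -> Dict[NodePath, SourceContext]:
    """For each SCA member, one location instance is constructed; its
    source context (SC) is equal to the value of the member."""
    return {
        child_path(parent, location, position): member
        for position, member in enumerate(sca, start=1)
    }

# Node path of one particular <publication> instance (assumed document layout).
parent = (("publications", 1), ("publication", 2))

# PSC of the location "creator" for this parent instance (assumed source nodes).
psc = ("author[1]", "author[2]")

sca = split_one_per_node(psc)
sc_by_path = instantiate_location("creator", parent, sca)
```

In this sketch, the length of the source context array answers the question "how many instances?", while each member supplies the source context from which the content of one instance can be determined.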