A SNAT-based code generator is driven by metadata enabling primitive operations, which serve as the functional building blocks of a document transformation. Each operation maps a node sequence to a value that is either another node sequence (context-propagator), an array of node sequences (context-distributor), or a string (context-atomizer). Does the central role played by node sequences mean that the SNAT model is restricted to XML data sources?
The apparent limitation can be overcome if we generalize the notion of a “node sequence” to mean a sequence of distinct items. The following table gives a few examples.
Table 8. Examples of a possible generalization of the node concept, where a "node" is any distinct item of which the resource is composed.
Source data media type | What is a "node"? |
---|---|
HTML | DOM node |
JSON | JSON item (object, array, string, number, boolean) |
CSV | table, row, cell |
SQL | db, table, row, column |
RDF | RDF node |
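To make the generalization concrete, the three kinds of primitive operation can be sketched over JSON items instead of XML nodes. This is only an illustrative sketch: the function names `children`, `distribute` and `atomize` are hypothetical and are not part of the SNAT model; they merely show that the three signatures (sequence to sequence, sequence to array of sequences, sequence to string) carry over unchanged.

```python
from typing import Any, List

# Assumption: a "node" is any JSON item as produced by Python's json module.
Item = Any


def children(items: List[Item]) -> List[Item]:
    """Context-propagator: item sequence -> item sequence (the child items)."""
    result: List[Item] = []
    for item in items:
        if isinstance(item, dict):
            result.extend(item.values())   # members of a JSON object
        elif isinstance(item, list):
            result.extend(item)            # members of a JSON array
    return result


def distribute(items: List[Item]) -> List[List[Item]]:
    """Context-distributor: item sequence -> array of item sequences.

    Here each input item becomes its own one-item sequence.
    """
    return [[item] for item in items]


def atomize(items: List[Item]) -> str:
    """Context-atomizer: item sequence -> string.

    Joins the atomic items and skips objects and arrays.
    """
    atoms = [str(i) for i in items if not isinstance(i, (dict, list))]
    return " ".join(atoms)


doc = {"order": {"id": 7, "tags": ["a", "b"]}}
order_members = children(children([doc]))   # the items 7 and ["a", "b"]
tags = children(order_members)              # the items "a" and "b"
tag_groups = distribute(tags)               # one sequence per tag
tag_text = atomize(tags)                    # a single string
```

The same three signatures could be implemented over table rows or RDF nodes; only the notion of "child item" would change.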
For each media type based on distinct items of information, a SNAT-based code generator may be defined, following a scheme of actions discussed below. Note, however, that for some media types (e.g. HTML, JSON, CSV) a simpler approach is to transform an XML representation of the original input, which is readily obtained with open-source products such as BaseX ([2]), whose extension functions parse non-XML resources into XML. This simple preprocessing step enables the reuse of code generators for XML-to-XML transformation (like the one described in the section called “Code generator SNAT”) for non-XML data sources.
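The preprocessing idea can be illustrated with a minimal sketch that turns CSV text into an XML tree using only the Python standard library. It is a stand-in for the kind of conversion a product like BaseX performs, not a reproduction of it: the function name `csv_to_xml` and the element names `csv`, `record` and `entry` are assumptions chosen for this example.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml(text: str) -> ET.Element:
    """Convert CSV text into an XML tree: one element per row, one per cell."""
    root = ET.Element("csv")                      # assumed root element name
    for row in csv.reader(io.StringIO(text)):
        record = ET.SubElement(root, "record")    # assumed row element name
        for cell in row:
            ET.SubElement(record, "entry").text = cell  # assumed cell element name
    return root


xml_root = csv_to_xml("name,qty\npen,3\n")

# Once the data is XML, ordinary node-based navigation applies.
cells = [entry.text for entry in xml_root.findall("./record/entry")]
serialized = ET.tostring(xml_root, encoding="unicode")
```

After this step the rows and cells of the table are ordinary XML nodes, so an XML-to-XML code generator can address them without knowing that the source was CSV.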