XSLT 3.0, together with its accompanying specifications such as XPath 3.1, introduced support for processing and generating JSON alongside XML. The new features have proved useful, but they have known limitations.
Saxonica, for example, uses the JSON capabilities in XSLT 3.0 when processing
new online orders from customers for the Saxon product. We use a third-party service,
Ecwid, that supplies details of new orders in JSON format, and we use an XSLT application
to process each order, add the details to our XML orders database, and generate license
keys and email notifications to the customer. The application uses XForms and SaxonJS. It
pulls the JSON information from Ecwid using a call on fn:json-doc with
an appropriate URI, and then extracts the required data using path expressions such as
<xsl:variable name="subscriptionOption" select="$items?items?1?recurringChargeSettings" as="map(*)?" />
The JSON structure is straightforward, and the features in XSLT 3.0 and XPath 3.1 are more than adequate to handle it. [3]
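As a minimal sketch of this style of access, the following XSLT 3.0 stylesheet parses a small inline payload and navigates it with the lookup operator. The payload is invented for illustration: only the items array and the recurringChargeSettings entry are taken from the expression above, while the sku, quantity, and interval fields and the output element names are assumptions, not the actual Ecwid structure. The call on parse-json stands in for fn:json-doc, which would fetch and parse the document from a URI in one step.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- Hypothetical order payload; field names other than
       "items" and "recurringChargeSettings" are assumptions. -->
  <xsl:variable name="sample" as="xs:string">
    {"items": [{"sku": "EE-1", "quantity": 2,
                "recurringChargeSettings": {"interval": "YEAR"}}]}
  </xsl:variable>

  <xsl:template name="xsl:initial-template">
    <!-- parse-json is used here in place of json-doc($uri) -->
    <xsl:variable name="order" select="parse-json($sample)" as="map(*)"/>
    <!-- ?items selects the array; ?1 selects its first member -->
    <xsl:variable name="first" select="$order?items?1" as="map(*)"/>
    <!-- An absent key yields the empty sequence, hence map(*)? -->
    <xsl:variable name="subscriptionOption"
                  select="$first?recurringChargeSettings" as="map(*)?"/>
    <order-line sku="{$first?sku}" quantity="{$first?quantity}">
      <xsl:if test="exists($subscriptionOption)">
        <subscription interval="{$subscriptionOption?interval}"/>
      </xsl:if>
    </order-line>
  </xsl:template>

</xsl:stylesheet>
```

Declaring the variable as map(*)? rather than map(*) is what lets the same expression serve both orders that carry a subscription and orders that do not.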
But the capabilities of XSLT for processing JSON are more limited than the capabilities for processing XML. One of these limitations, the one addressed in this paper, concerns the ability to transform JSON using XSLT's quintessential processing model: rule-based recursive-descent transformation using template rules.
A project is currently underway, informally known as QT4, to define 4.0 versions
of the XSLT, XPath, and XQuery languages. This project has been set up as a W3C Community Group
and meets weekly to discuss and agree proposed changes to the specifications. The activity
can be tracked online at https://qt4cg.org/ and of course anyone is welcome
to participate. To date over 500 changes to the specifications have been accepted, and
most of these have been implemented in Saxon and/or BaseX; more than 12,000 test cases
have been added to the XQuery test suite alone, on top of the 32,000 test cases already
available for the 3.1 specifications [4].
But I was concerned that we hadn't really tackled or solved the issues involved in recursive-descent transformation. Back in 2016, before XSLT 3.0 was even finalised, I published a paper at XML Prague [Kay 2016] giving a couple of worked examples of JSON transformations using XSLT 3.0, coming to the rather unhappy conclusion that they were best tackled by converting the JSON to XML, transforming the XML, and then converting the XML back to JSON. I returned to these examples in a Balisage paper in 2022 [Kay 2022], where I showed that these two particular problems could be tackled much more easily using new features proposed for XSLT 4.0; however, I remained uneasy that neither of the two problems really featured the recursive-descent processing paradigm.
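For readers unfamiliar with the convert-transform-convert approach, here is a minimal XSLT 3.0 sketch of it using the standard json-to-xml and xml-to-json functions. The input document and the rule applied to it (doubling any number whose key is "price") are hypothetical, chosen only to show the shape of the technique; they are not the worked examples from either paper.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:j="http://www.w3.org/2005/xpath-functions"
    exclude-result-prefixes="xs j">

  <xsl:output method="text"/>

  <!-- Hypothetical input; any JSON text could be supplied instead -->
  <xsl:param name="input" as="xs:string">{"name": "widget", "price": 10, "tags": ["a", "b"]}</xsl:param>

  <!-- Anything not matched by an explicit rule is copied unchanged -->
  <xsl:mode name="rewrite" on-no-match="shallow-copy"/>

  <!-- Illustrative rule: double every number whose key is "price" -->
  <xsl:template match="j:number[@key = 'price']" mode="rewrite">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:value-of select="number(.) * 2"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template name="xsl:initial-template">
    <!-- Step 1: JSON text to its XML representation; step 2: template rules -->
    <xsl:variable name="transformed">
      <xsl:apply-templates select="json-to-xml($input)" mode="rewrite"/>
    </xsl:variable>
    <!-- Step 3: serialize the transformed XML back to JSON text -->
    <xsl:value-of select="xml-to-json($transformed)"/>
  </xsl:template>

</xsl:stylesheet>
```

The template rules here match elements such as j:number and attributes such as @key in the XML representation, not the JSON values themselves, which is why the approach works but feels indirect.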
So I resolved to conduct a case study in which I would select a realistic application
in which recursive-descent rule-based transformation of JSON input was a requirement, and
use this application to test the usability of the XSLT 4.0 specifications in their current
state, and propose enhancements where they were found to be necessary. This paper summarises
the conclusions of that study. A blow-by-blow account containing contemporaneous notes of
the tasks undertaken can be found at https://github.com/qt4cg/qtspecs/issues/1786;
this paper focuses more on the final conclusions, and ignores some of the avenues I followed
that produced no useful insights.
[3] More details of this application can be found at [Delpratt and Lockett 2017]. At the time of that paper the Ecwid data feed was plain text rather than JSON, but the paper does describe some other ways in which the application uses JSON internally.
[4] Data obtained, naturally, using Saxon and XQuery.