Serializing the parse tree

In the real transpiler, the final stage of processing is to take each of the XML (now JSON) documents representing the parse tree of a module, and, with the aid of information in the digest file, to generate corresponding C# code. This combines two tasks: handling any differences between Java and C#, and then serializing the result (with sufficient indentation and spacing to make it legible, since we're going to need to debug it).

For the sake of the case study, I decided to skip the business logic of Java to C# conversion, and simply re-serialize the parse tree as Java code. This mirrored the development approach I had used for the transpiler, where I first wrote template rules to convert the parse tree back to Java, and then incrementally modified the XSLT to handle cases where the C# needed to be different.

I didn't attempt to rewrite all the template rules, but I converted enough of them that several of the larger Java modules could be processed successfully. I felt this would give us all the feedback we needed on whether the task was feasible.

A typical (but very simple) template rule in the transpiler might look like this:

<xsl:template match="*[@nodeType='ReturnStmt']">
    <xsl:call-template name="indent"/>
    <xsl:text>return </xsl:text>
    <xsl:apply-templates select="*"/>
    <xsl:text>;{$NL}</xsl:text>
</xsl:template>

This rule processes an element whose nodeType attribute is 'ReturnStmt' and outputs the (Java or C#) text "return XXX;" with suitable indentation, followed by a newline. The XXX here is constructed by recursive application of template rules to the single operand of the return statement (if any): select="*" selects the operand, whatever it might be, and processes it using its own template rule.

The rule doesn't need much changing to handle JSON instead of XML. It becomes:

<xsl:template match=".[?_nodeType='ReturnStmt']">
    <xsl:call-template name="indent"/>
    <xsl:text>return </xsl:text>
    <xsl:apply-templates select="?expression"/>
    <xsl:text>;{$NL}</xsl:text>
</xsl:template>   
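
To show the JSON version of the rule in context, here is a minimal self-contained sketch. Everything around the rule itself is an assumption made for illustration and is not taken from the real transpiler: the definition of $NL, the indent template with its depth tunnel parameter, the NullLiteralExpr rule, and the sample map. The sample has the shape that parse-json() would deliver for {"_nodeType":"ReturnStmt","expression":{"_nodeType":"NullLiteralExpr"}}.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    expand-text="yes">

  <xsl:output method="text"/>

  <!-- Assumed definition of the newline constant used by the rules -->
  <xsl:variable name="NL" select="'&#10;'"/>

  <!-- Hypothetical sample parse-tree node, built directly as a map;
       parse-json() or json-doc() would yield the same structure -->
  <xsl:variable name="sample" as="map(*)"
      select="map{'_nodeType':'ReturnStmt',
                  'expression': map{'_nodeType':'NullLiteralExpr'}}"/>

  <!-- Hypothetical indent helper: maps have no ancestors to count,
       so the nesting depth is carried in a tunnel parameter -->
  <xsl:template name="indent">
    <xsl:param name="depth" as="xs:integer" select="0" tunnel="yes"/>
    <xsl:value-of select="string-join((1 to $depth) ! '    ', '')"/>
  </xsl:template>

  <!-- The rule from the text: matches any map whose _nodeType is ReturnStmt -->
  <xsl:template match=".[?_nodeType='ReturnStmt']">
    <xsl:call-template name="indent"/>
    <xsl:text>return </xsl:text>
    <xsl:apply-templates select="?expression"/>
    <xsl:text>;{$NL}</xsl:text>
  </xsl:template>

  <!-- Assumed rule for the operand, so the example produces output -->
  <xsl:template match=".[?_nodeType='NullLiteralExpr']">
    <xsl:text>null</xsl:text>
  </xsl:template>

  <!-- Entry point: process the sample node at nesting depth 1 -->
  <xsl:template name="xsl:initial-template">
    <xsl:apply-templates select="$sample">
      <xsl:with-param name="depth" select="1" tunnel="yes"/>
    </xsl:apply-templates>
  </xsl:template>

</xsl:stylesheet>
```

Invoked through its initial template on an XSLT 3.0 processor (for example with Saxon's -it option), this should emit the text "return null;" indented by four spaces and followed by a newline. Note that select="?expression" names the operand by its key, where the XML version could simply select="*".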
      

Some observations:

It turns out to be rather convenient that we can define the match patterns of template rules based on the properties of a map in the JSON, rather than on the associated key. If instead of "right":{"_nodeType":"NullLiteralExpr"} we had to cope with "NullLiteralExpr":{"_role":"right"} (a design that could equally well have been chosen), then the matching would become rather more complex, as we shall see.

While most of the template rules in this stylesheet match on the value of the nodeType attribute, this isn't true of all of them.

The conclusion from this exercise was that the conversion to handle JSON rather than XML input was straightforward — but that we had been lucky. The template rules all matched on attribute values rather than element names; and none of them made use of features such as XML node identity, or access to parents, ancestors, or siblings, that would be difficult to replicate in the JSON world.

Also: I've glossed over the fact that in this phase, I was merely looking at the code that serializes the parse tree back to Java, and skipped the “business logic” that does the conversion from Java to C#. That code, from a fairly superficial examination, includes a few things that are rather harder to deal with: