Java has a class java.util.HashMap (which Saxon uses extensively). C# does not
have a class with this name. It does have a rather similar class,
System.Collections.Generic.Dictionary, but there are differences in behavior.
Broadly speaking, there are three ways we deal with dependencies:
Rewriting. Here the converter (specifically, the XML-to-C# transformation stylesheet)
has logic to rename references to the class java.util.HashMap so they instead refer to
System.Collections.Generic.Dictionary, and to convert calls on the methods of
java.util.HashMap so they instead call the corresponding methods of
System.Collections.Generic.Dictionary. We've already seen an example of this above.
Sometimes there is no direct equivalent for a particular method, in which case we
instead generate a call on a helper method that emulates the required functionality.
(System.Collections.Generic.Dictionary, for example, has no direct equivalent to the
get() method on java.util.HashMap, largely because it cannot use null as a return
value when the required key is absent.)
The converter uses rewriting for the vast majority of calls on commonly used classes and methods. There's more detail on how this is done below.
Emulation. Here we implement (in C#) a class that emulates the behaviour of the
Java class, or at least those parts of the behaviour that Saxon relies on. An
example where we do this is java.util.Properties, which has no direct equivalent
in C#, but which is easily implemented using dictionaries. Saxon doesn't use the
complicated methods for importing and exporting Properties objects, so we don't
need to emulate those.
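As a rough illustration of how little such an emulation needs, here is a minimal sketch of a dictionary-backed property set. It is written in Java only to keep the examples in one language; the real emulation is a C# class. The class name SimpleProperties and the choice of methods are assumptions made for this sketch, not a record of what Saxon actually uses.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the small subset of java.util.Properties discussed above. */
final class SimpleProperties {
    // All state lives in an ordinary map; no import/export machinery is emulated.
    private final Map<String, String> entries = new HashMap<>();

    /** Stores a value and returns the previous value for the key, or null if there was none. */
    String setProperty(String key, String value) {
        return entries.put(key, value);
    }

    /** Returns the value for the key, or null when the key is absent. */
    String getProperty(String key) {
        return entries.get(key);
    }

    /** Returns the value for the key, or the supplied default when the key is absent. */
    String getProperty(String key, String defaultValue) {
        String value = entries.get(key);
        return value != null ? value : defaultValue;
    }
}
```

The load and store methods of the Java class are deliberately left out, matching the point that the import and export features need no emulation.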
Avoidance. Here we simply eliminate the dependency. For example, the Java
product will accept input from either a push (SAX) or pull (StAX) parser. On
C# we will only support a single XML parser, the one from Microsoft. This is a
pull parser, so we eliminate all the Saxon code that's specific to SAX support.
This is non-trivial, of course, because the relevant code is widely scattered
around the product. But once found, it's usually easy to get rid of it using
preprocessor directives in the Java (//#if CSHARP==false). I should perhaps have
mentioned that there's a "phase 0" in our conversion pipeline, which is to apply
these preprocessor directives.
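To make the "phase 0" idea concrete, here is a small sketch of a line-based filter that drops code guarded by such a directive. Only the opening form //#if CSHARP==false comes from the text above; the closing marker //#endif, the CSHARP==true form, the support for nesting, and the class name Phase0Preprocessor are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch of a "phase 0" filter; not the preprocessor the project actually uses. */
final class Phase0Preprocessor {

    /** Returns the source lines that survive when the CSHARP flag has the given value. */
    static List<String> apply(List<String> lines, boolean csharp) {
        List<String> out = new ArrayList<>();
        Deque<Boolean> active = new ArrayDeque<>(); // one entry per open //#if block
        for (String line : lines) {
            String t = line.trim();
            if (t.startsWith("//#if ")) {
                String cond = t.substring("//#if ".length()).replace(" ", "");
                boolean holds;
                if (cond.equals("CSHARP==true")) {
                    holds = csharp;
                } else if (cond.equals("CSHARP==false")) {
                    holds = !csharp;
                } else {
                    throw new IllegalArgumentException("Unsupported condition: " + cond);
                }
                // A nested block is only active if its enclosing block is active too.
                boolean parentActive = active.isEmpty() || active.peek();
                active.push(parentActive && holds);
            } else if (t.equals("//#endif")) { // assumed closing marker
                if (active.isEmpty()) {
                    throw new IllegalStateException("//#endif without //#if");
                }
                active.pop();
            } else if (active.isEmpty() || active.peek()) {
                out.add(line); // ordinary line inside an active region (or outside any block)
            }
        }
        if (!active.isEmpty()) {
            throw new IllegalStateException("Unclosed //#if");
        }
        return out;
    }
}
```

Because the directives are ordinary Java comments, the Java compiler ignores them, so the same source file can still be built unchanged for the Java product.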
In cases where dependencies are handled by rewriting, there are two parts to this. Firstly, we have a simple mapping of class names. This includes both system classes and Saxon-specific classes. Here are a few of them:
<xsl:variable name="specialTypes" as="map(xs:string, xs:string)" select="map{
    'boolean': 'bool',
    'java.io.BufferedInputStream': 'System.IO.BufferedStream',
    'java.io.BufferedOutputStream': 'System.IO.BufferedStream',
    'java.io.BufferedReader': 'Saxon.Impl.Helpers.BufferedReader',
    'java.lang.ArithmeticException': 'System.ArithmeticException',
    'java.lang.ArrayIndexOutOfBoundsException': 'System.IndexOutOfRangeException',
    'java.lang.Boolean': 'System.Boolean',
    'java.lang.Byte': 'System.Byte',
    ...
    'java.math.BigDecimal': 'Singulink.Numerics.BigDecimal',
    ...
    'java.util.ArrayList': 'System.Collections.Generic.List',
    'java.util.Collection': 'System.Collections.Generic.ICollection',
    'java.util.Comparator': 'System.Collections.Generic.Comparer',
    ...
    'net.sf.saxon.ma.trie.ImmutableHashTrieMap': 'System.Collections.Immutable.ImmutableDictionary',
    'net.sf.saxon.ma.trie.ImmutableMap': 'System.Collections.Immutable.ImmutableDictionary',
    'net.sf.saxon.ma.trie.ImmutableList': 'System.Collections.Immutable.ImmutableList',
    'net.sf.saxon.ma.trie.TrieKVP': 'System.Collections.Generic.KeyValuePair',
    ...
    'net.sf.saxon.s9api.Message': 'Saxon.Api.Message',
    'net.sf.saxon.s9api.QName': 'Saxon.Api.QName',
    'net.sf.saxon.s9api.SequenceType': 'Saxon.Api.XdmSequenceType',
    ...
}"/>
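The effect of this mapping can be sketched outside XSLT as a plain dictionary lookup. The Java fragment below copies a handful of entries from the extract; the class name TypeNameMapper and the fallback of returning the Java name unchanged when no entry exists are assumptions for illustration, since the text does not say how unmapped names are treated.

```java
import java.util.Map;

/** Hypothetical sketch of the class-name lookup; the real logic lives in the XSLT stylesheet. */
final class TypeNameMapper {
    // A few entries copied from the specialTypes extract.
    private static final Map<String, String> SPECIAL_TYPES = Map.of(
        "boolean", "bool",
        "java.lang.ArithmeticException", "System.ArithmeticException",
        "java.math.BigDecimal", "Singulink.Numerics.BigDecimal",
        "java.util.ArrayList", "System.Collections.Generic.List",
        "net.sf.saxon.s9api.QName", "Saxon.Api.QName");

    /** Returns the mapped C# name, or the Java name unchanged if there is no entry (an assumption). */
    static String toCSharp(String javaType) {
        return SPECIAL_TYPES.getOrDefault(javaType, javaType);
    }
}
```

For example, TypeNameMapper.toCSharp("java.util.ArrayList") yields System.Collections.Generic.List, while a name with no entry passes through untouched in this sketch.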
Note that there are cases where we replace system classes with Saxon-supplied classes, and there
are also cases where we do the reverse: the extract above illustrates that we can replace Saxon's
immutable map implementation with the standard immutable map in .NET. In the case of BigDecimal,
we rewrite the code to use a third-party library[10] with similar functionality to the built-in
Java class.
The other part of the rewrite process is to handle method calls. We rely here
on knowing the target class of the method, and we typically handle the rewrite
with a template rule like this (long namespace names abbreviated for space reasons:
S.N = Singulink.Numerics, S.I.H = Saxon.Impl.Helpers):
<xsl:template match="*[@RESOLVED_TYPE = 'java.math.BigDecimal']" priority="20" mode="methods">
    <xsl:sequence select="f:applyFormat(., map{
        'add#1': '(%scope%+%args%)',
        'subtract#1': '(%scope%-%args%)',
        'multiply#1': '(%scope%*%args%)',
        'divide#1': 'S.N.BigDecimal.Divide(%scope%, %args%, 18)',
        'divide#2': 'S.N.BigDecimal.Divide(%scope%, %args%)',
        'divide#3': 'S.N.BigDecimal.Divide(%scope%, %args%)',
        'negate#0': '-%scope%',
        'mod#1': 'S.I.H.BigDecimalUtils.Mod(%scope%, %args%)',
        'signum#0': '%scope%.Sign',
        'remainder#1': 'S.I.H.BigDecimalUtils.Remainder(%scope%, %args%)',
        'divideToIntegralValue#1': 'S.I.H.BigDecimalUtils.Idiv(%scope%, %args%)',
        'divideAndRemainder#1': 'S.I.H.BigDecimalUtils.DivideAndRemainder(%scope%, %args%)',
        'valueOf#1': 'Saxon.Impl.Helpers.BigDecimalUtils.ValueOf(%args%)',
        'intValue#0': '((int)%scope%)',
        'longValue#0': '((long)%scope%)',
        'doubleValue#0': '((double)%scope%)',
        'floatValue#0': '((float)%scope%)',
        'longValueExact#0': 'S.I.H.BigIntegerUtils.LongValueExact(%scope%)',
        'setScale#1': '%scope%', (:no-op, values are normalized:)
        'setScale#2': '%scope%', (:no-op, values are normalized:)
        'stripTrailingZeros#0': '%scope%', (:no-op, values are normalized:)
        'toBigInteger#0': '((System.Numerics.BigInteger)%scope%)',
        '*': '%scope%.%Name%(%args%)'
    })"/>
</xsl:template>
This is a template rule in mode methods, a mode that is only used to process
MethodCall expressions, so we don't need to repeat this in the match pattern.
This particular rule handles all calls where the target class is
java.math.BigDecimal. It delegates the processing to a function f:applyFormat()
which is given as input a set of sub-rules supplied as a map in a custom
microsyntax. Given the name and arity of the method call, this function looks up
the applicable sub-rule, and interprets it: for example value1.add(value2)
translates to (value1+value2) (C# allows user-defined overloading of operators
such as "+"). Some methods such as mod() are converted into calls on a static
helper method (written in C#) in class Saxon.Impl.Helpers.BigDecimalUtils.
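The lookup-and-interpret step can be sketched in a few lines. The Java fragment below is not the f:applyFormat() function itself (which is written in XSLT and receives the method-call node); it is a simplified stand-in that takes the scope expression, method name, and argument expressions as strings. Two things are assumptions made here: that %Name% stands for the method name with its first letter capitalized, and that %args% is the comma-separated argument list.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified analogue of the sub-rule interpreter described above. */
final class FormatRules {

    /** Selects the sub-rule for name#arity (or the '*' fallback) and fills in its placeholders. */
    static String applyFormat(Map<String, String> rules, String scope, String name, List<String> args) {
        String key = name + "#" + args.size();
        String format = rules.containsKey(key) ? rules.get(key) : rules.get("*");
        if (format == null) {
            throw new IllegalArgumentException("No rule for " + key);
        }
        // Assumption: %Name% is the Java method name with an initial capital.
        String capitalized = name.isEmpty()
            ? name
            : Character.toUpperCase(name.charAt(0)) + name.substring(1);
        return format
            .replace("%scope%", scope)
            .replace("%Name%", capitalized)
            .replace("%args%", String.join(", ", args));
    }
}
```

With the rule 'add#1': '(%scope%+%args%)', the call value1.add(value2) becomes (value1+value2); a method with no specific entry falls through to the '*' rule, which in this sketch turns x.compareTo(y) into x.CompareTo(y).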
Most of the product's dependencies have proved easy to tackle using one or more of these mechanisms. We were able to use rewriting more often than I expected – for example it's used to replace the dependency on Java's BigDecimal class with a third-party library, Singulink.Numerics.BigDecimal. It's worth showing the XSLT code that drives this: