Transform XML to TeX

The xml2tex configuration vocabulary is specified by a RELAX NG schema [24]. Here is a minimal example configured for DocBook:

<?xml version="1.0" encoding="UTF-8"?>
<set xmlns="http://transpect.io/xml2tex" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <import href="../docx2tex/conf/conf.xml"/>
  <ns prefix="dbk" uri="http://docbook.org/ns/docbook"/>
  <preamble>
    \documentclass{article}
    \usepackage[utf8]{inputenc}
    \usepackage[main=english,greek]{babel}
  </preamble>
  <front>
    \title{<xsl:value-of select="/dbk:article/dbk:info/dbk:title"/>}
    \maketitle
  </front>
  <back>
    \printindex
  </back>
  <template context="dbk:chapter">
    <rule name="chapter" type="env">
      <param select="dbk:title"/>
      <text select="* except dbk:title"/>
    </rule>
  </template>
  <regex regex="([&#x370;-&#x3ff;]{2,})+">
    <rule name="foreignlanguage" type="cmd">
      <param>greek</param>
      <param regex-group="1"/>
    </rule>
  </regex>
  <charmap>
    <char character="&#x3c9;" string="${\omega}$"
          context="*[@css:font-style eq 'italic']"/>
  </charmap>
</set>

The xml2tex configuration contains eight top-level elements. The first five specify the basic document structure: <import/>, <ns/>, <preamble/>, <front/> and <back/>.

The actual mapping between XML nodes and text is performed by these three templates: <template/>, <regex/> and <charmap/>.

xml2tex cannot deny its XSLT origins, so let us take a closer look at the anatomy of the chapter template. Internally, it is converted to an XSLT template matching “dbk:chapter”. In contrast to XSLT templates, however, its contents are more restricted. Based on its @type attribute, the <rule/> element inserts either a TeX command (\chapter) or an environment (\begin{chapter}…\end{chapter}). You can then specify whether the TeX instruction is followed by arguments (<param/>), options (<option/>) or regular text (<text/>). Each of these can hold static text or an XPath expression given in the @select attribute.

<template context="dbk:chapter">
  <rule name="chapter" type="env">
    <param select="dbk:title"/>
    <text select="* except dbk:title"/>
  </rule>
</template>
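To make the relationship to XSLT more tangible, the following sketch shows roughly what such a template could expand to. It is an illustration only: the stylesheet that xml2tex actually generates is not reproduced here, and the exact whitespace, line breaks and character escaping are assumptions.

```xml
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:dbk="http://docbook.org/ns/docbook">

  <!-- TeX is plain text, so no XML serialization is wanted -->
  <xsl:output method="text"/>

  <!-- Assumed equivalent of <template context="dbk:chapter"> -->
  <xsl:template match="dbk:chapter">
    <!-- <rule name="chapter" type="env"> opens the environment -->
    <xsl:text>\begin{chapter}</xsl:text>
    <!-- <param select="dbk:title"/> becomes a braced argument -->
    <xsl:text>{</xsl:text>
    <xsl:apply-templates select="dbk:title"/>
    <xsl:text>}&#xa;</xsl:text>
    <!-- <text select="* except dbk:title"/> becomes the environment body -->
    <xsl:apply-templates select="* except dbk:title"/>
    <!-- closing the rule closes the environment -->
    <xsl:text>&#xa;\end{chapter}&#xa;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

The six-line configuration thus replaces a hand-written template in which the TeX delimiters have to be spelled out as text nodes.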

With this mechanism, you can take arbitrary XML and configure your preferred TeX output. Compared to pure XSLT, this approach is more declarative and resembles the structure of a TeX document. In contrast to xmltex, it offers more flexible configuration options and XPath as a powerful query language. It is also possible to insert XSLT code within the xml2tex configuration, for example if you want to encapsulate the evaluation of the TeX list type.
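As a further illustration of adapting the configuration to another element, a rule for DocBook sections might look like the sketch below. It is hypothetical and not taken from an existing configuration: it assumes that <option/> accepts @select in the same way as <param/>, and that a section carries an optional dbk:titleabbrev child to be used as the short title.

```xml
<set xmlns="http://transpect.io/xml2tex">
  <ns prefix="dbk" uri="http://docbook.org/ns/docbook"/>
  <!-- Hypothetical rule: map a DocBook section to the \section command -->
  <template context="dbk:section">
    <rule name="section" type="cmd">
      <!-- assumed: optional argument, i.e. the short title in brackets -->
      <option select="dbk:titleabbrev"/>
      <!-- mandatory argument: the section title -->
      <param select="dbk:title"/>
      <!-- remaining children are emitted as regular text after the command -->
      <text select="* except (dbk:title, dbk:titleabbrev)"/>
    </rule>
  </template>
</set>
```

Under these assumptions the result would be along the lines of \section[short title]{title} followed by the section content. How an empty option is rendered is not covered here and would need to be checked against the schema [24].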

xml2tex has proven itself in production use at le‑tex for different customers with different XML vocabularies and TeX requirements. Even though le‑tex also uses xmltex for some workflows, our TeX developers are more convinced by the xml2tex approach, in particular because it relieves them of configuring xmltex.



[24] Transpect (2023) xml2tex RelaxNG schema. Available at: https://github.com/transpect/xml2tex/blob/master/schema/xml2tex.rng (Accessed: 30 May, 2023)