xmltex

xmltex is a simple, non-validating XML parser implemented in TeX by David Carlisle. It makes LaTeX's typesetting capabilities available not just for LaTeX documents but for XML documents as well. xmltex can associate TeX code with XML elements, attributes, processing instructions, and entities. However, it can neither validate an XML document against a DTD nor resolve external DTD entities, although it is able to process local entity declarations.[7]

Because xmltex is itself written in TeX, it is invoked from a TeX run: a small TeX driver file names the XML input and then loads xmltex. The following example loads doc.xml with xmltex from a TeX document.

\def\xmlfile{doc.xml} % xml file
\input xmltex.tex % load xmltex

The parser supports XML namespaces to some extent and can be configured for arbitrary XML, for example a TEI document that contains MathML. For this purpose, namespaces can be defined within a separate macro file, usually with an xmt extension[8]. For each XML document type, a separate xmt file is required. Whenever xmltex processes an XML element with a particular namespace, it loads the corresponding xmt file. The following command can be added to a catalogue configuration (cfg extension) to associate a namespace with a particular xmt file:

\NAMESPACE{URL}{xmt‐file}
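
As an illustration, a project that processes TEI with embedded MathML might register both namespaces in its cfg file. This is only a sketch: tei.xmt matches the mapping file name used in this section's project layout, while mathml.xmt is a hypothetical file name chosen for the example.

\NAMESPACE{http://www.tei-c.org/ns/1.0}{tei.xmt}
\NAMESPACE{http://www.w3.org/1998/Math/MathML}{mathml.xmt} % hypothetical file name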

The xmltex catalogue configuration makes it possible to associate XML contexts with TeX instructions. For example, the declaration

\XMLelement{element‐qname}{attribute‐spec}{begin‐code}{end‐code}

associates TeX code with an element. Whenever xmltex encounters an element with this name, the begin code is inserted at its start tag and the end code at its end tag. A minimal xmltex catalogue configuration for TEI is shown in the code listing below.

\DeclareNamespace{tei}{http://www.tei-c.org/ns/1.0}
\XMLelement{tei:TEI}
  {}
  {\documentclass{article}
   \begin{document}}
  {\end{document}}
\XMLelement{tei:teiHeader}
  {}
  {}
  {}
\XMLelement{tei:title}
  {}
  {\xmlgrab}
  {\title{#1}
   \maketitle}
\XMLelement{tei:p}
  {}
  {\par}
  {\par}
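
The same pattern extends to further elements. As a sketch, assuming that the TEI <hi> element should simply be rendered as emphasized text (a simplification that ignores its rend attribute), the following declaration grabs the element content and passes it to \emph:

\XMLelement{tei:hi}
  {}
  {\xmlgrab}
  {\emph{#1}}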

An xmltex project would usually include a TeX file, a configuration, xmt files for each namespace, and the input XML file. Below is what a typical xmltex project directory structure would look like:

MyProject/
  |--main.tex 
  |--main.cfg (xmltex‐configuration)
  |--doc.xml (XML input)
  |--tei.xmt (xmltex‐mapping for TEI)

Figure 1. xmltex inputs and outputs


To map an XML schema like TEI to LaTeX, you need to know the nesting depth of a <head/> element to decide whether it becomes a \chapter{} or a \section{}. The xmltex syntax only maps TeX code to specific XML node names and does not take their actual position in the XML tree into account. To work around this, you can introduce a counter that tracks the number of ancestor <div> elements:

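% Nesting depth of the current <div>; 0 outside any <div>
\newcount\div@counter \div@counter=0
\XMLelement{tei:head}
  {}
  {\xmlgrab}
  {%
    \ifnum\div@counter=1\relax
      \chapter{#1}%
    \else
      \ifnum\div@counter=2\relax
        \section{#1}%
      \fi
    \fi
  }
% Track the depth: increment at <div>, decrement at </div>
\XMLelement{tei:div}
  {}
  {\global\advance\div@counter by 1\relax}
  {\global\advance\div@counter by -1\relax}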
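
If deeper levels are needed, the nested conditionals can be replaced by a single \ifcase on the same counter. The following is a sketch of an alternative tei:head declaration and is not taken from the xmltex documentation; mapping the third level to \subsection is an assumption made for illustration. It would replace, not supplement, the tei:head declaration above.

\XMLelement{tei:head}
  {}
  {\xmlgrab}
  {%
    \ifcase\div@counter
      % depth 0: <head> outside any <div>, no output
    \or \chapter{#1}%     depth 1
    \or \section{#1}%     depth 2
    \or \subsection{#1}%  depth 3 (assumed mapping)
    \fi
  }

Headings nested more deeply than the last \or branch produce no output, which mirrors the behaviour of the nested \ifnum version for depths it does not handle.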

xmltex does not offer a powerful query language like XPath; instead, you can only associate plain element and attribute names with TeX instructions. More complex context queries become a programming task that can prove daunting given TeX's macro-expansion processing model. The resulting code is less declarative and hard to maintain.

If this approach becomes too complicated, xmltex also allows you to modify the output by placing TeX commands directly in the XML source. The xmltex documentation suggests using elements in either the xmltex namespace or a custom namespace to inject TeX instructions[9]. Another mechanism is provided by xmltex processing instructions:

<?xmltex TeX commands ?>

Furthermore, error reporting is virtually nonexistent and various constraints on XML are not enforced. For example, you can configure element names with characters that are not allowed in XML. Another problem is that xmltex isn’t actively maintained. The code was last modified in 2000 and moved to GitHub in 2012 without any notable changes except the initial commit[10].

Within a TeX environment, xmltex can be a useful tool for XML processing. It offers a lightweight and declarative syntax (apart from programmatic content manipulation where necessary, as mentioned above) that TeX users should find familiar. For XML users, however, the syntactical differences are not the only likely stumbling block.



[7] David Carlisle (2000) xmltex: A non validating (and not 100% conforming) namespace aware XML parser implemented in TeX. Available at https://ctan.space-pro.be/tex-archive/macros/xmltex/base/manual.html (Accessed: May 24, 2023)

[8] The null namespace, the XML namespace (http://www.w3.org/1998/xml) and the xmltex namespace (http://www.dcarlisle.demon.co.uk/xmltex) are predeclared.

[9] David Carlisle (2000) xmltex: Accessing TeX. Available at https://ftp.agdsn.de/pub/mirrors/latex/dante/macros/xmltex/base/manual.html#manualN1059 (Accessed: May 30, 2023)

[10] David Carlisle (2012) xmltex source code on GitHub. Available at https://github.com/davidcarlisle/dpctex/tree/main/xmltex (Accessed: May 30, 2023)