Consumption: Understanding XML structures

If we in our line of work encounter an XML structure we don't know yet, finding some description is usually just a web-search away. Google, Stackoverflow and the likes will often come up with some example, solution or description to help us. This is what I call "cookbook" level understanding: You don't really know what you're doing but you found a recipe and, hey, it works (or not).

But what if you really need to understand the format and its implications? For instance because it's a programming language you need to master. Or the application it serves is important to you. You start digging and hopefully find some kind of in-depth explanation of the structure. Do this more than once and you'll find there is no uniform way of documenting such a thing. Let's have a look at a few:

XSLT

A lot of people in the XML community know Michael Kay's XSLT book (Michael Kay; XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th edition; Wiley Publishing). Since an XSLT program is an XML structure, the book needs to explain this. Here's an excerpt from the explanation of the <xsl:copy-of>:

Figure 1. An excerpt from the documentation of XSLT

An excerpt from the documentation of XSLT

All XML structure descriptions look this way, with as most important parts: short introduction, a standardized formatted impression of what it looks like, a table explaining the attributes and a description of its contents.

Maven POM

The documentation of Maven POM (Project Object Model) files can be found on the Apache Maven POM site (https://maven.apache.org/pom.html). Here's an example:

Figure 2. An excerpt from the documentation of Maven POM files

An excerpt from the documentation of Maven POM files

This follows a different approach: Short introduction, example XML fragment, explanatory text with a lot of bullets.

Ant

Another example we're probably all familiar with is Ant. Ant's documentation can be found on the Apache Ant site (https://ant.apache.org/manual/). Building blocks of Ant are tasks. Here's an example of a task description:

Figure 3. An excerpt from the documentation of Ant

An excerpt from the documentation of Ant

The documentation follows a consistent pattern: Description, parameters (which are attributes), description of nested elements (for brevity reasons not shown) and examples.

HL7 CDA

Here's an example of something only a few will be familiar with: HL7 CDA (health Level 7, Clinical Document Architecture), a standard for the exchange of healthcare information. In a book that tries to explain this and more (Principles of Health Interoperability, SNOMED CT, HL7 and FHIR; Tim Benson, Grahame Grieve; Springer-Verlag), I found the following way of explaining the CDA XML structure:

Figure 4. An excerpt from the documentation of HL7 CDA

An excerpt from the documentation of HL7 CDA

They use a tree diagram to outline the structure followed by a short explanation followed by (badly formatted…) examples of XML structures and more text to explain these.

eXist-db conf.xml

A last example comes from the documentation of eXist-db's conf.xml file. This is the main configuration file for the database.

Figure 5. An excerpt from the documentation of eXist-db's conf.xml

An excerpt from the documentation of eXist-db's conf.xml

This file has no separate documentation page or document. Instead everything is documented using XML comments inside the file itself.

These are just a few examples and they illustrate there are very different of documentation formats out there. Now everybody probably has his/her preferences and not every format needs the same kind or level of documentation, so can we say something in general about it? Let's try: