In XML validation, there are two primary approaches to ensure the accuracy and integrity of content:
Structural validation
Schematron
Structural validation (e.g., RELAX NG, XSD) checks the overall structure of an XML document: that elements appear in the correct order, conform to the proper data types, and respect the hierarchical relationships between elements. For instance, when a bibliography element is present, its following-sibling must be one the schema permits at that point; otherwise the validator reports a message such as: expected the element end-tag or element "bibliography", "glossary", "index" or "toc".
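A content model that produces this kind of message can be sketched in RELAX NG (XML syntax). The grammar below is a deliberately simplified, hypothetical schema, not the schema actually in use: the book and chapter elements, their text-only content, and the DocBook 5 namespace are all assumptions made for illustration. It only shows how ordering is expressed: once the back-matter group has started, nothing but bibliography, glossary, index or toc may follow.

```xml
<!-- Hypothetical, simplified grammar for illustration only. -->
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         ns="http://docbook.org/ns/docbook">
  <start>
    <element name="book">
      <!-- Body: any number of chapters (content reduced to text here). -->
      <zeroOrMore>
        <element name="chapter"><text/></element>
      </zeroOrMore>
      <!-- Back matter: only these four elements may appear from here on. -->
      <zeroOrMore>
        <choice>
          <element name="bibliography"><text/></element>
          <element name="glossary"><text/></element>
          <element name="index"><text/></element>
          <element name="toc"><text/></element>
        </choice>
      </zeroOrMore>
    </element>
  </start>
</grammar>
```

Under this sketch, a chapter placed after a bibliography would be rejected, because the only things allowed at that point are the end-tag of book or one of the four back-matter elements.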
On the other hand, Schematron enforces more granular, rule-based validation that can be tailored precisely to business requirements. For instance, for accessibility purposes, every figure element must include an alt element as a child of its mediaobject element; otherwise validation reports an error.
Example 1. DocBook Schematron Invalid Markup
<figure xml:id="b-figure1">
  <info>
    <title>Image</title>
  </info>
  <mediaobject>
    <imageobject>
      <imagedata fileref="images/f001.jpg" format="image/jpeg"/>
    </imageobject>
  </mediaobject>
</figure>
Validation Error: figure with @fileref value of "images/f001.jpg" must contain correct markup for descriptive text to ensure accessibility. Check element with id "b-figure1".
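A rule that yields an error of this shape could look like the following Schematron sketch. It is an assumption about how such a check might be written, not the rule set actually deployed: the pattern id, the db prefix, the DocBook 5 namespace binding and the xslt2 query binding are all illustrative choices.

```xml
<!-- Illustrative sketch; ids, prefix and query binding are assumptions. -->
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
            queryBinding="xslt2">
  <sch:ns prefix="db" uri="http://docbook.org/ns/docbook"/>

  <sch:pattern id="figure-alt-text">
    <!-- Fires once for every mediaobject that is a direct child of a figure. -->
    <sch:rule context="db:figure/db:mediaobject">
      <!-- Passes only when a non-empty alt child is present. -->
      <sch:assert test="db:alt[normalize-space()]">
        figure with @fileref value of "<sch:value-of
        select="db:imageobject[1]/db:imagedata/@fileref"/>" must contain
        correct markup for descriptive text to ensure accessibility.
        Check element with id "<sch:value-of select="../@xml:id"/>".
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

Against the markup in Example 1, the assertion would fail because mediaobject has no alt child; adding an alt element with descriptive text directly inside mediaobject would satisfy it.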
Structural validation and Schematron serve different but complementary roles; combining them ensures that XML content both conforms to the required structure and meets the customised requirements. This multi-validation approach improves the reliability and validity of content throughout its lifecycle, helping to meet the high standards required in publishing. Additionally, we rely on the parsing results of industry-standard validators for each flavour of schema, so any such errors are reported by those validators in the usual way.