Running the emblem2rdf pipeline

Having solved the problems with the structural metadata documents and their transformation to Emblem Schema documents, we decided to base our RDF/XML on the Emblem Schema documents. This decision was driven by the plan to consolidate the structural metadata documents in the near future by removing the emblem-related information from them and storing it as documents in their own right.

To run the final pipeline, we harvest all emblem book descriptions from the OAI-PMH web service. The pipeline then iterates over the emblem book descriptions and validates each one against the revised Emblem Schema. If validation fails, the document is skipped with a warning message. Otherwise, an XSL transformation converts every emblem from its Emblem Schema representation to RDF/XML conforming to our emblem ontology. The bibliographic description is transformed as well and added to every emblem contained in the respective emblem book. The resulting RDF/XML document is validated against a RELAX NG grammar and serialized to disk. Because we publish the emblem descriptions as static files on a web server, we also apply a transformation to HTML and serialize the result to disk.
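The control flow of these steps can be sketched outside of XProc as follows. This is only an illustration in Python, not the pipeline itself: the function `run_pipeline`, the four callables standing in for the schema validation, the XSL transformations and the RELAX NG validation, and the output file names are all hypothetical. The sketch also assumes that the HTML transformation takes the RDF/XML as input and that a document failing the RELAX NG validation is skipped with a warning; the pipeline may handle either point differently.

```python
import logging
from pathlib import Path
from typing import Callable, Iterable, List, Tuple

log = logging.getLogger("emblem2rdf")


def run_pipeline(
    books: Iterable[Tuple[str, str]],
    is_valid_emblem_schema: Callable[[str], bool],
    to_rdfxml: Callable[[str], str],
    is_valid_rdfxml: Callable[[str], bool],
    to_html: Callable[[str], str],
    out_dir: Path,
) -> List[str]:
    """Process harvested (identifier, XML) pairs; return the identifiers written.

    All callables are hypothetical stand-ins for the validation and
    transformation steps of the real pipeline.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    written: List[str] = []
    for book_id, xml in books:
        # Step 1: validate against the Emblem Schema, skip on failure.
        if not is_valid_emblem_schema(xml):
            log.warning("skipping %s: not valid against the Emblem Schema", book_id)
            continue
        # Step 2: transform the book and its emblems to RDF/XML.
        rdf = to_rdfxml(xml)
        # Step 3: validate the RDF/XML (assumption: skip on failure).
        if not is_valid_rdfxml(rdf):
            log.warning("skipping %s: RDF/XML failed grammar validation", book_id)
            continue
        # Step 4: serialize the RDF/XML and the HTML rendering to disk.
        (out_dir / f"{book_id}.rdf").write_text(rdf, encoding="utf-8")
        (out_dir / f"{book_id}.html").write_text(to_html(rdf), encoding="utf-8")
        written.append(book_id)
    return written
```

Passing the individual steps in as callables keeps the sketch independent of any particular XML library; in practice they would wrap a schema validator and an XSLT processor.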

Because XProc lacks native support for RDF serializations, we use the rapper command line tool to convert the resulting RDF/XML documents to N-Triples. To validate the RDF graph we use TopBraid SHACL, an open source implementation of the W3C Shapes Constraint Language (SHACL) based on Apache Jena.
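The conversion step can be illustrated with a small wrapper around rapper. This is a sketch in Python rather than the invocation used in the pipeline: the helper names `rapper_command` and `convert_to_ntriples` and the file names are hypothetical. It assumes that rapper is installed and on the `PATH`, and relies on rapper's `-i` and `-o` options for selecting the input and output syntax and on rapper writing its result to standard output.

```python
import subprocess
from pathlib import Path
from typing import List


def rapper_command(rdfxml_path: Path) -> List[str]:
    """Build the rapper call that parses RDF/XML and emits N-Triples."""
    return ["rapper", "-i", "rdfxml", "-o", "ntriples", str(rdfxml_path)]


def convert_to_ntriples(rdfxml_path: Path, nt_path: Path) -> None:
    """Run rapper and redirect its standard output into nt_path.

    Raises subprocess.CalledProcessError if rapper exits with an error.
    """
    with nt_path.open("wb") as out:
        subprocess.run(rapper_command(rdfxml_path), stdout=out, check=True)
```

A call such as `convert_to_ntriples(Path("book.rdf"), Path("book.nt"))` would then produce the N-Triples file that serves as input for the SHACL validation.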