The transformation from TEI to Emblem Schema failed because the structural metadata document was declared to
be a TEI document by a DOCTYPE declaration, but the TEI namespace was not declared in the document itself. Instead, the
DTD declared an xmlns attribute with a default value of http://www.tei-c.org/ns/1.0. Thus, for the namespace binding to be present, the
XML processor had to process the DTD when loading the document. The transformation failed
simply because XProc used the DTD to supply the xmlns attribute with its default
value, whereas the original transformation was initiated in PHP, which disables DTD processing by default. The
transformation script thus expected the elements of the structural metadata document to be in the null
namespace, while in our pipeline they were placed in the TEI namespace.
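
The effect can be reproduced in miniature. The following sketch is not taken from our pipeline: it uses Python's standard library parser, a hypothetical two-element document, and an internal DTD subset instead of the external DTD, purely so that the example is self-contained. It shows how the same markup yields different element names depending on whether the attribute default from the DTD is applied.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical miniature document. The namespace binding exists only as an
# attribute default in the DTD (an internal subset here, as an assumption
# made for brevity; the documents discussed above used a DOCTYPE declaration).
WITH_DTD = f"""<!DOCTYPE TEI [
  <!ATTLIST TEI xmlns CDATA "{TEI_NS}">
]>
<TEI><text/></TEI>"""

# What a parser that does not process the DTD effectively sees.
WITHOUT_DTD = "<TEI><text/></TEI>"


def root_name(xml_text: str) -> str:
    """Return the expanded name of the root element in Clark notation."""
    return ET.fromstring(xml_text).tag


dtd_applied = root_name(WITH_DTD)     # element is in the TEI namespace
dtd_ignored = root_name(WITHOUT_DTD)  # element is in no namespace
```

A stylesheet whose match patterns were written against the second result will match nothing when it receives the first, which mirrors the failure described above.
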
To allow for consistent namespace-aware processing, we relocated the elements to the
TEI namespace by applying an appropriate XSL transformation and modified the existing
transformations accordingly. We also removed the DOCTYPE declaration and changed the
inclusion mechanism of facsimile.xml from external entity references
to XInclude. The latter prompted
us to add the namespace declaration that was missing from facsimile.xml.
What we did not realize at the time, and still have to address, is that
references to the tei:graphic elements need to change because of XInclude's
base URI fixup. Referencing attributes such as facs are typed as
xsd:anyURI and thus need to reference the originating file, not just
document-local xml:id values.
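
Both steps can be sketched outside of XSLT. The following is a minimal illustration in Python rather than the transformation we actually applied; the function names, the sample fragment, and the tei:pb element carrying the facs attribute are assumptions chosen for the example. The first function moves elements from the null namespace into the TEI namespace, the second turns same-document references of the form #id into references that name the file holding the target.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"


def relocate_to_namespace(root, namespace=TEI_NS):
    """Move every element that is in no namespace into `namespace`."""
    for element in root.iter():
        # Element tags in a namespace are stored as "{uri}local".
        if isinstance(element.tag, str) and not element.tag.startswith("{"):
            element.tag = f"{{{namespace}}}{element.tag}"
    return root


def qualify_local_references(root, target_file, attribute="facs"):
    """Prefix same-document references ('#id') with the file holding the target."""
    for element in root.iter():
        value = element.get(attribute)
        if value is None:
            continue
        # The attribute may hold several whitespace-separated pointers.
        pointers = [
            target_file + pointer if pointer.startswith("#") else pointer
            for pointer in value.split()
        ]
        element.set(attribute, " ".join(pointers))
    return root


# Hypothetical fragment of a structural metadata document.
doc = ET.fromstring('<TEI><text><pb facs="#img1"/></text></TEI>')
relocate_to_namespace(doc)
qualify_local_references(doc, "facsimile.xml")
pb = doc.find(f".//{{{TEI_NS}}}pb")
```

After both calls, the pb element is in the TEI namespace and its facs attribute reads facsimile.xml#img1, so the reference no longer depends on the base URI of the including document.
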
We furthermore noticed that all structural metadata documents defined an xml:base
attribute on the outermost element. The content of this
attribute varied. Some documents used the persistent URL of the digital object, while others used
a relative URI reference denoting the object identifier, some with and some without a trailing
slash. It is unclear why xml:base was used in the first
place and what caused the variations in its content. Effectively, the attribute was used as
if it held the digital object identifier, which happens to be the relative path to the
object in most cases.
Although the use of xml:base caused no apparent error, we removed it to avoid
problems in the future.
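
One kind of problem that inconsistent xml:base values could cause is illustrated by the following sketch. It relies only on standard URI reference resolution as implemented in Python's standard library; the base URIs and the file name are hypothetical and do not stem from our data. The same relative reference resolves to different locations depending on whether the base ends in a slash.

```python
from urllib.parse import urljoin

# Hypothetical xml:base values that differ only in the trailing slash.
BASE_NO_SLASH = "http://example.org/objects/emblem-1"
BASE_WITH_SLASH = "http://example.org/objects/emblem-1/"

# Hypothetical relative reference found somewhere below the outermost element.
RELATIVE_REF = "00001.jpg"

# Without the slash, the last path segment of the base is replaced.
resolved_no_slash = urljoin(BASE_NO_SLASH, RELATIVE_REF)

# With the slash, the reference is resolved inside the object's directory.
resolved_with_slash = urljoin(BASE_WITH_SLASH, RELATIVE_REF)
```

Any processing step that resolves relative references against the base URI would therefore behave differently across documents that are otherwise structured alike.
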