The job ticket


Prev		Next

Of course, to be able to process a job ticket, we first need to acquire it. Basically this means: find out what needs to be done (for instance by inspecting the command-line arguments) and create it. In my case, it worked like this:

I created an overarching job ticket document that held tickets for several jobs, specified in some internal Domain Specific Language (DSL). Here’s a simplified example, just to give you an impression (the details and functionality do not matter here):

<jobtickets … >

  <application name="bgz" version="2017" source-project-name="zib2017-nl">
      <setup usecase="kwalificatie" directory-id="bgzkwal">
          <retrieve name="searchset.xml" url="…" directory-id="bgzkwal-instance"/>
          <copy-project-schemas>
            <include glob="*$USECASE.xsd"/>
          </copy-project-schemas>    
      </setup>
      <build>
          <stylesheet href="../xsl/…"/>
          <input-document directory="@bgzkwal-instance" name="searchset.xml"/>
          <output directory="@bgzkwal/wiki_instance" name="wiki-bgz.txt"/>
          <parameter name="adaReleaseFilename" value="zib2017bbr-decor.xml"/>
      </build>
  </application>
  
  <application … >
    …
  </application>

</jobtickets>

For a setup like this I would very strongly advocate the use of XML validation. Actually, my advice is to always create a schema for XML documents that are hand-edited, even when it’s “just” an internal data file:

When using a schema-aware editor (like oXygen), it helps you in creating and maintaining the file.
Consider adding a Schematron schema as well for the more difficult rules (like, for instance, identifier references). It’s really helpful when a minor typo in an obscure identifier get’s flagged immediately.
XProc has several validation steps. In fact, validation has become so easy, it’s almost criminal not to do it.

A next step would be to filter the main document, based on the job that needs to be done. In my case I threw everything away that didn’t belong to the application I was handling. Assume the name of this application is in a variable (or option) $application, then the code for this is a single line of XProc:

<p:delete match="/*/application[@name ne '{$application}']"/>

The next step in processing the job ticket proved crucial for the applicability of the job ticket design pattern: job ticket need to be enhanced. Enhancement here means: everything you can do or calculate up-front must be done first, before the actual processing starts.

When file or directory names are involved, calculate their full absolute values up-front. For instance, in my case, names sometimes depended on global parameters (some main directory), attribute values of the current and parent elements (subdirectories) and some global settings made elsewhere (version numbers, etc.). All this was tricky to compute. Also: absolute file and directory names must be URIs (having file:/// in front). You have to make sure this is the case before using them in XProc steps.

XProc support the full XPath 3.1 language, so, yes, in theory, you could do these kinds of computations directly in your XProc pipeline. However, this would lead to very complicated and overly long XPath expressions, especially because XPath functions (which would help, a bit) are not (yet?) supported. I would recommend doing all these up-front enhancements in XSLT, which is a language much more suited to these kinds of things.

Here is an example of a snippet of an enhanced job ticket:

<application _target-dir="file:///C:/.../lab/3.0.0"
             _source-dir="file:///C:/.../lab/3.0.0"
             name="lab"
             version="3.0.0">

  <setup _target-dir="file:///C:/.../lab/3.0.0/sturen_laboratoriumresultaten"
         _source-dir="file:///C:/.../lab/3.0.0/sturen_laboratoriumresultaten"
         usecase="sturen_laboratoriumresultaten"
         directory-id="slr">

     <copy-data _target-dir="file:///C:/.../ada_instance_repo"
                _source-dir="file:///C:/.../sturen_laboratoriumresultaten/ada_instance_repo"
                target-subdir="ada_instance_repo"
                source-subdir="ada_instance_repo"
                directory-id="ada-instance-repo">

        <include _pattern="\.xml$"
                 glob="*.xml"/>

     </copy-data>

Target and source directories are computed at every level of the document (the __target-dir and _source-dir attributes). When the job ticket processing wants to do something, based on one of these elements, it can simply lift the URI from the applicable attribute.

Another enhancement here is that a UNIX-style glob was provided (the include/@glob attribute). However, XProc works with regular expressions with regards to including/excluding files, so this glob had to be converted into an XPath regular expression. This was added in the _pattern attribute.

One last thing about these kinds of data documents is that XProc makes it very easy to setup support for includes. Create an include structure using XInclude <xi:include href="…"/> elements (the namespace prefix xi is bound to the XInclude namespace http://www.w3.org/2001/XInclude). In XProc, the <p:xinclude/> step does all the work for you and flattens the document by resolving all the includes. That’s all there is to it.

When processing the resulting job ticket, we need take decisions about what to do, in my case based on element names. In XSLT this would be rather simple: call <xsl:apply-templates> and write a template for all applicable elements. However, XProc does not have such a mechanism. You’ll, unfortunately, have to write dispatching code like this:

<p:for-each>
  <p:with-input select="/*/*"/>
  
  <p:choose>
    <p:when test="/*/self::copy-data">
      …
    </p:when>
    <p:when test="/*/self::copy-schemas">
      …
    </p:when>
    <p:when test="/*/self::build">
      …
    </p:when>
    <p:otherwise>
      … (error)
    </p:otherwise>
  <p:choose>

</p:for-each>