Markup UK 2023 Proceedings

Table of Contents

XProc as a command-line application engine
What is XProc?
The problem domain
Is XProc a suitable language?
Design pattern 1: Job tickets
The job ticket
Design pattern 2: Command line wrappers
Wrap-up and conclusions
Working with XML inside a web browser
Project Goals
What new XML technologies are required?
Technical Challenges
The XML data model layer
The XML user interface layer
The XML program logic layer
The resulting architecture of the XML edge platform
The two projects
Future development for the two projects
Markup UK Proceedings as CSS
DocBook Stylesheets
Comparing XSLT 1.0 and xslTNG Stylesheets
Build system
Syntax highlighting
XSLT debugging
Cover pages
PDF bookmarks
Current status
Enhancing Markup Quality Assurance with Automated Schema Visualization
Background and Related Work
Introducing the XSD Visualizer Plugin
Key Features for Assessing Schema Quality
Visualizing Inheritance Structure
Providing Effective Element and Type Structure
Jump-to-Code for Editing
Future Development
Enhancing the Schema Quality Workflow with Work-in-Progress Features
Tree View for Exploring the Actual Structure
Composite View for Understanding Complex Composition Structures
Implementation Details of the XSD Visualizer Plugin
Feedback and Future Development
Support for Additional Schema Formats
Potential Name Change to "SchemaViz"
Port to Visual Studio Code
HTML Export with SVG Graphics
Improving quality-critical XML workflows with XProc 3.0 pipelines
Introduction and background
About Thieme Compliance GmbH and patient education leaflets
About <xml-project /> and XProc
Introduction to existing batches
Batch “fragengruppe_2_evidence”
Batch “fragengruppe_2_FHIR-Questionnaire”
Pain points of the existing batches
Lack of flexibility for inserting additional XSLT steps (in between)
No easy way to debug the intermediate results of each XSLT step
Too many tools means too many dependencies
New requirements for next version
Future-proof approach and improved maintainability by adding a separate orchestration layer
Increased quality through validation of XML sources using T0 XSD as well as validation of XML results using specific versions of T0 DTD
Increased quality by additional validation of XML results using Schematron
Summarised, formatted and easily comprehensible log files
Performance improvement by omitting unnecessary images from the Zip archive
Limiting processing to specific sources from the source folder
New system based on XProc 3.0
Smooth transition to XProc 3.0
MorganaXProc-IIIse worked well and could even be improved over the course of the project
Serialisation is now done by MorganaXProc and no longer by Saxon
Performance problems with FHIR XML schema
XProc pipeline optimisation by loading stylesheets only once at the beginning
Feature request for XProc: please add <p:validate-with-dtd>
Bridging the Gaps Between XML and TeX
The Gaps Between XML and LaTeX
Methods to Convert XML to LaTeX
An Alternative Approach
Math Transform MathML to TeX
Transform XML to TeX
Building a cloud-based visual operating system entirely based on XML
Finding a cure to the chaotic software landscape
What is CloudTop really?
Changing the perception of a computer
Verifying our assumptions
Looking through a few of the sample applications built
XMLPad - Data Manipulation and Transactions
Kanban - Hierarchical Data Model
Contacts - Key/values, Meta-data, and Datatypes
CloudTop - Combining Applications into a Desktop
Using TDD to produce High Quality XSLT
TDD: where does it come from?
The TDD loop and the refactoring phase
Writing a MarkDown to HTML converter with XSLT
Feature definition
Level 1 titles
Next titles and list items
Pros and Cons of using TDD for XSLT development
Baby steps
Code Coverage
Data Coverage
Word processing is so last century
Executive Summary
What's computer-assisted sense making?
Working with multiple languages, perspectives, and ontologies
How prodoc helped make sense of big words
Lessons learned
The path to prodoc
Bots hate word processing's ornery visually-oriented ontology
Semantic markup to the rescue
Separating structure and style
Integrating lifecycle perspectives
From semantic markup to semantic authoring
Individual impacts
Organizational impacts
Document-level controls to capture and communicate meaning
Author-driven structural changes
Author-driven visual changes
prodoc in practice
@class — Authored class styles
<awkbuddy/> — An interactive development environment block
<bbody/>, <branches/>, <branch/> — Hierarchical tables
<colortest/> — Automating accessible color negotiation pipelines
<h/> — Depth-based headings because big headings are ugly
<kfam/> — An element and generalized design language to make sense of knowledge flows from multiple perspectives
kfam conceptual language
kfam markup language
kfam visual language
kfam modeling
<music/> — Rationalizing chord/lyric pairings
<vcanvas/> — Visualizing and comparing sets of value optimizations
WordNet and SUMO integrations — Associating markup with dictionaries & formal logic
Findings & next steps
Bottom-up negotiations to define shared meanings
h1.dtd — Build your own prodoc
h1.dtd summary
Leveraging the Power of OpenAI and Schematron for Content Verification and Correction
Artificial Intelligence
Generative Pre-trained Transformer (GPT)
Schematron and AI
Schematron Quick Fix and AI
Implementation of AI in Schematron
Examples of AI-driven Schematron and SQF Solutions
Check text consistency
Check text voice
Answer to question
Check the number of words
Check if block of text should be a list
User-Entry - Check technical terms
Generate Fix Automatically
Develop Schematron using AI
XQS: A Native XQuery Schematron Implementation
Design goals
Conformance over performance
Dynamic evaluation
Expansion and inclusion
Mandated XQuery QLB
Context is everything
Document level
Node level
Assertion level
Dynamically evaluated schema
Evaluating patterns
The documents attribute
Rule processing
Advisory notes
Compiled schema
Other features
User-defined functions
Evaluating schema components
Maps, arrays and anonymous functions as variables
Unit testing
Status of the work
The conformance suite
Future work
Quality in Formatted Documents
Markup Quality
Formatted Quality
Regression testing
Automated analysis
PDF/UA checking and remediation
A Dependency Management Approach for Document and Data Transformation Projects
Introducing Apache Ivy
Starting with Arousa
A bold experiment. How much time/effort does it take?
Introducing the Arousa project structure
Ivy abstraction. Introducing artifact types
A key difference with Maven
Configuration chains
The Ivy Cycle
Ivy flexibility (resolvers)
Working with “Others”, the Dual resolvers
A step further
An advanced example
Tool-Based Transformations
Project steps and phases
Exploring Data
Off-the-shelf tools and ad-hoc tools
Analyzing Existing Code
Doing The Actual Work

List of Figures

1. Abstraction based on XML in all areas is key.
2. XML applications required to be developed
3. XML-based data model abstraction using containers and objects
4. Mapping relational data to the XML data model
5. Cloud-based support in the form of an XML repository serving the data model
6. Rendering of the application view
7. Visualization of the logic XML language created inside a web-browser
8. Architecture of the XML Device Edge Application Platform
9. Uniquely leveraging a CDN to distribute XML software applications for low-latency
1. Paper formatted using CSS (left) and XSL-FO (right)
1. Patient education leaflet
2. “Anamnese mobil” app from E-ConsentPro
3. Batch “fragengruppe_2_evidence”
4. Batch “fragengruppe_2_FHIR-Questionnaire”
5. A bird's eye view of the new system
6. Summarised HTML log
7. Serialisation done by MorganaXProc (left) and Saxon (right)
1. xmltex inputs and outputs
2. Passive TeX transformation
3. xml2tex conversion pipeline
1. CloudTop cloud-based desktop and applications built entirely in XML.
2. The XMLPad application.
3. The tutorial Kanban application.
4. The System Manager.
1. Sowa Hexagon
2. Formalized ontologies
3. Semi-formalized, markup-based ontologies
4. Conrad corollary
5. Word processing
6. Typesetting lookup table
7. Generalized markup
8. Authoring meaning through content, structure and style
9. %divs; declaration includes the %sa.divs; extension mechanism for use in documents
10. Document header, where new elements are added through the %sa.divs; interface
11. @class markup
12. @class CSS
13. @class rendering
14. <awkbuddy/> codeblocks
15. table/@display="table" (columns displayed for data processing)
16. table/@display="block" (columns hidden for tree processing)
17. <bbody/> markup
18. Color testing and tuning pipeline
19. CSS to scale headings based on document depth
20. Knowledge flows associated with knowledge enabling behavior
21. <kfam/> workspace
22. Named foreground and background colors
23. @k@kstyle CSS style mappings
24. @kstyle variations applied to a table
25. Knowledge gap analysis
26. @k values and associated colors
27. Lyric-chord charts with misaligned text (it gets much worse, especially with proportional fonts)
28. <lyric/> lines, with lyric fragments <l/>, and chord <c/> symbols next to each other so they don't get lost
29. CSS rendering with fully-automated offsets
30. music[@view="edit"] markup
31. music[@view="edit"] rendering
32. Rendering of <vcanvas/> datasets
33. Semantic formalization attribute list
34. Elements mapped to WordNet definitions and SUMO logic
35. SUO-KIF definition
36. From semi-formalized markup languages to fully-formalized ontologies
37. Community engagement process
1. Schematron processing steps: expansion, inclusion and compilation
2. Schematron processing: "dynamic evaluation"
3. Schematron schema structure
1. FreqX Report for Element Counts
2. FreqX Report for Attributes
3. FreqX Report for Attribute Values
4. FreqX Report for Attribute Values
5. Eddie 2 Report

List of Examples

1. XSpec 1
2. XSLT 1
3. XSLT 2
4. XSLT 3
5. XSLT 4
6. XSLT 5
7. XSpec 2
8. XSLT 6
9. XSLT 7
10. XSpec 3
11. XSLT 8
12. XSpec 4
13. XSLT 9
14. XSLT 10
15. XSLT 11
1. Example of XSLT functions that call ChatGPT
2. Example of an extension function ai:verify-content()
3. Example of an extension function ai:transform-content
4. Example of a Schematron rule that verifies if the text is easy to read and understand
5. SQF fix that corrects the text to be easy to read and understand
6. Rule that verifies if the text voice is active
7. SQF fix that reformulates the text to use active voice
8. Rule that verifies if the text answers a specific question
9. SQF fix that reformulates the text to answer the question
10. Rule that verifies the number of words in the shortdesc element
11. SQF fix that reformulates the phrase to have fewer than 50 words
12. Rule that verifies whether the text of a paragraph should be converted to a list
13. SQF fix that creates a list from a set of phrases
14. Rule that verifies if the technical terms are explained adequately
15. SQF fix that allows the user to specify the prompt that will be sent to the AI
16. Example: A Schematron rule that verifies the number of words in the shortdesc element