Adding new Cars to a Running Train

Adding markup for multilingual documents to JATS

Deborah Aleyne Lapeyre

Senior XML/XSLT/Schematron Consultant
Mulberry Technologies, Inc.

B. Tommie Usdin

President/Senior XML/XSLT/Consultant
Mulberry Technologies, Inc.

Abstract

The Journal Article Tag Suite (JATS) is the ANSI/NISO standard that defines the tag set used to archive, exchange, and publish journal articles worldwide. JATS 1.4 was approved (as JATS version 1.4, ANSI/NISO Z39.96-2024) on October 31, 2024. The most exciting new capability added in JATS 1.4 is a way to encode multi-language journal articles. By a multi-language article, JATS does not mean an article written in English or German with a couple of quotations in Latin or French. A true multilingual article presents substantial portions of its content in more than one language, or has some parts in one language and other parts in a different language or languages. The JATS tag set has always been able to encode articles in any language and to identify portions of an article that are in a different language from the containing article (using @xml:lang). However, until version 1.4, JATS has not been able to gracefully encode articles that are multilingual in whole or in large part.

The JATS multi-language mechanism harks back to Architectural Forms, and is implemented mostly using attributes, which can identify:

  • an entire article in 2 or more languages

  • substantial portions of content (paragraphs and sections) replicated in 2 or more languages

  • block structures (such as figures, tables, and equations) replicated in 2 or more languages

  • articles where some of the content is in one language and some in another, and

  • article metadata (such as the article title or abstract) in multiple languages.

This paper describes the encoding available for many styles of multi-language functionality, provides examples, lists useful resources to learn more, and gives a few reasons why this mechanism may be of interest to you, even if you or your clients do not code in JATS or work with journal articles.


Table of Contents

What is JATS?
Requirements and Constraints
How Multi-language Documents are Presented and Stored
Requirements and Non-Requirements
Backwards Compatibility
The Needs of the Few Must Not Burden the Many
JATS Should Stay In Its Lane
I Don't Use JATS; Why Should I Care?
JATS Multi-language Mechanism (@lang-group)
Influence of SGML's Architectural Forms: Giving Credit where Credit is Due
A Few Examples
Simple Example of a Language Group
Did a Person or an Algorithm Translate This?
Not Everything Should be Translated
Doesn't This Mechanism Entail a Lot of Overhead?
Documentation for All This
Acknowledgements
Bibliography