Background



Before we had XML, we had SGML, which is the ISO standard (ISO 8879) on which XML is based. By default, SGML is pointy-bracket markup like XML…but in SGML almost everything can be redefined, including the markup characters themselves, and there are numerous options for additional features and for abbreviations and markup shortcuts to minimise typing. Any changes to syntax or to the configuration value limits have to be made in a Declaration file, and a DTD is compulsory for every document.

Notice the words “to minimise typing”. In the beginning, there was no software with an editing interface that could use the DTD to provide a contextual menu of available element types, and the idea that the markup would be hidden from the user was counterintuitive: if you couldn’t see the tags, how could you know what was marked?

In these early stages, therefore, markup was applied by typing it in a plaintext editor. The essential piece of software to begin with was thus the parser, not the editor, so that you could check that you hadn’t got something wrong. The term “parsing” was often used to mean parsing-and-validating,^[1] as there was no concept of well-formedness separate from validity in the sense introduced with XML. There were many early parsers; among the most significant were:

ARC SGML (Almaden Research Center) originally by Charles Goldfarb; later developed by James Clark into sgmls (see below)
ASP SGML (Amsterdam SGML Parser), still available
Exoterica by Sam Wilmott (later included in Omnimark)
the parser in FrameMaker+SGML by Lynn Price
a parser for Boeing (internal only) by Greg O’Connell and Debbie Lapeyre
Mark-It! by Jean-Pierre Gaspard
sgmls by James Clark, the only one still in widespread use; redeveloped as nsgmls for SP, and now as onsgmls to handle XML for OpenSP

Other software developed rapidly, spurred partly by the adoption of SGML for some military documentation in the US and elsewhere, and partly by its growing use in publishing, research, and academia. Editing software included:

Arbortext ADEPT (through several name changes, eg Epic; now PTC Arbortext Editor)
SoftQuad Author/Editor and the editors based on it, HoTMetaL (for HTML) and, later, XMetaL (for XML)
STiLO Document Generator, along with Arbortext one of the few general-purpose SGML editors to handle mathematics
Emacs with psgml-mode
epcedit, a free SGML and XML editor from tkSGML
the Euromath Editor, an EU project built on the GriF editor^[2]
Siemens Nixdorf InContext
Citec MultiDoc Translating Editor
Microstar Near&Far Author for Word and Near&Far Designer, a graphical DTD editor
GriF SGML Editor
Richard Light’s SGML Tagger (OUP), a memory-resident monitor for MS-DOS editors
Corel WordPerfect, which had a built-in SGML editor
Sema Write-It! (using Mark-It! as the parser)

Documents also need processing in some way: adding them to a database, putting them on the Web, mining them for data, or converting them for a formatting system for publishing. Conversion or processing (transformation) systems included:

AIS Software Balise
DFN DAPHNE (VMS only; converted to TeX)
EBT (later Inso) DynaText trainable converter from Word to SGML
James Clark’s Jade (using DSSSL), which can convert to TeX and other formats
Exoterica Omnimark (XTRAN)
Microsoft SGML Author for Word; despite its name, this was not an editor but a converter into and out of Word

There were several standalone viewers, especially for vertical-market applications, but few general-purpose browsers. As with editors, some used SGML-syntax stylesheets to format the display; others used a proprietary stylesheet syntax. Formatting systems for printed output typically produced PostScript (these were pre-PDF days). Some handled SGML input directly, others via an established conversion route; output was formatted using TeX or a proprietary typesetting engine. Viewers and formatting systems included:

Advent 3B2 typesetter
EBT DynaWeb NT server for documents converted with DynaText
Adobe FrameMaker+SGML typesetter (FTC’s original had no SGML support)
LaTeX typesetter, usually via transformation through Omnimark, Balise, Jade, or similar
Citec MultiDoc Pro Publisher standalone browser
Panorama Viewer, an SGML plugin for the Mosaic and Netscape browsers; also the standalone Panorama Publisher
Arbortext Publisher typesetter

There was far more software available which is outside the scope of this report: some of it is now either uncompilable or uninstallable, or was in any case incomplete or experimental at the time. A significant amount was normal commercial software which has suffered the conventional fate of being superseded, falling out of use, or being abandoned when the company failed or was taken over. There are extensive lists of both free and commercial applications in Robin Cover’s SGML/XML Web Pages, and some of the SGML Conference CDs have a considerable amount of freely distributable and commercial-sample software in subdirectories.

Other categories not covered here include design tools, search engines, and databases. The only three of these of which this author has direct experience (noted below) are Microstar’s Near&Far Designer, Tim Bray’s PAT search engine (see the section called “PAT □”), and the SGML DARC document management database (see the section called “SGML Darc ☑”).

^[1]In fact, in the authors’ description of the Amsterdam SGML Parser, the only instances of the term “validation” are in the formal references to validation services in the SGML standard itself.

^[2]The editor is reputedly being resuscitated and rebuilt using INRIA’s Thot structured editor.