The modularisation proposed here uses a new feature of ixml: renaming, a feature agreed by the working group, but not yet part of the official specification; it is specified in the current working draft [wd] and already present in several implementations. It allows you to specify for a rule a different name than the default to be used on serialisation.
To illustrate: an ixml rule has a name. Up to now in ixml, this name has served both to identify the rule in the allowable input syntax and as the name used in the output serialisation for that rule. If two input forms have different syntaxes, it is therefore necessary to give them different names, even if the intention is to have the same output serialisation.
For instance, consider a grammar that accepts both 31/12/1999 and 31 December 1999 forms of dates:
date: numeric; textual.
-numeric: day, -"/", month, -"/", year.
-textual: day, -" "+, tmonth, -" "+, year.
day: d, d?.
month: d, d?.
year: d, d, d, d.
tmonth: -"January", +"1";
        -"February", +"2";
        ...
        -"December", +"12".
-d: ["0"-"9"].
What you will see is that the serialisations of the two forms are nearly identical, except that while 31/12/1999 produces
<date> <day>31</day> <month>12</month> <year>1999</year> </date>
31 December 1999 produces
<date> <day>31</day> <tmonth>12</tmonth> <year>1999</year> </date>
The difference arises because the month is produced by a different rule for each input syntax. Using renaming, you can specify that both have the same serialised name:
tmonth > month: -"January", +"1"; -"February", +"2"; ... -"December", +"12".
This says that while tmonth is the name used in the grammar, and represents the textual form of a month in the input, it should be serialised as month, thus in this case making the two date serialisations identical.
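The effect of the renaming on serialisation can be sketched in a few lines of Python. This is only an illustration, not how any ixml processor is implemented: the tuple-based parse tree, the ALIASES table, and the functions serialise and to_xml are all hypothetical names introduced here.

```python
from xml.etree import ElementTree as ET

# Hypothetical alias table: grammar name -> serialised name.
# It stands in for the renaming "tmonth > month" in the grammar.
ALIASES = {"tmonth": "month"}


def serialise(node, aliases):
    """Turn a (name, content) pair into an XML element.

    content is either a string (text) or a list of child nodes.
    A name with an alias is serialised under the alias instead.
    """
    name, content = node
    element = ET.Element(aliases.get(name, name))
    if isinstance(content, str):
        element.text = content
    else:
        for child in content:
            element.append(serialise(child, aliases))
    return element


def to_xml(tree, aliases):
    """Serialise a whole tree to a string."""
    return ET.tostring(serialise(tree, aliases), encoding="unicode")


# Assumed parse results for 31/12/1999 and 31 December 1999.
numeric = ("date", [("day", "31"), ("month", "12"), ("year", "1999")])
textual = ("date", [("day", "31"), ("tmonth", "12"), ("year", "1999")])

print(to_xml(textual, {}))       # without renaming: contains <tmonth>
print(to_xml(textual, ALIASES))  # with renaming: contains <month>
```

With the alias table in place, both trees serialise to the same string; with an empty table, the textual form keeps its tmonth element.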
Incidentally, since the allowable ixml names are not exactly the same set as the allowable XML names, you can also specify the renaming as a string. For instance, since ixml names may not end with a dot, but XML names may, you can write:
abc > "abc.": ...
The syntax of the start of a rule like this is called a naming, and can consist either of a name, as currently in ixml, or a renaming, which consists of a name, a greater-than sign, and an alias, which can be either a name or a string.
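As a rough sketch of that structure, the following Python function splits a naming into its name and alias. It rests on simplifying assumptions that are not taken from the specification: names are restricted to ASCII letters, digits, underscore, hyphen and dot, and only double-quoted strings are handled, with a doubled quote standing for a literal quote. The function parse_naming and the patterns NAME, STRING and NAMING are hypothetical.

```python
import re

# Simplified assumption: ASCII-only names that may contain but not end
# with a dot. The real ixml name syntax admits many more characters.
NAME = r"[A-Za-z_][A-Za-z0-9_\-.]*(?<!\.)"

# Simplified assumption: double-quoted strings only, "" escapes a quote.
STRING = r'"(?:[^"]|"")*"'

# naming: name, optionally followed by ">" and an alias (name or string).
NAMING = re.compile(
    rf'\s*(?P<name>{NAME})\s*'
    rf'(?:>\s*(?:(?P<alias_name>{NAME})|(?P<alias_string>{STRING}))\s*)?'
)


def parse_naming(text):
    """Return (name, serialised_name) for a naming.

    A plain name is serialised under its own name; a renaming is
    serialised under its alias.
    """
    match = NAMING.fullmatch(text)
    if match is None:
        raise ValueError(f"not a naming: {text!r}")
    name = match.group("name")
    if match.group("alias_string") is not None:
        # Strip the surrounding quotes and undo the doubled-quote escape.
        alias = match.group("alias_string")[1:-1].replace('""', '"')
    elif match.group("alias_name") is not None:
        alias = match.group("alias_name")
    else:
        alias = name
    return name, alias


print(parse_naming("tmonth > month"))
print(parse_naming('abc > "abc."'))
```

Under these assumptions, a bare abc. is rejected as a name, while "abc." is accepted as a string alias, which is the case the string form exists for.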
Also in passing, it is worth noting that renaming has consequences for round-tripping, as presented in [rt], since it introduces a round-tripping ambiguity. Because an output form such as
<date> <day>31</day> <month>12</month> <year>1999</year> </date>
could have been produced by either of two different input syntaxes, the round-tripping process has to choose one of them. Where necessary this can be overcome with a technique such as:
tmonth > month: style, (-"January", +"1";
                        -"February", +"2";
                        ...
                        -"December", +"12").
@style: +"text".
which would produce for the 31 December 1999 style of input
<date> <day>31</day> <month style='text'>12</month> <year>1999</year> </date>
which can be uniquely round-tripped.
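To see why the attribute removes the ambiguity, consider this Python sketch that regenerates an input form from the serialisation. It is not the round-tripping process of [rt], only an illustration of the choice that process faces: the function unparse_date is hypothetical, and MONTH_NAMES lists only the three months spelled out in the grammar above.

```python
from xml.etree import ElementTree as ET

# Only the months shown explicitly in the grammar; the rest are elided there.
MONTH_NAMES = {"1": "January", "2": "February", "12": "December"}


def unparse_date(xml_text):
    """Regenerate an input string from a serialised date.

    The style attribute on month selects the textual syntax;
    without it the numeric syntax is assumed.
    """
    date = ET.fromstring(xml_text)
    day = date.findtext("day")
    month = date.find("month")
    year = date.findtext("year")
    if month.get("style") == "text":
        return f"{day} {MONTH_NAMES[month.text]} {year}"
    return f"{day}/{month.text}/{year}"


marked = "<date> <day>31</day> <month style='text'>12</month> <year>1999</year> </date>"
plain = "<date> <day>31</day> <month>12</month> <year>1999</year> </date>"

print(unparse_date(marked))
print(unparse_date(plain))
```

Without the attribute, the function has to fall back on one syntax by default, which is exactly the arbitrary choice described above.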
With this background explained, we can now proceed to the design of modularisation.