Naming and renaming

The modularisation proposed here uses a new feature of ixml: renaming. The feature has been agreed by the working group but is not yet part of the official specification; it is specified in the current working draft [wd] and is already present in several implementations. Renaming lets you specify, for a rule, a name to be used on serialisation that differs from the default.

To illustrate: an ixml rule has a name. Up to now in ixml, this name has served both to identify the rule in the grammar, and hence the input syntax it accepts, and as the name used in the output serialisation for that rule. If two input forms have different syntaxes, it is therefore necessary to give them different names, even if the intention is to have the same output serialisation.

For instance, consider a grammar that accepts both 31/12/1999 and 31 December 1999 forms of dates:

             date: numeric; textual.
         -numeric: day, -"/", month, -"/", year.
         -textual: day, -" "+, tmonth, -" "+, year.
              day: d, d?.
            month: d, d?.
             year: d, d, d, d.
           tmonth: -"January",  +"1";
                   -"February", +"2";
                   ...
                   -"December", +"12".
               -d: ["0"-"9"].
      

What you will see is that the serialisations of the two forms are nearly identical, except that while 31/12/1999 produces

         <date>
            <day>31</day>
            <month>12</month>
            <year>1999</year>
         </date>
      

31 December 1999 produces

         <date>
            <day>31</day>
            <tmonth>12</tmonth>
            <year>1999</year>
         </date>
      

where the difference arises because the month is matched by a different rule, for a different input syntax. Using renaming, you can specify that both have the same serialised name:

         tmonth > month:
            -"January",  +"1";
            -"February", +"2";
            ...
            -"December", +"12".
      

This says that while tmonth is the name used in the grammar, and represents the textual form of a month in the input, it should be serialised as month, thus in this case making the two date serialisations identical.
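The effect of a renaming on serialisation can be pictured with a small sketch in Python. It is not how any ixml implementation works: the parse-tree shape (a pair of rule name and children), the RENAMES table, and the serialise function are all assumptions made for illustration.

```python
from xml.sax.saxutils import escape

# Hypothetical rename table: grammar rule name -> serialised element name.
# It plays the role of the "tmonth > month" naming in the grammar.
RENAMES = {"tmonth": "month"}


def serialise(node):
    """Serialise a (rule_name, children) node as an XML string.

    Children are either nested nodes or plain strings (assumed shape).
    """
    name, children = node
    # The grammar name is used unless a renaming supplies a different one.
    tag = RENAMES.get(name, name)
    body = "".join(
        escape(child) if isinstance(child, str) else serialise(child)
        for child in children
    )
    return f"<{tag}>{body}</{tag}>"


# Parse trees as they might look for 31/12/1999 and 31 December 1999.
numeric_tree = ("date", [("day", ["31"]), ("month", ["12"]), ("year", ["1999"])])
textual_tree = ("date", [("day", ["31"]), ("tmonth", ["12"]), ("year", ["1999"])])
```

With the rename table in place, serialise(numeric_tree) and serialise(textual_tree) yield the same string, even though the two trees were built by different rules.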

Incidentally, since the set of allowable ixml names is not exactly the same as the set of allowable XML names, you can also specify the renaming as a string. For instance, since ixml names may not end with a dot, but XML names may, you can write:

         abc > "abc.": ...
      

The syntax of the start of a rule like this is called a naming. It can consist either of a name, as currently in ixml, or of a renaming, which consists of a name, a greater-than sign, and an alias, which can be either a name or a string.
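As a rough illustration of this structure, the following Python sketch splits the text of a naming into its parts. It makes several simplifying assumptions that are not taken from the working draft: names are restricted to ASCII, an optional leading mark (as in -numeric or @style) is accepted before the name, and strings double their quote character to escape it. The function parse_naming and the patterns are hypothetical.

```python
import re

# Simplified, ASCII-only name: may contain dots and hyphens, but not end in a dot.
_NAME = r"[A-Za-z_](?:[A-Za-z0-9_.\-]*[A-Za-z0-9_\-])?"
# Quoted string; a doubled quote stands for one quote (assumed escaping rule).
_STRING = r'"(?:[^"]|"")*"' + r"|'(?:[^']|'')*'"
_NAMING = re.compile(
    rf"\s*(?P<mark>[-@^])?\s*(?P<name>{_NAME})\s*"
    rf"(?:>\s*(?P<alias>{_NAME}|{_STRING}))?\s*"
)


def parse_naming(text):
    """Return (mark, name, alias) for the text before a rule's colon.

    When no renaming is given, the alias is simply the name itself.
    """
    match = _NAMING.fullmatch(text)
    if match is None:
        raise ValueError(f"not a naming: {text!r}")
    name = match["name"]
    alias = match["alias"]
    if alias is None:
        alias = name
    elif alias[0] in "\"'":
        quote = alias[0]
        # Strip the outer quotes and undouble any escaped quotes.
        alias = alias[1:-1].replace(quote * 2, quote)
    return match["mark"], name, alias
```

Under these assumptions, parse_naming("tmonth > month") gives the name tmonth with the alias month, and parse_naming('abc > "abc."') gives the alias abc., which could not have been written as an ixml name.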

Also in passing, it is worth noting that renaming has consequences for round-tripping, as presented in [rt], since it introduces a round-tripping ambiguity. Because an output form such as

         <date>
            <day>31</day>
            <month>12</month>
            <year>1999</year>
         </date>
      

could have been produced by two different input syntaxes, the round-tripping process has to choose one of them. Where necessary, this can be overcome with a technique such as:

         tmonth > month:
            style, 
            (-"January",  +"1";
             -"February", +"2";
             ...
             -"December", +"12").
         @style: +"text".
      

which would produce, for the 31 December 1999 style of input,

         <date>
            <day>31</day>
            <month style='text'>12</month>
            <year>1999</year>
         </date>
      

which can be uniquely round-tripped.
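To make the point concrete, here is a minimal Python sketch of how the style attribute removes the ambiguity when going from the serialisation back to input text. It is not the round-tripping process of [rt]; the function unparse_date and the MONTH_NAMES list are assumptions for illustration, and only the two date forms from this section are handled.

```python
import xml.etree.ElementTree as ET

# Month names indexed so that MONTH_NAMES[n - 1] is month number n.
MONTH_NAMES = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def unparse_date(xml_text):
    """Turn a serialised <date> back into one of the two input forms."""
    date = ET.fromstring(xml_text)
    day = date.findtext("day")
    month = date.find("month")
    year = date.findtext("year")
    if month.get("style") == "text":
        # The attribute records that the textual syntax was used.
        return f"{day} {MONTH_NAMES[int(month.text) - 1]} {year}"
    # Without the attribute, the numeric syntax is the only candidate.
    return f"{day}/{month.text}/{year}"


textual = """<date>
   <day>31</day>
   <month style='text'>12</month>
   <year>1999</year>
</date>"""
numeric = """<date>
   <day>31</day>
   <month>12</month>
   <year>1999</year>
</date>"""
```

Here unparse_date(textual) returns the 31 December 1999 form and unparse_date(numeric) returns 31/12/1999, so each serialisation maps back to exactly one input syntax.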

With this background explained, we can now proceed to the design of modularisation.