If there is a method of asking a RELAX NG validator to use a regular expression case insensitively, I do not know it. Thus the regular expression is written case sensitively. For example, the sub-pattern [A-Za-z] occurs frequently where [A-Z] would be acceptable if the pattern could be applied after case folding.
Being absolutely explicit about which characters are allowed is a slight advantage. On the other hand, writing the expression case sensitively has two significant disadvantages:
It adds verbosity. The generated regular expression is more than 1000 characters longer than it would be if case insensitivity could be assumed.
It means that the generated regular expression is in some cases technically incorrect. For example, [CSS3] defines a pseudo-class :link. It never mentions a pseudo-class :LINK, and I have never seen it used in uppercase in the real world or in a test suite. However, section 3 says quite clearly “All Selectors syntax is case-insensitive within the ASCII range (i.e. [a-z] and [A-Z] are equivalent)”. Thus either matching should take place case insensitively, or wherever the regular expression says link it should really say [Ll][Ii][Nn][Kk].
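The expansion described above can be sketched mechanically. The following Python fragment is only an illustration, not part of the generator discussed here: the function name case_insensitive_literal is hypothetical, and Python's re module stands in for the XML Schema regular expression dialect that a RELAX NG validator would actually use (the simple character classes it emits should mean the same thing in both, but that is an assumption).

```python
import re


def case_insensitive_literal(keyword: str) -> str:
    """Expand a keyword into a case-sensitive pattern that accepts
    any mix of upper- and lowercase ASCII letters.

    Hypothetical helper for illustration only.
    """
    parts = []
    for ch in keyword:
        if ch.isascii() and ch.isalpha():
            # Each ASCII letter becomes a two-character class, e.g. l -> [Ll].
            parts.append(f"[{ch.upper()}{ch.lower()}]")
        else:
            # Everything else is kept literal, escaped where necessary.
            parts.append(re.escape(ch))
    return "".join(parts)


pattern = case_insensitive_literal("link")
print(pattern)  # [Ll][Ii][Nn][Kk]
```

With this pattern, link, LINK and Link all match without any case-insensitivity flag, at the cost of a pattern four times as long as the keyword itself.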