Layout is extremely important for the comprehensibility of the code. The expression 2+3 * 1+2 is not 15 (what the spacing suggests) but 7. Or what about this coding horror:
<xsl:function name="local:something"> <…></xsl:function><xsl:function name="local:something-else"><…> </xsl:function>
What can we do to enhance comprehensibility? First, a few tips about expressions and the like:
Use parentheses abundantly. Write (2+3)*(1+2) if that’s what you mean. Parentheses help to communicate the intention of your expressions.
Choose a style for where to add spaces. Do you write (2+3)*(1+2) or ( 2 + 3 ) * ( 1 + 2 )?
Do you format a function call like add(1,2) or add(1, 2) or add( 1, 2 )? Again, pick something and stick to it.
If there’s any syntactic sugar in your language that makes expressions easier to read, by all means, use it. For instance, since version 3.1, XPath has the => (arrow) operator. So instead of:
lower-case(normalize-space(translate($name, '\', '/')))
we can now write the, IMHO, much clearer:
$name => translate('\', '/') => normalize-space() => lower-case()
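To make the points about precedence and the arrow operator concrete, here is a minimal sketch that should run in any XQuery 3.1 processor. The sample value bound to $name is made up for illustration; the expressions themselves are the ones discussed above.

```xquery
xquery version "3.1";

(: $name holds a hypothetical sample value, invented for this sketch. :)
let $name := '  C:\Temp\Some File.XML  '
return (
  (: Spacing is ignored: multiplication binds tighter than addition. :)
  2+3 * 1+2,

  (: Parentheses say what we actually mean. :)
  (2+3) * (1+2),

  (: Nested calls and the arrow chain are compared for equality. :)
  lower-case(normalize-space(translate($name, '\', '/')))
    = ($name => translate('\', '/') => normalize-space() => lower-case())
)
```

The first two items come out as 7 and 15. The third is true, because both notations apply the same three functions in the same order; the arrow version simply lets you read them left to right.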
But what is, in my eyes, most important is the code’s “rhythm”. When you look at some script or module, can you easily see how it’s structured? Where things begin and end? What belongs to what? Here are my tips:
Choose an indentation style and try to keep this consistent.
For XML-based languages like XSLT that's easy: indenting of XML is already (unofficially) standardized. The only things we can bicker about are how much to indent and whether to use spaces or tabs (I personally prefer an indent of two spaces and no tabs).
For text-based languages like XQuery it’s a little harder. I’ve done a lot of XQuery programming lately and noticed I couldn’t come up with a consistent and simple enough indentation strategy that works in all situations. Especially for XQuery, make sure that it’s clear what code belongs to what FLWOR expression or if/then/else branch. Whatever you choose, what matters most is that the intention of the code is clear.
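As an illustration only, here is one possible layout (a sketch, not a rule). Each FLWOR clause starts on its own line at the same depth, and whatever follows return, then or else is indented one step further. The $orders data and the large/small element names are invented for the example.

```xquery
xquery version "3.1";

(: Hypothetical sample data, only here to make the sketch runnable. :)
let $orders :=
  <orders>
    <order id="a1" total="120"/>
    <order id="b2" total="40"/>
  </orders>
return
  (: All clauses of this FLWOR expression line up at the same depth. :)
  for $order in $orders/order
  let $total := xs:decimal($order/@total)
  order by $total descending
  return
    (: The branches sit one level deeper than the if/else keywords. :)
    if ($total ge 100) then
      <large id="{$order/@id}"/>
    else
      <small id="{$order/@id}"/>
```

With this layout you can see from the indentation alone which return belongs to which FLWOR expression, and which result belongs to which branch. Deeply nested expressions will still strain it, which is usually a hint to move part of the code into a function of its own.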
Open some code module in your editor, scroll a bit, and look at it as if from afar. Can you see how it's structured? What the building blocks are? When you’re looking for some template/function/section in the code, can you quickly find it? And can you see at a glance where things start and end?
Clearly separate the main parts of your code. For instance, an XSLT program consists mainly of templates. Make it very clear where one begins and ends. Create a sectioned layout, like chapters/sections in an article or book.
I personally prefer using “comment lines” for this:
<xsl:template …>
  …
</xsl:template>

<!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -->

<xsl:template …>
  …
</xsl:template>
And use a “heavier” line in between sets of templates that belong together:
<!-- ======================================================================= -->
And no, it’s not awkward to insert (type) these kinds of lines all the time; see the section called “Debunking some obstacles”. But you can also decide to use multiple empty lines, or whatever works for you.
Use empty lines and comment headers to give code inside a template/function/… a “rhythm”:
(: Initialize: :)
…

(: Compute the value for … :)
…

(: Write it to disk: :)
…
Splitting code into blocks like this serves two main purposes, analogous to paragraphs in prose:
The blocks force you to think in a more structured way and to keep the things that belong together, together.
Somebody who is new to the code can more easily grasp what it’s about (especially when the comment headers are helpful, see the section called “Comments? What comments?”).
The code blocks shouldn’t be very long: 10 to 20 lines at most, preferably shorter. Just a single line is OK if it serves the purpose.
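For a concrete impression of such blocks, here is a small sketch in XQuery. The function local:average-word-length and its sample input are hypothetical and serve only to show the layout: short blocks, each introduced by a comment header and separated by an empty line.

```xquery
xquery version "3.1";

(: Hypothetical helper, invented to demonstrate the block layout. :)
declare function local:average-word-length($text as xs:string) as xs:decimal
{
  (: Initialize: :)
  let $words := tokenize(normalize-space($text), ' ')[. ne '']
  let $count := count($words)

  (: Compute the total length of all words: :)
  let $total-length := sum($words ! string-length(.))

  (: Return the average, or 0 when there are no words: :)
  return
    if ($count eq 0) then
      0
    else
      $total-length div $count
};

local:average-word-length('Layout gives code a rhythm')
```

Even in a function this small, the headers let a reader skip straight to the block they care about.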
Use a maximum line width to prevent your lines from overflowing/wrapping and becoming hard to follow. Older books about software engineering advocate 80 characters. Given our modern big screens and the tendency not to print things, I personally prefer ~150.
Using a maximum line width can even flag incomprehensibility issues! When a piece of code regularly starts overflowing it, that’s usually a clear indicator the code is too complex and you’d better refactor/split it…
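If you want to check this automatically, something like the following rough sketch could do, written in XQuery itself. It assumes an XQuery 3.1 processor that lets you pass in external variables and that supports unparsed-text-lines(). The variable names $uri and $max-width and the limit of 150 are my own assumptions, not a fixed convention.

```xquery
xquery version "3.1";

(: Hypothetical external parameter: the URI of the source file to check. :)
declare variable $uri as xs:string external;

(: Assumed limit; use whatever width you settled on. :)
declare variable $max-width as xs:integer := 150;

(: Report every line that is wider than the limit: :)
for $line at $line-number in unparsed-text-lines($uri)
let $width := string-length($line)
where $width gt $max-width
return 'line ' || $line-number || ': ' || $width || ' characters'
```

Note that a tab counts as a single character here, so the reported widths are only meaningful when you indent with spaces. If one template or function shows up in the output again and again, that is the refactoring candidate.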