Calling fn:transform() once per input file helped to control memory usage, and proved more effective than the several other strategies tried. The external template receives the input filename rather than the parsed tree.
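As an illustration only, the per-file approach might be sketched as follows. This is not FreqX's actual code: the stylesheet name `count-one-file.xsl`, the template name `process-file`, the parameter names `filename` and `input-dir`, and the `results` wrapper element are all hypothetical, and the `?select=*.xml` query on the collection URI assumes Saxon's conventions for directory collections. The sketch assumes that the called template loads the document itself (for example with `doc($filename)`), so that the calling stylesheet never holds the parsed tree.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- Hypothetical parameter: directory holding the corpus files -->
  <xsl:param name="input-dir" as="xs:string" select="'corpus'"/>

  <xsl:template name="xsl:initial-template">
    <results>
      <!-- uri-collection() returns URIs only; nothing is parsed here.
           The '?select=' query is a Saxon convention (assumption). -->
      <xsl:for-each select="uri-collection($input-dir || '?select=*.xml')">
        <!-- One separate transformation per file. Only the filename is
             passed; the called template is assumed to load the file. -->
        <xsl:variable name="result" select="transform(map{
            'stylesheet-location': 'count-one-file.xsl',
            'initial-template': QName('', 'process-file'),
            'template-params': map{ QName('', 'filename'): string(.) },
            'delivery-format': 'raw'
          })"/>
        <!-- Keep only the (presumably small) per-file result -->
        <xsl:sequence select="$result?output"/>
      </xsl:for-each>
    </results>
  </xsl:template>

</xsl:stylesheet>
```

In a design like this, the caller only ever sees filenames and per-file results. Whether the processor actually frees each parsed tree after its call returns depends on the implementation.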
FreqX provides a useful overview of a corpus, and now runs in a reasonable time.