Export the data from the app's relational database and aggregate it into hierarchical XML records

Timing - ​The archiving scenario: at the point of time of an app decommissioning, we could likely still find people in the company who know the data. Maybe that knowledge will disappear later on, so doing the aggregation as early as possible is an advantage.

Complexity -​ It's a low-level approach, so a reasonable level of understanding the database schema is necessary. It depends on the application, of course. Sometimes it's also possible to work with "aggregation tables",​ which at least partly aggregates the data in the relational database, which makes our job easier.

Example 3. Travel reports:

In one project, we had to aggregate and archive travel records - similar to the example I used earlier in this paper. It has a header (date, employee...) and transaction items in the body.

It's quite simple on the database schema level:

  1. Main travel table - one record per trip.

  2. Transaction table - one record per transaction.

  3. Attachment table - one record with the attachment metadata and a column with content, for ex. PDF, stored as BLOB.

  4. Approx. a dozen other tables were "​register tables" . For example, based on the employee ID we could look up the employee's name from another table.

All in all, we used about 15 tables, and aggregated the records with XQuery that was not so complicated, less than 100 lines and the same patterns were repeating. More about this later.