Requirements

A data worker is a running program that manipulates data. It gets data from one or more sources, optionally transforms it, then saves or sends the result. Data workers are commonly used to build interfaces between applications, as ETL (extract, transform, load) systems do at a larger scale.
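To make the get, transform, save cycle concrete, here is a minimal sketch in Python. It is an illustration under assumptions, not a prescribed design: the function names (`extract`, `transform`, `load`, `run_worker`), the CSV input, the JSON-lines output, and the `name`/`email` fields are all hypothetical choices made for the example.

```python
import csv
import io
import json


def extract(source):
    """Get: read rows from a CSV text stream as dictionaries."""
    return list(csv.DictReader(source))


def transform(rows):
    """Transform: strip surrounding whitespace and lowercase the email."""
    return [
        {"name": row["name"].strip(), "email": row["email"].strip().lower()}
        for row in rows
    ]


def load(rows, sink):
    """Save or send: write each row to a text stream as one JSON line."""
    for row in rows:
        sink.write(json.dumps(row) + "\n")


def run_worker(source, sink):
    """Run one get/transform/save cycle; return the number of rows handled."""
    rows = transform(extract(source))
    load(rows, sink)
    return len(rows)


if __name__ == "__main__":
    # In-memory streams stand in for real files, sockets, or databases.
    source = io.StringIO("name,email\n Ada ,ADA@Example.org\n")
    sink = io.StringIO()
    run_worker(source, sink)
    print(sink.getvalue(), end="")
```

Because the source and the sink are passed in as plain text streams, the same worker could read from a file and write to another file without changing the three steps.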

Specifically, each data worker should be a minimal program that performs one basic operation, and several data workers can be active at the same time.

When the data volume is modest and processing can be deferred, for example to outside active hours, performance only needs attention when execution is perceived as far too slow. In any case, input/output operations are probably the most time-consuming ones.

Data workers should be deployable on different machines running different operating systems, including small machines such as Raspberry Pis.