Workflow configuration has a significant impact on System performance and on the hardware load. The considerations above refer to the load produced by the default workflow, which contains the Pre-processing, Recognition, Verification, and Export stages.
To fit the requirements of specific projects, you can add more processing stages, reorder them, and set up sophisticated routing rules. When doing so, keep the following in mind:
- Avoid too many stages
Each stage increases the resources required (the data to be processed must be downloaded, an executor must perform the processing, and the results must be returned to the server) and, hence, the total project cost.
For example, if you are going to add a new custom stage for an automatic script, consider executing the script through rules or predefined events instead, or combining it with an existing stage.
- The slowest stage limits the performance
Typically, the slowest stages are those that require manual work. It is less obvious that bottlenecks may also appear in unattended processing, caused by non-optimal custom scripts or slow access to non-cached external resources.
To identify the slowest stage, observe the queues at the stages along the workflow using the Administration and Monitoring Console. Consider speeding up that stage or at least parallelizing its processing using the “Documents per Task” option in the stage properties.
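The bottleneck rule can be illustrated with a small calculation. The following is a minimal sketch, not part of the System: the throughput figures are hypothetical numbers chosen for illustration, and `find_bottleneck` is a helper name introduced here. It assumes you have already estimated how many documents per hour each stage can handle, for example from the queue behavior you observe in the console.

```python
# Hypothetical per-stage throughput in documents per hour.
# These numbers are illustrative assumptions, not measurements.
stage_throughput = {
    "Pre-processing": 1200,
    "Recognition": 900,
    "Verification": 150,
    "Export": 2000,
}


def find_bottleneck(throughput):
    """Return (stage name, rate) for the stage with the lowest throughput."""
    stage = min(throughput, key=throughput.get)
    return stage, throughput[stage]


bottleneck, rate = find_bottleneck(stage_throughput)
# The whole workflow cannot sustain more than the slowest stage's rate.
print(f"Slowest stage: {bottleneck} ({rate} documents/hour)")
```

With these assumed figures, the manual Verification stage caps the workflow, so speeding up any other stage would not raise overall throughput.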
- Do not produce tasks that are too small when parallelizing processing at a stage
When you parallelize processing at a stage, avoid splitting the processing into too many pieces, because handling each piece requires additional work from the System. In particular, a huge number of very small automatic tasks may slow down the Processing Server, which distributes the tasks among executors.
If you need to speed up a stage by just a factor of two and a batch typically contains 10 documents, it is sufficient to create two tasks of 5 documents each instead of the default single task for the entire batch. Avoid creating one task per document when you do not actually need it.
Also remember that creating tasks smaller than a batch limits what the executor can do: while a verifier may in some scenarios work with each document independently, automatic document assembly requires all the pages of a batch to be in one task.
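The sizing arithmetic from the example above can be sketched as follows. This is an illustration under stated assumptions, not the System's own logic: `documents_per_task` and `split_batch` are hypothetical helper names, and the sketch assumes that every document takes about the same time to process and that enough executors are available to run the tasks in parallel.

```python
import math


def documents_per_task(batch_size, parallel_tasks):
    """Task size that splits a batch into no more than `parallel_tasks` tasks.

    Hypothetical helper: choosing the largest task size that still reaches
    the desired parallelism keeps the number of tasks (and the per-task
    overhead) as low as possible.
    """
    if batch_size < 1 or parallel_tasks < 1:
        raise ValueError("batch_size and parallel_tasks must be positive")
    # A batch cannot be split into more tasks than it has documents.
    tasks = min(batch_size, parallel_tasks)
    return math.ceil(batch_size / tasks)


def split_batch(documents, per_task):
    """Group the documents of a batch into consecutive tasks of `per_task` items."""
    return [documents[i:i + per_task] for i in range(0, len(documents), per_task)]


batch = [f"doc{n}" for n in range(1, 11)]   # a typical batch of 10 documents
size = documents_per_task(len(batch), 2)    # aim for a twofold speedup
tasks = split_batch(batch, size)
print(size, [len(t) for t in tasks])
```

For a batch of 10 documents and a target of two parallel tasks, the sketch yields a task size of 5 and two tasks, which matches the recommendation above. A task size of 1 would produce ten tasks and ten times the distribution overhead for the same batch.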