Training with clustering

During training, ABBYY FlexiCapture places each incoming document into the training batch associated with its originating vendor. Documents from the same vendor typically have similar layouts, which means that you can train a FlexiLayout and use it at the verification stage. If documents from the same vendor have widely varying layouts, you should use the clustering feature. When clustering is turned on, ABBYY FlexiCapture for Invoices automatically analyzes documents and puts them into groups (termed “clusters”) based on features that they have in common. A separate FlexiLayout is created for each cluster.

The clustering feature is turned on by default. To disable clustering, complete the following steps:

  1. In the Document Definition editor, click Document Definition → Document Definition Properties....
  2. In the dialog box that opens, click the Document Definition Settings tab.
  3. Click the Edit... button to the right of the Additional Fields and Features group.
  4. In the Document Definition Features dialog box, clear the Enable clustering option.

Training documents are placed into the batch associated with their respective vendor. If the clustering feature is turned on and you receive documents with widely varying layouts from the same vendor, these documents are clustered inside that vendor's training batch, and a separate FlexiLayout is trained for each cluster. Training is initiated once a cluster contains at least one document. Please note that clustering is a fully automatic process and the actual clusters remain invisible to the user.

If you have no vendor databases but still want to use field training, you can accumulate company information by adding records to your data sets while capturing invoices. For more information, see Looking up vendors and business units in the database.

When training is in progress, a FlexiLayout is created. Please note the following:

  • If the clustering feature is turned off, documents will be placed into the training batches for their respective vendors and a FlexiLayout will be created for each vendor.
  • If the clustering feature is turned on, documents will be clustered inside the training batch and a FlexiLayout will be created for each cluster.
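
The difference between the two modes can be pictured with a small conceptual sketch. This is not ABBYY FlexiCapture code or its API: the Document class, the layout_signature field, and the plan_layouts function are hypothetical names invented for illustration, and an exact match on a text label stands in for the automatic feature-based clustering, whose actual algorithm is not described here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    name: str
    vendor: str
    # Hypothetical stand-in for the layout features the product analyzes.
    layout_signature: str


def plan_layouts(documents, clustering_enabled=True):
    """Group document names by the unit that would get its own FlexiLayout."""
    groups = defaultdict(list)
    for doc in documents:
        if clustering_enabled:
            # One group per cluster inside the vendor's training batch.
            key = (doc.vendor, doc.layout_signature)
        else:
            # One group per vendor training batch.
            key = (doc.vendor, None)
        groups[key].append(doc.name)
    return dict(groups)


docs = [
    Document("inv-001", "Acme", "two-column"),
    Document("inv-002", "Acme", "single-column"),
    Document("inv-003", "Acme", "two-column"),
    Document("inv-004", "Globex", "single-column"),
]

with_clustering = plan_layouts(docs, clustering_enabled=True)
without_clustering = plan_layouts(docs, clustering_enabled=False)
print(len(with_clustering), len(without_clustering))
```

In this sketch, the two differing Acme layouts end up in separate groups when clustering is on, and in a single vendor-wide group when it is off.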

Note: When updating a project created in an earlier version of ABBYY FlexiCapture, you can use your existing FlexiLayouts without any modifications. However, once you use training with clustering, the clustering algorithm will redistribute your documents among the training batches and a new FlexiLayout will be created for each cluster.

If you are not satisfied with the processing results the program delivers on documents from a particular vendor, you can create your own FlexiLayout or export the trained FlexiLayout and modify it in ABBYY FlexiLayout Studio.

You can import a modified or a completely new FlexiLayout into a training batch to be used for one specific vendor (for details, see Training by users with project setup permissions).

12/1/2020 7:03:59 AM

