Training with clustering

During training, ABBYY FlexiCapture places each incoming document into the training batch associated with its originating vendor. Documents from the same vendor typically have similar layouts, so you can train a FlexiLayout and use it at the verification stage. If documents originating from the same vendor have widely varying layouts, you should use the clustering feature. When the clustering feature is turned on, ABBYY FlexiCapture for Invoices automatically analyzes documents and puts them into groups (termed “clusters”) based on features that they have in common. A separate FlexiLayout is created for each cluster.

The clustering feature is turned on by default. To disable clustering, complete the following steps:

  1. In the Document Definition editor, click Document Definition → Document Definition Properties....
  2. In the dialog box that opens, click the Document Definition Settings tab.
  3. Click the Edit... button to the right of the Additional fields and features group.
  4. In the Document Definition Features dialog box, clear the Enable clustering option.

The training documents will be placed into the batch associated with their respective vendor. If the clustering feature is turned on and you receive documents with widely varying layouts from the same vendor, documents from this vendor will be clustered inside the training batch used for this vendor. A separate FlexiLayout will be trained for each cluster. Training will be initiated once a cluster contains at least one document. Please note that clustering is a fully automatic process and the actual clusters remain invisible to the user.

If you have no vendor databases but still want to use field training, you can accumulate company information by adding records to your data sets while capturing invoices. For more information, see Looking up vendors and business units in the database.

When training is in progress, a FlexiLayout is created. Please note the following:

  • If the clustering feature is turned off, documents will be placed into their appropriate training batches used for their respective vendors and a FlexiLayout will be created for each vendor.
  • If the clustering feature is turned on, documents will be clustered inside the training batch and a FlexiLayout will be created for each cluster.
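The two cases above can be pictured with a small conceptual sketch. It is not ABBYY FlexiCapture code and does not use any product API: the function name, the (vendor, cluster label) pairs, and the sample vendors are all hypothetical, and the cluster labels merely stand in for whatever the product's automatic clustering assigns internally (which the user cannot see). The sketch only shows how many separate FlexiLayouts would result in each mode.

```python
from collections import defaultdict

def group_for_training(documents, clustering_enabled):
    """Group documents the way the text describes (illustrative model only).

    documents: iterable of (vendor, cluster_label) pairs; cluster_label is a
    hypothetical stand-in for the product's internal, invisible clustering.
    Each returned group corresponds to one trained FlexiLayout.
    """
    groups = defaultdict(list)
    for vendor, cluster_label in documents:
        # Clustering off: one group per vendor.
        # Clustering on: one group per cluster inside the vendor's batch.
        key = (vendor, cluster_label) if clustering_enabled else (vendor, None)
        groups[key].append((vendor, cluster_label))
    return dict(groups)

# Hypothetical sample: "Acme" sends two different layouts, "Globex" only one.
docs = [("Acme", "A"), ("Acme", "B"), ("Acme", "A"), ("Globex", "A")]

print(len(group_for_training(docs, clustering_enabled=False)))  # 2 layouts
print(len(group_for_training(docs, clustering_enabled=True)))   # 3 layouts
```

In this sample, turning clustering on splits the "Acme" batch into two groups, so three layouts are trained instead of two.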

Note: When updating a project created in an earlier version of ABBYY FlexiCapture, you can use your existing FlexiLayouts without any modifications. However, once you use training with clustering, the clustering algorithm will redistribute your documents among the training batches and a new FlexiLayout will be created for each cluster.

If you are not satisfied with the processing results the program delivers on documents from a particular vendor, you can create your own FlexiLayout or export the trained FlexiLayout and modify it in ABBYY FlexiLayout Studio.

You can import a modified or a completely new FlexiLayout into a training batch to be used for one specific vendor (for details, see Training by users with project setup permissions).

If you are also using the clustering feature, please note the following limitations:

  • If you are creating a new FlexiLayout manually, make sure that it covers all the possible document variants originating from the given vendor. You cannot manually create a FlexiLayout for one cluster only.
  • Only a FlexiLayout for the main invoice fields will be exported. No FlexiLayout can be generated and exported for line item fields, as this type of field uses a separate machine learning algorithm, whose results cannot be exported or modified. However, you can still create a FlexiLayout for line item fields manually.
  • If the clustering feature is enabled, only the FlexiLayout trained for the first cluster will be exported.
  • After you import a new or modified FlexiLayout into your training batch:
    • There will be no training while processing documents.
    • Clustering will be disabled for this batch.
    • The imported FlexiLayout will be used for processing all documents from this vendor, regardless of their cluster.
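The effect of importing a FlexiLayout, as listed above, can be summarized in a short conceptual model. This is a sketch under stated assumptions, not the product's implementation: the TrainingBatch class, its attributes, and the layout names are hypothetical and exist only to illustrate that an imported layout overrides the per-cluster layouts and switches off training and clustering for that batch.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class TrainingBatch:
    """Hypothetical model of one vendor's training batch (not a product API)."""
    vendor: str
    cluster_layouts: Dict[str, str] = field(default_factory=dict)
    imported_layout: Optional[str] = None

    @property
    def training_enabled(self) -> bool:
        # After an import, no further training happens for this batch.
        return self.imported_layout is None

    @property
    def clustering_enabled(self) -> bool:
        # After an import, clustering is disabled for this batch.
        return self.imported_layout is None

    def layout_for(self, cluster_label: str) -> Optional[str]:
        # An imported layout is used for every document from the vendor,
        # regardless of the cluster the document would belong to.
        if self.imported_layout is not None:
            return self.imported_layout
        return self.cluster_layouts.get(cluster_label)

batch = TrainingBatch("Acme", {"A": "trained-A", "B": "trained-B"})
before = batch.layout_for("B")       # per-cluster layout is used
batch.imported_layout = "custom"     # simulate importing a FlexiLayout
after = batch.layout_for("B")        # imported layout now wins
```

Because a single imported layout serves the whole batch in this model, it also illustrates why a manually created FlexiLayout has to cover every document variant from the vendor.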

14.01.2021 14:17:20

