Training while processing documents
ABBYY FlexiCapture for Invoices lets you improve recognition quality while processing documents. If the program fails to detect the correct location of a field on a document image, you can specify its correct location and the program will use it when recognizing other documents.
Training is only available if ABBYY FlexiCapture can reliably identify the company by finding the corresponding record in a database. If you have no databases but still want to use field training, you can accumulate company information by adding records to your data sets while capturing documents. For more information, see Looking up vendors and business units in the database.
This article explains how to train ABBYY FlexiCapture for Invoices using the locally installed version of the Verification Station, and covers some training-related issues that operators need to know.
To train the program while processing documents, complete the following steps:
- Collect a batch of documents (e.g. invoices processed within the past month) and start feeding them to the program. See How to capture invoices.
- Once the documents are fed to the program, they are recognized automatically and the data is checked against the validation rules. Automatic recognition takes place only if the Recognize added images automatically option is enabled on the Document Processing tab of the Options dialog box (to open this dialog box, click Tools → Options...).
- If the status of a recognized document is other than Valid or if you have other reasons to believe that the program failed to detect some of the fields, open the document in the document editor.
- Review the document form. The Vendor/Issuer group of fields must be filled out correctly.
- Training will only be successful if the regions of all the fields are identified correctly, so you need to make sure that the regions match the actual locations of their respective fields on the image. For more information on how to mark up line items on a document, see Training line items.
To do this, in the image window of the document editor, adjust the regions or draw regions for those fields which the program failed to detect. For details, see How to change the region of a field.
After that, the program will analyze the document. If the region markup has been modified and training for documents from this company is not prohibited, the document will be added to the training batch.
Note: The program will be trained on all the document's fields, not just on those whose regions you have drawn or adjusted.
- Open the next document and repeat steps 4 and 5 (reviewing the document form and correcting the field regions).
- To initiate the training process, a training batch must contain at least one document. If clustering is used, a separate FlexiLayout will be created for each cluster; otherwise, a FlexiLayout will be created for each company (see Training with clustering for more information).
- The program will test the trained FlexiLayout variant by applying it to all the documents in the training batch and comparing the results with the adjusted markup obtained in step 5. If the program determines that the trained FlexiLayout variant delivers better results than its earlier version, the trained variant will be used the next time you recognize documents belonging to this document variant.
If the program determines that the trained FlexiLayout variant delivers worse results than its earlier version, you will need to continue training it on documents from the given company (steps 4 and 5). The training process completes when the trained FlexiLayout variant can correctly identify all the field regions.
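The decisions described in the steps above can be illustrated with a short sketch. This is not ABBYY's implementation and does not use any FlexiCapture API: the names Region, Document, should_join_training_batch, accept_trained_layout and training_complete are hypothetical, a "layout" is modelled as a plain function that returns field regions for a document, and a region counts as correct only if it exactly equals the operator's markup. The sketch also assumes that a trained variant that performs exactly as well as its earlier version is not accepted, which the text above does not specify.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass(frozen=True)
class Region:
    """Hypothetical rectangular field region on a document image."""
    left: int
    top: int
    right: int
    bottom: int


@dataclass
class Document:
    """Hypothetical document with operator-corrected field regions."""
    company_id: str
    corrected: Dict[str, Region]   # field name -> region verified by the operator
    markup_modified: bool = True   # did the operator adjust or draw any region?


# A "layout" is modelled here as a function: document -> predicted field regions.
Layout = Callable[[Document], Dict[str, Region]]


def should_join_training_batch(doc: Document, training_prohibited: Set[str]) -> bool:
    """A document joins the training batch only if its markup was modified
    and training is not prohibited for its company."""
    return doc.markup_modified and doc.company_id not in training_prohibited


def count_correct(layout: Layout, batch: List[Document]) -> int:
    """Count fields whose predicted region equals the corrected region
    (exact equality is an assumption made for this sketch)."""
    correct = 0
    for doc in batch:
        predicted = layout(doc)
        correct += sum(predicted.get(name) == region
                       for name, region in doc.corrected.items())
    return correct


def accept_trained_layout(trained: Layout, earlier: Layout,
                          batch: List[Document]) -> bool:
    """Accept the trained variant only if it beats the earlier version
    on the training batch (ties are rejected by assumption)."""
    return count_correct(trained, batch) > count_correct(earlier, batch)


def training_complete(trained: Layout, batch: List[Document]) -> bool:
    """Training is complete when every corrected field region is reproduced."""
    total_fields = sum(len(doc.corrected) for doc in batch)
    return count_correct(trained, batch) == total_fields
```

For example, if the earlier version finds only one of two corrected fields on a document and the trained variant finds both, accept_trained_layout returns True and training_complete returns True for the trained variant.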