How FlexiLayout matching results are merged
This article describes the order in which FlexiLayouts are matched with documents and how field regions are selected.
ABBYY FlexiCapture for Invoices can use several types of FlexiLayout when matching the Document Definition:
1. a generic FlexiLayout, which is used to process all invoices and which may itself include two types of FlexiLayout:
a. a main FlexiLayout, which is used to detect the standard pre-defined invoice fields (this FlexiLayout is provided together with ABBYY FlexiCapture for Invoices and cannot be modified);
b. an additional FlexiLayout, which is used to detect user-defined fields and standard invoice fields whose detection logic has been changed (this FlexiLayout is created by the user).
2. FlexiLayout variants, which are used to process invoices from specific vendors. FlexiLayout variants can be used to detect any fields defined in their respective Document Definition and can be created by the user or by ABBYY FlexiCapture for Invoices when it is trained on invoices from specific vendors.
After matching the Document Definition with an invoice, the program obtains a set of field regions, which come from the different FlexiLayouts included in the Document Definition.
1. Matching a generic FlexiLayout
A. First, the program applies the main FlexiLayout.
The invoice image is pre-recognized using the pre-recognition parameters specified in the Document Definition.
- The languages are specified under Countries and Languages on the Document Definition Settings tab of the Document Definition Properties dialog box.
- The pre-recognition mode (Fast/Balanced/Normal/Accurate) is specified on the FlexiLayout tab of the Document Definition Properties dialog box.
Results obtained by applying the main FlexiLayout
As a result of applying the main FlexiLayout, the program obtains:
- the ID of the vendor and the ID of the business unit, if detected
- the regions of the standard fields. (See Captured fields for details.)
B. Next, the program applies the additional FlexiLayout, if it is available in the generic FlexiLayout.
The invoice image is pre-recognized using the pre-recognition parameters (i.e. the languages and the pre-recognition mode) specified in the additional FlexiLayout. We recommend specifying the same pre-recognition parameters in the additional FlexiLayout as in the Document Definition. In this case, the program reuses the pre-recognition result obtained in step 1A instead of pre-recognizing the invoice a second time.
If your invoices contain some unusual fields, you may want to specify pre-recognition parameters in the additional FlexiLayout that differ from those of the Document Definition. Note, however, that this slows down processing, because the invoice then has to be pre-recognized a second time.
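The reuse of the pre-recognition result can be pictured as a cache keyed by the pre-recognition parameters. The following Python sketch is only an illustration of that idea: the PreRecognitionParams and PreRecognizer classes, the image ID, and the language and mode values are hypothetical and are not part of the ABBYY FlexiCapture API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreRecognitionParams:
    """Hypothetical container for the two pre-recognition parameters."""
    languages: frozenset  # e.g. frozenset({"English"})
    mode: str             # e.g. "Balanced"


class PreRecognizer:
    """Hypothetical pre-recognizer that caches results per parameter set."""

    def __init__(self):
        self._cache = {}
        self.runs = 0  # how many times the image was actually pre-recognized

    def pre_recognize(self, image_id, params):
        key = (image_id, params)
        if key not in self._cache:
            self.runs += 1
            # Placeholder for the real (slow) pre-recognition step.
            self._cache[key] = f"text layer of {image_id} (run {self.runs})"
        return self._cache[key]


doc_def_params = PreRecognitionParams(frozenset({"English"}), "Balanced")
same_params = PreRecognitionParams(frozenset({"English"}), "Balanced")
other_params = PreRecognitionParams(frozenset({"English", "German"}), "Accurate")

recognizer = PreRecognizer()
recognizer.pre_recognize("invoice-001", doc_def_params)  # step 1A
recognizer.pre_recognize("invoice-001", same_params)     # step 1B, result reused
runs_with_same_params = recognizer.runs

recognizer.pre_recognize("invoice-001", other_params)    # step 1B, different parameters
runs_with_other_params = recognizer.runs
```

With identical parameters the image is pre-recognized once. With different parameters a second pass is needed, which is where the slowdown comes from.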
Results obtained by applying the additional FlexiLayout
As a result of applying the additional FlexiLayout, the program obtains the regions of all the fields defined in the additional FlexiLayout. These may be either standard invoice fields whose detection logic had to be changed, or some additional fields not defined in the main FlexiLayout.
C. Next, the program generates the aggregate result of applying the generic FlexiLayout. If the Document Definition includes an additional FlexiLayout, at this stage the program merges the results obtained by applying the main FlexiLayout and the results obtained by applying the additional FlexiLayout.
The fields are identified by their names. The result is a collection of unique fields derived from the main and the additional FlexiLayouts. If there are fields with identical names in the main and in the additional FlexiLayout, the program will use the field region obtained by applying the additional FlexiLayout.
This approach allows you to define new fields or change the logic of capturing any of the standard invoice fields.
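The name-based merge in step 1C behaves like a dictionary update in which the additional FlexiLayout takes precedence. Below is a minimal Python sketch of that rule. The merge_generic function and the region coordinates are hypothetical and serve only as an illustration. The program's internal data structures are not documented here.

```python
def merge_generic(main_regions, additional_regions):
    """Merge field regions by field name; the additional FlexiLayout wins on a clash."""
    merged = dict(main_regions)        # start with the standard fields
    merged.update(additional_regions)  # same-named fields are overridden, new ones are added
    return merged


# Regions are written as (left, top, right, bottom); the values are made up.
main_regions = {
    "InvoiceDate": (100, 40, 220, 60),
    "InvoiceNumber": (300, 40, 420, 60),
}
additional_regions = {
    "InvoiceDate": (98, 38, 230, 62),      # standard field with changed detection logic
    "CustomFieldA": (100, 500, 260, 520),  # user-defined field
}

generic = merge_generic(main_regions, additional_regions)
```

In this sketch, InvoiceDate ends up with the region from the additional FlexiLayout, InvoiceNumber keeps the region from the main FlexiLayout, and CustomFieldA is added as a new field.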
2. Applying FlexiLayout variants
If, by applying the generic FlexiLayout in step 1, the program managed to detect the vendor and there is a FlexiLayout variant for this vendor, this FlexiLayout variant is applied at this stage.
The pre-recognition parameters from the Document Definition are used. These are the same pre-recognition parameters that were used when matching the main FlexiLayout included in the generic FlexiLayout. Therefore, no additional pre-recognition is required.
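The condition for applying a FlexiLayout variant can be sketched as a simple lookup by vendor ID. The select_variant function, the vendor IDs, and the variant names below are hypothetical.

```python
from typing import Dict, Optional


def select_variant(vendor_id: Optional[str], variants: Dict[str, str]) -> Optional[str]:
    """Return the variant for the detected vendor, or None if step 2 is skipped."""
    if vendor_id is None:
        # The generic FlexiLayout did not detect a vendor.
        return None
    # None is returned when no variant exists for this vendor.
    return variants.get(vendor_id)


variants = {"vendor-17": "Variant for vendor 17"}

matched = select_variant("vendor-17", variants)     # vendor detected, variant exists
no_vendor = select_variant(None, variants)          # vendor not detected
no_variant = select_variant("vendor-99", variants)  # vendor detected, no variant
```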
Results obtained by matching the FlexiLayout variant
After matching the FlexiLayout variant with the invoice, the program obtains the regions of all the fields defined in the FlexiLayout variant for this specific vendor. Note that if a FlexiLayout variant is obtained through training the program on a variety of invoices, it will include all the fields defined in the Document Definition. If a FlexiLayout variant is created manually, it should contain only those fields whose detection requires logic specific to the given vendor.
3. Merging the results
At this step, the results obtained by matching the generic FlexiLayout are merged with the results obtained by matching the FlexiLayout variant.
If the FlexiLayout variant was obtained through training the program on a variety of invoices, it includes all the fields defined in the Document Definition. The Document Definition stores information about the fields whose regions had to be changed by the user during training, i.e. the fields that were detected poorly by the generic FlexiLayout. The program will rely on this information when deciding which field regions should be taken from the generic FlexiLayout and which regions should be taken from the FlexiLayout variant trained on invoices from this particular vendor.
Note: The regions of the Amount fields are selected differently. When applying the generic FlexiLayout, the program records the level of confidence with which the regions of the Amount fields have been detected. If the regions are detected unreliably, the Total field will require verification. If there is a FlexiLayout variant trained on invoices from the given vendor and the generic FlexiLayout either fails to detect the Amount fields with a sufficient level of confidence or does not detect them at all, the regions of the Amount fields detected by the FlexiLayout variant will be used. You can also configure the program to always use the regions of the Amount fields detected by the FlexiLayout variant. To do this, set the value of the [HKEY_CURRENT_USER\Software\ABBYY\FlexiCapture\12.0\DAForms\]"UseTrainedInvoiceAmounts" registry key to true (the key is set to false by default).
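The selection rule for the Amount fields described in the note can be sketched as follows. This is an illustration under stated assumptions, not the program's actual implementation: the select_amount_region function is hypothetical, the confidence level is simplified to a yes/no flag, and the sketch assumes that the generic result is kept when the FlexiLayout variant has no region for the field.

```python
from typing import Optional, Tuple

Region = Tuple[int, int, int, int]  # (left, top, right, bottom), made-up representation


def select_amount_region(
    generic_region: Optional[Region],
    generic_is_confident: bool,
    variant_region: Optional[Region],
    use_trained_invoice_amounts: bool = False,  # mirrors the UseTrainedInvoiceAmounts setting
) -> Optional[Region]:
    """Pick the region of an Amount field (hypothetical helper)."""
    # Assumption: without a variant region, the generic result is the only candidate.
    if variant_region is None:
        return generic_region
    # With the setting enabled, the trained variant is always preferred.
    if use_trained_invoice_amounts:
        return variant_region
    # Otherwise the variant is used only if the generic result is missing or unreliable.
    if generic_region is None or not generic_is_confident:
        return variant_region
    return generic_region


generic = (400, 700, 480, 720)
variant = (402, 698, 482, 722)

confident_case = select_amount_region(generic, True, variant)
unreliable_case = select_amount_region(generic, False, variant)
missing_case = select_amount_region(None, False, variant)
forced_case = select_amount_region(generic, True, variant, use_trained_invoice_amounts=True)
```

Only in the first case is the generic region kept. In the other three cases the region from the trained FlexiLayout variant is returned.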
If the FlexiLayout variant has been created manually, the program will use the regions of all the fields included in the FlexiLayout variant. The regions of the other fields will be taken from the results obtained by matching the generic FlexiLayout.
Information about which FlexiLayout was used to detect the region of a field is recorded in the recognition log.
Let us consider some of the fields defined in the Document Definition: two standard invoice fields, InvoiceDate and InvoiceNumber, and two user-defined fields, CustomFieldA and CustomFieldB.
- InvoiceDate and InvoiceNumber are standard fields and the detection algorithm for these fields is specified in the main FlexiLayout included in the generic FlexiLayout.
- CustomFieldA and CustomFieldB are user-defined fields and the detection algorithms for these fields are specified in the additional FlexiLayout included in the generic FlexiLayout.
Suppose the Operator is not satisfied with the quality of detection of the InvoiceDate and CustomFieldB fields on invoices from a specific vendor and trains the program on invoices from this vendor.
The field regions for invoices from this vendor will be merged as follows:
- InvoiceNumber - This region will come from the results obtained with the main FlexiLayout.
- CustomFieldA - This region will come from the results obtained with the additional FlexiLayout.
- InvoiceDate and CustomFieldB - These regions will come from the results obtained with the FlexiLayout variant generated by training the program on invoices from this vendor.
In the case of invoices from other vendors, the field regions will be merged as follows:
- InvoiceDate and InvoiceNumber - These regions will come from the results obtained with the main FlexiLayout.
- CustomFieldA and CustomFieldB - These regions will come from the results obtained with the additional FlexiLayout.
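The example above can be reproduced with a short Python sketch that tracks which FlexiLayout each region comes from. The merge_results function, the corrected_fields parameter (the fields whose regions the user changed during training), and the region values are hypothetical. The special handling of the Amount fields is not covered.

```python
def merge_results(main, additional, variant=None, corrected_fields=frozenset()):
    """Return {field name: (region, source)} for one invoice (hypothetical helper)."""
    # Generic FlexiLayout: the additional FlexiLayout overrides the main one by name.
    merged = {name: (region, "main") for name, region in main.items()}
    merged.update({name: (region, "additional") for name, region in additional.items()})
    # Trained variant: only fields corrected during training are taken from it.
    if variant:
        for name in corrected_fields:
            if name in variant:
                merged[name] = (variant[name], "variant")
    return merged


# Regions are written as (left, top, right, bottom); the values are made up.
main = {"InvoiceDate": (100, 40, 220, 60), "InvoiceNumber": (300, 40, 420, 60)}
additional = {"CustomFieldA": (100, 500, 260, 520), "CustomFieldB": (300, 500, 460, 520)}
# A trained variant includes all the fields defined in the Document Definition.
variant = {
    "InvoiceDate": (110, 80, 230, 100),
    "InvoiceNumber": (310, 80, 430, 100),
    "CustomFieldA": (110, 540, 270, 560),
    "CustomFieldB": (310, 540, 470, 560),
}

trained_vendor = merge_results(
    main, additional, variant, corrected_fields={"InvoiceDate", "CustomFieldB"}
)
other_vendor = merge_results(main, additional)

sources_trained_vendor = {name: source for name, (_, source) in trained_vendor.items()}
sources_other_vendor = {name: source for name, (_, source) in other_vendor.items()}
```

For the trained vendor, InvoiceDate and CustomFieldB are taken from the variant, while InvoiceNumber and CustomFieldA keep their generic sources. For other vendors, all four regions come from the generic FlexiLayout. A manually created variant could be modeled in the same sketch by passing every field name in the variant as corrected_fields.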