Sample 1. Step 7: Creating a document identifier
When processing semi-structured documents in ABBYY FlexiCapture, you will normally want to exclude documents that do not belong to the current type. One way to identify a document is to mark at least one element as required. A required element must be consistently detected on all documents of a given type; otherwise, the program will not be able to match the document to its FlexiLayout.
In this particular case, the document heading (HALLOWEEN REGISTRATION FORM) will make a good identifier element, since it contains distinct text that can be easily read by the OCR engine.
Note. You can specify an identifier element or a set of elements in a predefined compound Header element (not described in this tutorial).
The document heading will be used solely to identify the document as belonging to the given type. In the FlexiLayout, specify the document heading as an element of type Static Text.
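The role of a required element can be pictured with a short Python sketch. This is only an illustration of the idea, not how FlexiCapture is implemented: the `Element` class and the `normalize` and `matches_layout` functions are hypothetical names, and the sketch assumes a plain, error-free substring comparison on the recognized page text.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Element:
    """Hypothetical stand-in for a FlexiLayout search element."""
    name: str
    search_text: str
    required: bool = False


def normalize(text: str) -> str:
    """Uppercase the text and drop all whitespace so spacing does not matter."""
    return "".join(text.upper().split())


def matches_layout(page_text: str, elements: list[Element]) -> bool:
    """Return True only if every required element is found in the page text."""
    haystack = normalize(page_text)
    return all(
        normalize(element.search_text) in haystack
        for element in elements
        if element.required
    )


# Assumed layout: a single required Static Text element for the heading.
layout = [Element("IDHeader", "HALLOWEEN REGISTRATION FORM", required=True)]

is_registration_form = matches_layout("Halloween Registration Form\nName: Ann", layout)
is_invoice_accepted = matches_layout("INVOICE No. 42", layout)
```

In this sketch a page that lacks the heading is rejected, which mirrors the behavior described above: a document on which a required element is not detected is not matched to the FlexiLayout.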
To create an ID element:
- Click the FlexiLayout tab in the program main window.
- Select SearchElements in the FlexiLayout tree.
- Select the Static Text command in FlexiLayout → Add element or in the shortcut menu of the element (New → Static Text).
- In the Name field, type a name for the element, e.g. IDHeader.
- Select Required element on the General tab to make the document heading a required element.
- Click the Static Text tab.
- In the Search text field, type the text to find: HALLOWEEN REGISTRATION FORM.
Judging by the first image in the batch, the document heading appears to be written on a single line. Therefore, you can type the heading without spaces to speed up the search for single-line static text.
- Set the maximum number of errors allowed in the found text (either as a percentage or as an absolute number). In this particular case, we recommend setting Max error percentage to 20, which allows 5 errors in the 25 characters of the document heading (not counting spaces).
Note. The optimal percentage of allowed errors can only be found by trial and error.
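The arithmetic behind the recommended setting can be checked with a small Python sketch. It is an illustration under stated assumptions, not the program's actual matching algorithm: the function names are hypothetical, errors are counted as Levenshtein edit distance, spaces are ignored, and the allowed error count is rounded down.

```python
import math


def strip_spaces(text: str) -> str:
    """Uppercase the text and remove all whitespace."""
    return "".join(text.upper().split())


def allowed_errors(search_text: str, max_error_percent: float) -> int:
    """Error budget for the search text; rounding down is an assumption."""
    return math.floor(len(strip_spaces(search_text)) * max_error_percent / 100)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def is_match(found_text: str, search_text: str, max_error_percent: float) -> bool:
    """True if the recognized text is within the error budget of the search text."""
    distance = edit_distance(strip_spaces(found_text), strip_spaces(search_text))
    return distance <= allowed_errors(search_text, max_error_percent)


heading = "HALLOWEEN REGISTRATION FORM"
budget = allowed_errors(heading, 20)  # 25 characters * 20% = 5

# A recognition result with four misread characters (O read as 0, I read as 1).
noisy_ok = is_match("HALL0WEEN REG1STRATI0N F0RM", heading, 20)
# Unrelated text is far outside the budget.
unrelated_ok = is_match("INVOICE", heading, 20)
```

Raising the percentage in a sketch like this accepts noisier recognition results but also makes false matches more likely, which is why the right value has to be tuned against real images.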
12.04.2024 18:16:02