Auto-creating a FlexiLayout
ABBYY FlexiLayout Studio lets you create FlexiLayouts by training the program on marked-up fields and static elements. Static elements are elements that appear on all documents of the same type. The most common type of static element is static text (a word, a part of a word, or a line of text); less common examples include separators and barcodes. Fields are blocks which the program has to detect in order to extract text (a word, a part of a word, a line, or a paragraph) or a barcode from them.
FlexiLayouts created this way can be used in ABBYY FlexiCapture to detect fields on document images based on their positions relative to static elements. Using training to create FlexiLayouts can be a lot easier than creating them manually.
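The idea of locating a field by its position relative to a static element can be illustrated with a small sketch. This is not ABBYY's detection algorithm, only a simplified model: the `Rect` type and the `predict_field` function are hypothetical names, and the sketch assumes a pure translation with no scaling or rotation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in page pixels (hypothetical helper type)."""
    left: int
    top: int
    right: int
    bottom: int

    def shifted(self, dx: int, dy: int) -> "Rect":
        return Rect(self.left + dx, self.top + dy, self.right + dx, self.bottom + dy)


def predict_field(static_on_sample: Rect, field_on_sample: Rect,
                  static_on_new_page: Rect) -> Rect:
    """Carry the field's offset from a static element over to a new page.

    Assumption: the field keeps the same offset from the static element
    on every page of this document type.
    """
    dx = static_on_new_page.left - static_on_sample.left
    dy = static_on_new_page.top - static_on_sample.top
    return field_on_sample.shifted(dx, dy)
```

For example, if a static label moves 10 pixels right and 250 pixels down on a new page, the field region marked next to it on the sample page is expected at the same shift.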
ABBYY FlexiLayout Studio also lets you create several layout alternatives. Layout alternatives are versions of a FlexiLayout that have different static element settings but share the same set of fields. They can be trained using the training mode or while editing a FlexiLayout.
When training FlexiLayouts, keep the following considerations in mind:
- The program uses images from the training set for training. You can add images to the training set at any point prior to generating a FlexiLayout. The training set must contain at least 3 pages.
- You can only edit the static elements of a layout alternative. All alternatives of a FlexiLayout have the same fields.
- If the page contains unmarked fields or static elements when you edit a layout alternative, the FlexiLayout will be automatically applied to the page. Any new field blocks detected in this way will be added to the FlexiLayout, while existing field regions will not be changed.
- If you add blocks to an existing FlexiLayout, these blocks will not appear in the training mode. You can create these blocks using the Initialize Fields By Blocks command on the shortcut menu of the FlexiLayout.
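The relationship described above (one shared set of fields, several alternatives that differ only in static elements, and a minimum training set size) can be modeled roughly as follows. The class and function names are hypothetical and do not correspond to any ABBYY API; only the 3-page minimum comes from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List

MIN_TRAINING_PAGES = 3  # from the text: the training set must contain at least 3 pages


@dataclass
class LayoutAlternative:
    """One alternative: only its static elements are editable."""
    name: str
    static_elements: List[str] = field(default_factory=list)


@dataclass
class FlexiLayoutModel:
    """Hypothetical model: fields are stored once and shared by all alternatives."""
    fields: List[str]
    alternatives: Dict[str, LayoutAlternative] = field(default_factory=dict)

    def add_alternative(self, name: str, static_elements: List[str]) -> LayoutAlternative:
        alt = LayoutAlternative(name, list(static_elements))
        self.alternatives[name] = alt
        return alt

    def fields_of(self, alternative_name: str) -> List[str]:
        # Raises KeyError for an unknown alternative; the field list itself is shared.
        self.alternatives[alternative_name]
        return self.fields


def can_generate(training_pages: List[str]) -> bool:
    """True when the training set is large enough to generate a FlexiLayout."""
    return len(training_pages) >= MIN_TRAINING_PAGES
```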
Creating FlexiLayouts in the training mode
To create a FlexiLayout using training, complete the following steps:
- Create a new project or open an existing project that already contains a FlexiLayout.
- Switch to the training mode by clicking the button or by selecting Training Mode from the Training menu.
- In the Batch pane:
- Pre-recognize the images (select all images, right-click them and click Pre-Recognize on the shortcut menu or open the Batch menu and click Pre-Recognize).
- Double-click any image to open it. Study the recognition results and decide which fields and static elements you want to use for training the FlexiLayout.
- In the Training window:
- Select a layout alternative from the Reference Alternative drop-down list or create a new one.
- If you want to generate reference elements automatically, leave the Auto references option enabled. If this option is disabled, you will have to specify reference elements manually.
- When you disable the Auto references option, the automatically generated reference elements are added to the Static elements list, and their positions are marked on pages. You can then remove unnecessary elements or add new ones. When you enable the Auto references option again, manually added elements are deleted.
- The contents of the Training window depend on whether the Auto references option is enabled or disabled. If it is disabled, the Training window will contain two areas: the Fields area with a list of fields and the Static elements area with a list of static elements. If the Auto references option is enabled, only the Fields area will be present in the Training window.
- Mark the areas of fields and static elements on the page in the Image window:
- Use the Create Block tool to mark the areas of blocks on the image. The names of the blocks you have marked will appear in the Fields list.
- Use the Create Element tool to mark the areas of static elements. The names of marked elements will appear in the Static elements list.
- The program uses recognized text near marked fields and elements to generate their names. These names will appear in bold font when the area of the field or element has been detected on the page.
- The Create Block and Create Element tools can be used when viewing a Reference Layout or Difference Layout, but the Create Element tool is only available when creating reference elements manually.
- You can also use commands on the shortcut menu to mark fields and static elements on the image. Use the Draw Location tool to mark a field or element, right-click the marked area and click the desired command on the shortcut menu.
- After you have finished marking fields and static elements, add the pages to the training set by selecting them in the Used For Training column of the Batch window.
- Click Predict Draft Layout on the shortcut menu of images.
- Check whether fields and elements were detected correctly on all images in the batch and correct the markup where necessary. Add all pages where the program failed to detect fields or elements to the training set.
Elements that are not present on the page can be marked as such. To do this, click Not Present on the shortcut menu of an element, click the button or click the middle mouse button on the area of the element on the image. The names of the elements marked as not present will be displayed in strike-through font.
The status of pages is indicated by icons in the Training Layout State column:
- No elements marked
Indicates that no fields or reference elements have been marked on the page.
- Has unmarked elements
Some fields or reference elements have been neither marked on the page nor marked as not present.
- Has unmarked required element
A required element has not been marked on the page.
- All elements marked
All elements have been marked on the page or marked as not present.
- No reference class set
The reference class of the training page has not been specified.
- Click the button to generate the layout alternative. If the project uses more than one reference alternative, specify which layout alternatives you want to update.
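The page states shown in the Training Layout State column can be summarized as a small decision function. This is a sketch for orientation only: the order in which the checks are applied and the rule that a required element counts as unmarked only while it is neither marked nor marked as not present are assumptions, and the `Mark`, `PageState`, and `page_state` names are hypothetical.

```python
from enum import Enum
from typing import Dict, Optional, Set


class Mark(Enum):
    UNMARKED = "unmarked"
    MARKED = "marked"
    NOT_PRESENT = "not present"


class PageState(Enum):
    NO_REFERENCE_CLASS = "No reference class set"
    NO_ELEMENTS_MARKED = "No elements marked"
    UNMARKED_REQUIRED = "Has unmarked required element"
    HAS_UNMARKED = "Has unmarked elements"
    ALL_MARKED = "All elements marked"


def page_state(marks: Dict[str, Mark], required: Set[str],
               reference_class: Optional[str]) -> PageState:
    """Derive a page state from per-element marks (check order is an assumption)."""
    if reference_class is None:
        return PageState.NO_REFERENCE_CLASS
    if all(m is Mark.UNMARKED for m in marks.values()):
        return PageState.NO_ELEMENTS_MARKED
    if any(marks.get(name, Mark.UNMARKED) is Mark.UNMARKED for name in required):
        return PageState.UNMARKED_REQUIRED
    if any(m is Mark.UNMARKED for m in marks.values()):
        return PageState.HAS_UNMARKED
    return PageState.ALL_MARKED
```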
Training FlexiLayouts during debugging
ABBYY FlexiCapture 12 lets you train FlexiLayouts while debugging them. To do so, use the Train Alternative command on the context menu of a layout alternative. The reference layout will be used to generate the new layout alternative, and reference elements will be created automatically. The Train and Test Alternative command trains a new layout alternative and applies it to all pages of its class.
Training mode options
The Training tab of the Options... dialog contains training settings.
Settings in the Navigation group determine how the program moves between pages when you click the page navigation buttons:
- any unmarked element
Cycles through pages that contain any unmarked element. This is the default setting.
- unmarked selected element
Cycles through pages on which the currently selected element is unmarked.
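The difference between the two navigation settings can be sketched as a page filter. The `next_page` function and its arguments are hypothetical; the sketch assumes that navigation wraps around at the end of the batch, which the text suggests with "cycles" but does not spell out.

```python
from typing import Dict, List, Optional, Set


def next_page(pages: List[str], unmarked: Dict[str, Set[str]], current: str,
              selected: Optional[str] = None) -> Optional[str]:
    """Return the next qualifying page after `current`, wrapping around.

    selected=None models "any unmarked element";
    a name models "unmarked selected element".
    """
    def qualifies(page: str) -> bool:
        names = unmarked.get(page, set())
        return bool(names) if selected is None else selected in names

    start = pages.index(current)
    for step in range(1, len(pages) + 1):
        candidate = pages[(start + step) % len(pages)]
        if qualifies(candidate):
            return candidate
    return None  # no page in the batch qualifies
```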
The Draft layout prediction group contains settings that determine how layout alternatives are applied during training:
- Predict layout automatically on navigation (also available as a button)
Automatically applies the draft layout when you switch to a different page.
- Replace existing regions of fields on batch prediction
Replaces existing blocks of fields and elements with blocks detected when the layout alternative was applied.
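One way to picture the effect of the Replace existing regions of fields on batch prediction option is as a merge of two sets of regions. The function below is a hypothetical sketch, not the program's implementation; it assumes that with the option disabled, regions you have already marked are kept and only missing ones are filled in from the prediction.

```python
from typing import Dict, Tuple

Region = Tuple[int, int, int, int]  # left, top, right, bottom (assumed representation)


def merge_prediction(existing: Dict[str, Region], predicted: Dict[str, Region],
                     replace_existing: bool) -> Dict[str, Region]:
    """Combine marked regions with regions found by applying the draft layout."""
    merged = dict(existing)
    for name, region in predicted.items():
        # New names are always added; existing ones change only if the option is on.
        if replace_existing or name not in merged:
            merged[name] = region
    return merged
```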
The Template generation group contains one option:
- Create Identifiers on Generation automatically creates identifiers for the layout alternative being trained (see the Identifiers section of this article for details).
Identifiers
Identifiers are distinctive features of a document that can be used to classify it as belonging to a specific type. Examples of identifiers include distinctive words and phrases, specific barcode values, and separators.
When you train a layout alternative, the program will put together a set of words that occur frequently in documents belonging to the layout alternative and do not occur in other types of documents.
Identifiers of layout alternatives are stored in the Identifiers group, which is marked as required. This group includes a list of elements of the static text type, each of which contains a word that only occurs in documents belonging to the layout alternative. Relationships between groups of identifiers cannot be created.
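The selection rule described above (words that occur frequently in documents of one layout alternative and do not occur in other document types) can be approximated as follows. This is an illustrative sketch rather than the program's actual procedure: the `candidate_identifiers` function, the representation of a document as a set of words, and the `min_share` threshold of 0.8 are all assumptions.

```python
from collections import Counter
from typing import Dict, List, Set


def candidate_identifiers(docs_by_alternative: Dict[str, List[Set[str]]],
                          target: str, min_share: float = 0.8) -> List[str]:
    """Words found in at least `min_share` of the target's documents
    and in no document of any other alternative."""
    target_docs = docs_by_alternative[target]
    if not target_docs:
        return []
    # Each document is a set, so this counts documents, not word occurrences.
    counts = Counter(word for doc in target_docs for word in doc)
    seen_elsewhere: Set[str] = set()
    for name, docs in docs_by_alternative.items():
        if name != target:
            for doc in docs:
                seen_elsewhere |= doc
    return sorted(
        word for word, n in counts.items()
        if n / len(target_docs) >= min_share and word not in seen_elsewhere
    )
```

In this model, a word such as "invoice" that appears in every document type is rejected, while a vendor name that appears only in one alternative's documents is kept.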
|Enables/disables the training mode.|
|Creates a field and marks its area.|
|Creates an element and marks its area.|
|Draws an area.|
|Edits an existing area.|
|Deletes an area.|
|Creates layout alternatives using pages from the training set.|
|Automatically generates fields based on existing areas in the FlexiLayout.|
|Marks the selected element as not present on the page.|
|Marks an element as required.|