What types of documents can be processed with ABBYY FlexiCapture
ABBYY FlexiCapture is a software solution for single-flow data capture from documents of various types.
Various documents can be processed in a single flow. You can also set up the program to process documents of mixed type, e.g. documents that contain both structured and semi-structured sections. The type of documents affects system configuration only, namely the method of creating Document Definitions. The nature of the Operator's work is not affected by the type of documents processed.
ABBYY FlexiCapture can be configured for automated input of the following documents, including in a single flow:
Structured fixed forms;
Structured forms are documents containing a set of marked information fields whose formatting, number and layout do not change from one document instance to the next. Such documents are called fixed forms. For example, most questionnaires and application forms are fixed forms. Such forms are usually distributed as blank forms and filled out by hand.
To identify a fixed form in a document flow and to extract data from it, you need to create a single fixed layout that will tell the program the location of the fields containing data to be extracted.
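The idea of a fixed layout can be illustrated with a minimal sketch. This is not ABBYY FlexiCapture code or its API: the page is simplified to a list of text lines, and the names FieldRegion, FIXED_LAYOUT and extract_fixed are hypothetical, introduced only for illustration. The sketch shows that, because field positions never change, one layout is enough to read every instance of the form.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldRegion:
    """Hypothetical field position: a row and a column range on the page."""
    row: int
    col_start: int
    col_end: int


# One layout describes where every field sits on every instance of the form.
FIXED_LAYOUT = {
    "name": FieldRegion(row=0, col_start=6, col_end=26),
    "date": FieldRegion(row=1, col_start=6, col_end=16),
}


def extract_fixed(page, layout):
    """Read each field by slicing the page at its fixed position."""
    result = {}
    for field_name, region in layout.items():
        line = page[region.row] if region.row < len(page) else ""
        result[field_name] = line[region.col_start:region.col_end].strip()
    return result


# A simplified "scanned page": each string is one line of recognized text.
page = [
    "Name: Jane Smith",
    "Date: 2024-01-15",
]

fields = extract_fixed(page, FIXED_LAYOUT)
print(fields)
```

If the field positions shift, for example on a distorted fax, the same slices return the wrong text, which is why a fixed layout relies on the form staying unchanged.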
Fixed forms created to meet the requirements of automatic input can be processed most effectively. Such forms are called machine-readable forms. To learn more about the requirements for such forms and the methods of creating them, see the section Creating a machine-readable forms.
The program includes a convenient tool for designing machine-readable forms, ABBYY FormDesigner, which is supplied together with ABBYY FlexiCapture. You can read about designing forms with ABBYY FormDesigner in the User Guide and help topics.
The basic stages of creating a Document Definition are described specifically for structured documents.
Note: Fixed forms received by fax can be distorted: their size and the relative positions of their fields may be altered. Due to this, we recommend using a FlexiLayout to increase recognition quality when processing such forms.
Semi-structured flexible forms and documents;
These are documents containing a set of information fields whose design, number and layout may vary significantly in different instances of the document. These documents are called flexible. For example, bills are semi-structured documents: because they are received from different companies, they often vary both in the number of items and in their formatting. All bills have a bill number and an amount due for payment, but these fields are located in different places on different bills.
To identify flexible forms and to extract data from them, ABBYY FlexiCapture uses a flexible layout (FlexiLayout). A flexible layout is created with the help of a special module called ABBYY FlexiLayout Studio. Details of this module are available in the User Guide and help topics.
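The difference from a fixed layout can be sketched as follows. This is only an illustration of the general idea of locating fields by their surrounding text rather than by position. It is not how ABBYY FlexiLayout Studio is implemented, and the names FLEXIBLE_RULES and extract_flexible, the regular expressions and the sample bills are all assumptions made for the example.

```python
import re

# Hypothetical rules: each field is found by a label that may appear anywhere.
FLEXIBLE_RULES = {
    "bill_number": re.compile(
        r"(?:Bill|Invoice)\s*(?:Number|No\.?|#)\s*:?\s*(\S+)", re.IGNORECASE
    ),
    "amount_due": re.compile(
        r"(?:Amount\s+due|Total\s+due)\s*:?\s*\$?\s*([\d,]+\.\d{2})",
        re.IGNORECASE,
    ),
}


def extract_flexible(text, rules):
    """Search the whole text for each label and return the value next to it."""
    found = {}
    for field_name, pattern in rules.items():
        match = pattern.search(text)
        found[field_name] = match.group(1) if match else None
    return found


# Two bills from different companies: same fields, different places and labels.
bill_a = "ACME Corp\nInvoice No: A-1001\nWidgets x3\nAmount due: $45.00\n"
bill_b = "Total due 1,250.50\nThank you!\nBill # 77-B\n"

print(extract_flexible(bill_a, FLEXIBLE_RULES))
print(extract_flexible(bill_b, FLEXIBLE_RULES))
```

Both bills yield the same two fields even though the labels, the order and the positions differ, which is the property a flexible layout is meant to provide.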
The processing of semi-structured documents differs from the processing of fixed forms only at the stage of creating and loading a layout. For details see Creating a Document Definition for semi-structured document processing.
Unstructured documents with free-style design.
ABBYY FlexiCapture can be used to process unstructured documents containing information presented in a free style, for example contracts, letters, orders, and graphs. The program can automatically identify unstructured documents as annexes to fixed or flexible forms, or it can identify them with the help of a flexible layout and then export them as searchable PDF files or as graphic files. You can extract index fields from unstructured documents either automatically, with the help of a flexible layout, or by manual input.
Natural language processing (NLP) can also be used to process unstructured documents: NLP models extract information from the document text.
A typical scenario for the processing of unstructured documents is when a hardcopy archive needs to be converted into electronic form and there is a requirement to extract two or three index fields in order to organize a quick attribute-based search.
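The attribute-based search in this scenario can be pictured with a small sketch. It assumes the index fields have already been extracted and are available as plain dictionaries. The file names, field names and the build_index and search helpers are hypothetical and do not correspond to any ABBYY FlexiCapture export format.

```python
from collections import defaultdict


def build_index(records):
    """Map each (field, value) pair to the set of files that carry it."""
    index = defaultdict(set)
    for file_name, fields in records.items():
        for field, value in fields.items():
            index[(field, value.lower())].add(file_name)
    return index


def search(index, field, value):
    """Return the files whose index field matches the value, ignoring case."""
    return sorted(index.get((field, value.lower()), set()))


# Hypothetical archive: two index fields extracted per scanned document.
archive = {
    "scan_0001.pdf": {"counterparty": "Acme Corp", "year": "2019"},
    "scan_0002.pdf": {"counterparty": "Globex", "year": "2019"},
    "scan_0003.pdf": {"counterparty": "Acme Corp", "year": "2021"},
}

index = build_index(archive)
print(search(index, "year", "2019"))
print(search(index, "counterparty", "acme corp"))
```

Two or three fields per document are enough for lookups of this kind, so the full text of each document does not need to be captured as structured data.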