Pre-recognition

Pre-recognition is the first stage of processing a semi-structured document. Unlike fixed documents, which are designed with computer processing in mind, semi-structured documents vary in structure, and their data fields may appear in different parts of the page. For this reason, pre-recognition is used to detect objects on the document that may signal the location of data fields.

Since pre-recognition may take considerable time, FlexiLayout Studio allows you to run it once, independently of FlexiLayout matching, so that you can then concentrate entirely on creating and testing your FlexiLayout.

However, you need to assess the quality of the pre-recognition results before you start creating your FlexiLayout. The quality of pre-recognition depends on the quality of the test images in the batch, which in turn depends on scanning parameters such as brightness, contrast, and resolution. If you are not satisfied with the pre-recognition results, you may need to change the scanning options and re-scan your test documents. Note also that FlexiLayout Studio allows you to add images scanned at different resolutions, so that you can experiment with pre-recognition and FlexiLayout matching and select the optimal scanning options.

Pre-recognition can be run in fast or full mode (see Pre-recognition parameters for details). While a FlexiLayout is being developed, pre-recognition need not be perfect: practically any data field can still be located even if several recognition errors have been made. Indeed, pre-recognition speed is sometimes more important than quality. Recognition quality can be addressed at a later stage in a data capture application, where you can specify a data type for each field and thereby greatly improve the results.

During pre-recognition, the program analyzes the locations of dots of various colors, detects basic objects, and merges text fragments into words and lines.
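The idea of merging text fragments into words and lines can be illustrated with a minimal Python sketch. It is not FlexiLayout Studio's actual algorithm, which is not described here. The `Fragment` class, the function names, and the `min_overlap` and `max_gap` thresholds are all hypothetical and chosen only for illustration: fragments whose vertical extents overlap are grouped into a line, and fragments within a line that are separated by a small horizontal gap are joined into a word.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """Hypothetical text fragment with a bounding box in pixels."""
    text: str
    left: int
    top: int
    right: int
    bottom: int


def group_into_lines(fragments, min_overlap=0.5):
    """Group fragments whose vertical extents overlap into lines.

    min_overlap is an assumed threshold: the shared height divided by
    the height of the shorter fragment.
    """
    lines = []
    for frag in sorted(fragments, key=lambda f: (f.top, f.left)):
        for line in lines:
            ref = line[0]  # compare against the first fragment of the line
            overlap = min(ref.bottom, frag.bottom) - max(ref.top, frag.top)
            shorter = min(ref.bottom - ref.top, frag.bottom - frag.top)
            if shorter > 0 and overlap / shorter >= min_overlap:
                line.append(frag)
                break
        else:
            lines.append([frag])  # no matching line found: start a new one
    for line in lines:
        line.sort(key=lambda f: f.left)  # left-to-right reading order
    return lines


def merge_into_words(line, max_gap=5):
    """Join fragments in one line whose horizontal gap is at most max_gap."""
    words = []
    current = line[0].text
    prev_right = line[0].right
    for frag in line[1:]:
        if frag.left - prev_right <= max_gap:
            current += frag.text
        else:
            words.append(current)
            current = frag.text
        prev_right = frag.right
    words.append(current)
    return words


def recognize_layout(fragments):
    """Return a list of lines, each a list of merged words."""
    return [merge_into_words(line) for line in group_into_lines(fragments)]
```

For example, fragments "In" and "voice" lying 2 pixels apart on the same line would be merged into the single word "Invoice", while a fragment 10 pixels further right would start a new word.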

The program detects the following types of basic objects:

Text
Picture
Punctuation mark
Inverted text
Separator
Barcode
Checkmark
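If you post-process detection results in your own scripts, the object types above can be modelled as an enumeration. The sketch below is purely illustrative: `BasicObjectType` and `select_objects` are hypothetical names, not part of any FlexiLayout Studio API, and the detected objects are assumed to be simple (type, region) pairs.

```python
from enum import Enum


class BasicObjectType(Enum):
    """Basic object types listed above (hypothetical representation)."""
    TEXT = "Text"
    PICTURE = "Picture"
    PUNCTUATION_MARK = "Punctuation mark"
    INVERTED_TEXT = "Inverted text"
    SEPARATOR = "Separator"
    BARCODE = "Barcode"
    CHECKMARK = "Checkmark"


def select_objects(objects, wanted):
    """Return the (type, region) pairs whose type is in the wanted set."""
    return [obj for obj in objects if obj[0] in wanted]


# Assumed sample data: regions are (left, top, right, bottom) tuples.
detected = [
    (BasicObjectType.TEXT, (10, 10, 50, 20)),
    (BasicObjectType.SEPARATOR, (0, 30, 200, 31)),
    (BasicObjectType.BARCODE, (120, 40, 180, 60)),
]
text_objects = select_objects(detected, {BasicObjectType.TEXT})
```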

Once basic objects have been detected, the program starts recognizing the text objects. Recognized text can be viewed as objects of the following two types:

Recognized Words
Recognized Lines

Pre-recognition parameters

Running pre-recognition and viewing the results

Analyzing images

12.04.2024 18:16:02
