Working with Profiles

ABBYY FineReader Engine supports numerous parameters which allow the user to fine-tune the Engine. The user can specify parameters for image preprocessing, analysis, recognition, synthesis, and export to achieve the optimal balance of processing speed and quality. For example, if the application exports recognition results to TXT, the page layout is not relevant and many layout-related properties can be disabled.

When new objects are created, either directly with the help of the creation methods of the Engine object or indirectly, their properties are usually set to reasonable defaults (for the default value of a property, see the description of that property). However, the default values are not optimal for every usage scenario, and you may need to change them in some cases. This can be done either via the API or with the help of a profile. A profile contains a list of new default values for object properties.

Predefined profiles

ABBYY FineReader Engine provides a set of predefined profiles designed for the main usage scenarios (the complete specification of all predefined profiles is given in Predefined Profiles Specification). The settings in each profile are the most suitable ones for the corresponding scenario. In addition, most of the profiles come in two forms: one with settings optimized for the best quality of the resulting document, and one with settings optimized for the highest processing speed. The predefined profiles are described in the table below:

Profile name Description
DocumentConversion_Accuracy

Suitable for converting documents into an editable format (e.g., RTF, DOCX). The settings have been optimized for accuracy:

  • Best quality. Enables font style detection and full synthesis of the logical structure of a document.
DocumentConversion_Speed

Suitable for converting documents into an editable format (e.g., RTF, DOCX). The settings have been optimized for processing speed:

  • Best quality. Enables font style detection and full synthesis of the logical structure of a document.
  • The processes of document analysis and recognition are speeded up.
DocumentArchiving_Accuracy

Suitable for creating an electronic archive (converting to PDF, PDF/A, PDF and PDF/A with MRC). The settings have been optimized for accuracy:

  • Enables detection of maximum text on an image, including text embedded into the image.
  • Full synthesis of the logical structure of a document is not performed.

Important! The profile is not intended for converting a document into RTF, DOCX, or text-only PDF. Use the document conversion profiles for this purpose.

DocumentArchiving_Speed

Suitable for creating an electronic archive (converting to PDF, PDF/A, PDF and PDF/A with MRC). The settings have been optimized for processing speed:

  • Enables detection of maximum text on an image, including text embedded into the image.
  • Skew correction is not performed.
  • Full synthesis of the logical structure of a document is not performed.
  • The processes of document analysis and recognition are speeded up.

Important! The profile is not intended for converting a document into RTF, DOCX, or text-only PDF. Use the document conversion profiles for this purpose.

BookArchiving_Accuracy

Suitable for creating an electronic library (converting to PDF, PDF/A, PDF and PDF/A with MRC). The settings have been optimized for accuracy:

  • Best quality. Enables font style detection and full synthesis of the logical structure of a document.
BookArchiving_Speed

Suitable for creating an electronic library (converting to PDF, PDF/A, PDF and PDF/A with MRC). The settings have been optimized for processing speed:

  • Best quality. Enables font style detection and full synthesis of the logical structure of a document.
  • The processes of document analysis and recognition are speeded up.
TextExtraction_Accuracy

Suitable for extracting text from a document. The settings have been optimized for accuracy:

  • Enables detection of all text on an image, including small text areas of low quality (pictures and tables are not detected).
  • Full synthesis of the logical structure of a document is not performed.

Important! The profile is not intended for converting a document into RTF, DOCX, or text-only PDF. Use the document conversion profiles for this purpose.

TextExtraction_Speed

Suitable for extracting text from a document. The settings have been optimized for processing speed:

  • Enables detection of all text on an image, including small text areas of low quality (pictures and tables are not detected).
  • Full synthesis of the logical structure of a document is not performed.
  • The processes of document analysis and recognition are speeded up.

Important! The profile is not intended for converting a document into RTF, DOCX, or text-only PDF. Use the document conversion profiles for this purpose.

FieldLevelRecognition

Suitable for recognizing short text fragments. Currently this profile has default settings.
BarcodeRecognition_Accuracy

Suitable for barcode extraction. Extracts only barcodes (texts, pictures, or tables are not detected). The settings have been optimized for accuracy.

For purposes of compatibility, you can also access this profile under the BarcodeRecognition name.

Important! This profile requires that the Barcode Autolocation module be available in the license.

BarcodeRecognition_Speed

Suitable for barcode extraction. Extracts only barcodes (texts, pictures, or tables are not detected). The settings have been optimized for processing speed.

Important! This profile requires that the Barcode Autolocation module be available in the license.

HighCompressedImageOnlyPdf

Suitable for creating high-compressed PDF files which contain entire documents saved as pictures. The following settings are used:

  • Document recognition and synthesis of the logical structure of a document are not performed.
  • Skew correction is not performed.
  • PDF export is optimized for the minimum size of the resulting file.
  • The entire document is saved as a picture (PEM_ImageOnly mode).
BusinessCardsProcessing

Suitable for recognizing business cards. The following settings are used:

  • Detects only business cards.
  • Enables detection of all text on an image, including small text areas of low quality (pictures and tables are not detected).
  • Resolution correction is performed.
  • Full synthesis of the logical structure of a document is not performed.
MachineReadableZone

Suitable for extracting data from a machine-readable zone (MRZ). The following settings are used:

  • Enables detection and extraction of all text on an image (pictures, vector graphics and tables are not detected).
  • Resolution and geometry correction are performed automatically.
EngineeringDrawingsProcessing

Suitable for recognizing technical drawings. It takes into account the large size and complexity of engineering diagrams, as well as the possibility of different text orientations within the image. The profile is intended for converting such images into searchable PDF format. The following settings are used:

  • Enables detection of all text on an image, including text blocks of vertical orientation.
  • Full synthesis of the logical structure of a document is not performed.

Important! The profile is not intended for converting a document into RTF, DOCX, or text-only PDF. Use the document conversion profiles for this purpose.

Version9Compatibility

Provided for compatibility; sets the processing parameters to the default values of ABBYY FineReader Engine 9.0.

Default

Sets all the processing parameters to the default values.

The settings provided with these profiles can be loaded using the LoadPredefinedProfile method of the Engine object. After the profile is loaded, newly created objects will have the new default values specified in the profile.
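
The names in the table follow a Scenario_Goal pattern for the profiles that come in two forms. As a minimal sketch, the Python helper below builds and validates a profile name before it is passed to LoadPredefinedProfile. The helper name predefined_profile_name and the engine object in the usage note are hypothetical; only the profile names and the LoadPredefinedProfile method come from this page. The BarcodeRecognition compatibility alias is deliberately not handled.

```python
# Profile names copied from the table above.
PAIRED_SCENARIOS = {
    "DocumentConversion",
    "DocumentArchiving",
    "BookArchiving",
    "TextExtraction",
    "BarcodeRecognition",
}
STANDALONE_PROFILES = {
    "FieldLevelRecognition",
    "HighCompressedImageOnlyPdf",
    "BusinessCardsProcessing",
    "MachineReadableZone",
    "EngineeringDrawingsProcessing",
    "Version9Compatibility",
    "Default",
}
GOALS = {"Accuracy", "Speed"}


def predefined_profile_name(scenario, goal=None):
    """Return a predefined profile name, or raise ValueError (hypothetical helper)."""
    if scenario in STANDALONE_PROFILES:
        if goal is not None:
            raise ValueError(f"{scenario} has no Accuracy/Speed variants")
        return scenario
    if scenario in PAIRED_SCENARIOS:
        if goal not in GOALS:
            raise ValueError(f"{scenario} needs a goal: Accuracy or Speed")
        return f"{scenario}_{goal}"
    raise ValueError(f"unknown scenario: {scenario}")
```

Assuming engine is an already initialized Engine object exposed to your language binding, a call might look like engine.LoadPredefinedProfile(predefined_profile_name("TextExtraction", "Speed")). Validating the name first keeps a typo from reaching the Engine.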

Notes:

  • The predefined profile files can be found in your distribution package in the Bin/PredefinedProfiles folder.
  • To determine, with the help of the FREngineDistribution.csv file, the set of resource files your application needs, consult the page corresponding to the scenario you have chosen.
  • For the HighCompressedImageOnlyPdf, EngineeringDrawingsProcessing, and Version9Compatibility profiles, select the following values in column 5 (RequiredByModule):

Core

Core.Resources

Opening

Opening, Processing

Processing

Processing.OCR

Processing.OCR, Processing.ICR

Processing.OCR.NaturalLanguages

Processing.OCR.NaturalLanguages, Processing.ICR.NaturalLanguages

Export

Export, Processing

Export.Pdf

Export.Pdf, Opening.Pdf

You also need to specify the interface languages, recognition languages, and any additional features which your application uses (for example, Opening.PDF if you need to open PDF files, or Processing.OCR.CJK if you need to recognize texts in CJK languages). See Working with the FREngineDistribution.csv File for further details.

User profiles

You can also create your own profile file. The syntax of a profile file is similar to that of *.ini files. Comments can be added by starting a line with a semicolon.

Section names are the names of the objects whose properties are to be re-specified, and keys are the properties with their new values. The special section called UserData can contain any user-defined keys. The values of Boolean properties are represented by the strings "true" or "false", while the values of enumeration properties are represented by the corresponding constants, for example:

[PrepareImageMode]
DiscardColorImage = true
[PDFExportParams]
TextExportMode = PEM_ImageOnText
; this is a comment
[RecognizerParams]
TextLanguage = English,Russian
    

The LoadProfile method of the Engine object allows you to load a user profile file. After this file is loaded, newly created objects will have the new default values specified in the file. Loading parameters from a profile is similar to specifying the corresponding properties in the program code, but it simplifies the logic and data in the application. If an empty string is passed to IEngine::LoadProfile, the standard default values will be used.

The correctness of the new values of the properties and their conformity to the license are checked when the corresponding object is created.
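
A user profile can also be generated by a program rather than written by hand. The following Python sketch renders the sample above with the standard configparser module. The function name build_profile is hypothetical, and the file encoding the Engine expects is not stated on this page, so the sketch stops at producing the text.

```python
import configparser
import io


def build_profile(sections):
    """Render {section: {property: value}} as profile-file text."""
    parser = configparser.ConfigParser(interpolation=None)
    # Keep property names case-sensitive; configparser lowercases keys by default.
    parser.optionxform = str
    for section, properties in sections.items():
        parser[section] = {}
        for name, value in properties.items():
            # Booleans are written as the strings "true" / "false".
            if isinstance(value, bool):
                value = "true" if value else "false"
            parser[section][name] = str(value)
    buffer = io.StringIO()
    parser.write(buffer, space_around_delimiters=True)
    return buffer.getvalue()


profile_text = build_profile({
    "PrepareImageMode": {"DiscardColorImage": True},
    "PDFExportParams": {"TextExportMode": "PEM_ImageOnText"},
    "RecognizerParams": {"TextLanguage": "English,Russian"},
})
```

The resulting text can be saved to a file whose path is then passed to the LoadProfile method. Because the Engine checks property values only when the corresponding object is created, a mistake in a generated value will surface at that point, not at load time.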

A profile file can be used to re-specify all the properties of the following objects:

1 To set the properties of the PictureExportParams or PaperSizeParams objects, specify the parameters directly in the section of the export parameter object (not in a PictureExportParams or PaperSizeParams section). This allows you to use different settings for different export formats. For example, to specify the gray picture format for RTF files:

[RTFExportParams]
GrayPictureFormats = GPF_Png
    

2 To set the properties of the DocumentContentInfoWritingParams object, specify the parameters directly in the section of its parent object. For the PDF format, this is the PDFExportFeatures object; for other formats, it is the corresponding export parameter object. Thus you can specify different content info settings for different export formats. For example, if you do not want to write the document author into output PDF files, insert the following lines into the profile:

[PDFExportFeatures]
WriteAuthor = false
    

3 To set the properties of the PageMargins object, specify the parameters directly in the section of its parent object. Note that the UseCustomPageMargins property must be set to TRUE before the page margin values are specified, as shown in the sample below:

[RTFExportParams]
UseCustomPageMargins = true
PageMargins.Left = 5000
PageMargins.Right = 5000
PageMargins.Top = 5000
PageMargins.Bottom = 5000
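
The ordering rule for page margins is easy to break when a profile is edited by hand. The sketch below is a small Python check that reports every section in which a PageMargins.* key appears before UseCustomPageMargins = true. The function name margins_order_errors is hypothetical, and treating the Boolean value case-insensitively is an assumption of this sketch, not documented Engine behavior.

```python
def margins_order_errors(profile_text):
    """Return sections where PageMargins.* precedes UseCustomPageMargins = true."""
    errors = []
    section = None
    custom_margins_on = False
    for raw_line in profile_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            custom_margins_on = False  # the flag is tracked per section
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if key == "UseCustomPageMargins":
            # Assumption: compare case-insensitively.
            custom_margins_on = value.lower() == "true"
        elif key.startswith("PageMargins.") and not custom_margins_on:
            if section not in errors:
                errors.append(section)
    return errors
```

An empty list means that every margin value in the profile text is preceded by the enabling flag in its own section.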

    

Using both predefined and user profiles

One predefined profile and one user profile can be loaded simultaneously. The user profile has priority over the predefined profile, i.e., if the user profile sets the same parameter as the predefined profile, the value from the user profile will be used.

If you load one more predefined profile, this new profile replaces the previous predefined profile. Similarly, a new user profile replaces the previous user profile. Note that loading of a profile cleans the current recognition session (i.e., the IEngine::CleanRecognizerSession method is called automatically).
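
The precedence and replacement rules above can be pictured as two slots, one per profile kind. The Python class below is a toy model of that behavior, not the Engine's implementation; the class name ProfileSlots, its methods, and the sample values used with it are illustrative assumptions.

```python
class ProfileSlots:
    """Toy model of the two profile slots; not the Engine's implementation."""

    def __init__(self):
        self.predefined = {}
        self.user = {}

    def load_predefined(self, settings):
        # A newly loaded predefined profile replaces the previous one.
        self.predefined = dict(settings)

    def load_user(self, settings):
        # A newly loaded user profile replaces the previous one.
        self.user = dict(settings)

    def effective(self):
        # User values win when both profiles set the same parameter.
        return {**self.predefined, **self.user}


slots = ProfileSlots()
# Keys are (object, property) pairs; the values are made up for illustration.
slots.load_predefined({("PDFExportParams", "TextExportMode"): "PEM_ImageOnly"})
slots.load_user({("PDFExportParams", "TextExportMode"): "PEM_ImageOnText"})
defaults = slots.effective()
```

In this model, defaults holds the user value for TextExportMode, and loading another predefined profile afterwards changes only the parameters that the user profile does not set.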

See also

Tuning Parameters of Preprocessing, Analysis, Recognition, and Synthesis

Tuning Export Parameters

03.07.2024 8:50:25
