Working with Profiles
ABBYY FineReader Engine supports numerous parameters that allow the user to fine-tune the Engine. The user can specify parameters for image preprocessing, analysis, recognition, synthesis, and export to achieve the optimal processing speed and quality. For example, if the application exports recognition results to TXT, the page layout is not relevant and many layout-related properties can be disabled.
When new objects are created, either directly with the creation methods of the Engine object or indirectly, their properties are usually set to reasonable defaults (for the default value of a property, see the description of that property). However, the default values are not optimal for every usage scenario, so you may need to change some of them. This can be done either via the API or with the help of a profile. A profile contains a list of new default values for object properties.
Predefined profiles
ABBYY FineReader Engine provides a set of predefined profiles designed for the main usage scenarios (the complete specification of all predefined profiles is given in Predefined Profiles Specification). The settings in each profile are the most suitable for the corresponding scenario. In addition, most of the profiles come in two forms: one with settings optimized for the best quality of the resulting document and one with settings optimized for the highest processing speed. The predefined profiles are described in the table below:
Profile name | Description |
---|---|
DocumentConversion_Accuracy | Suitable for converting documents into an editable format (e.g., RTF, DOCX). The settings have been optimized for accuracy. |
DocumentConversion_Speed | Suitable for converting documents into an editable format (e.g., RTF, DOCX). The settings have been optimized for processing speed. |
DocumentArchiving_Accuracy | Suitable for creating an electronic archive (converting to PDF, PDF/A, PDF and PDF/A with MRC). The settings have been optimized for accuracy. Important! The profile is not intended for converting a document into RTF, DOCX, PDF text only. Use the document conversion profiles for that purpose. |
DocumentArchiving_Speed | Suitable for creating an electronic archive (converting to PDF, PDF/A, PDF and PDF/A with MRC). The settings have been optimized for processing speed. Important! The profile is not intended for converting a document into RTF, DOCX, PDF text only. Use the document conversion profiles for that purpose. |
BookArchiving_Accuracy | Suitable for creating an electronic library (converting to PDF, PDF/A, PDF and PDF/A with MRC). The settings have been optimized for accuracy. |
BookArchiving_Speed | Suitable for creating an electronic library (converting to PDF, PDF/A, PDF and PDF/A with MRC). The settings have been optimized for processing speed. |
TextExtraction_Accuracy | Suitable for extracting text from a document. The settings have been optimized for accuracy. Important! The profile is not intended for converting a document into RTF, DOCX, PDF text only. Use the document conversion profiles for that purpose. |
TextExtraction_Speed | Suitable for extracting text from a document. The settings have been optimized for processing speed. Important! The profile is not intended for converting a document into RTF, DOCX, PDF text only. Use the document conversion profiles for that purpose. |
FieldLevelRecognition | Suitable for recognizing short text fragments. Currently this profile has default settings. |
BarcodeRecognition_Accuracy | Suitable for barcode extraction. Extracts only barcodes (texts, pictures, and tables are not detected). The settings have been optimized for accuracy. For compatibility, this profile is also available under the BarcodeRecognition name. Important! This profile requires the Barcode Autolocation module to be available in the license. |
BarcodeRecognition_Speed | Suitable for barcode extraction. Extracts only barcodes (texts, pictures, and tables are not detected). The settings have been optimized for processing speed. Important! This profile requires the Barcode Autolocation module to be available in the license. |
HighCompressedImageOnlyPdf | Suitable for creating high-compressed PDF files which contain entire documents saved as pictures. |
BusinessCardsProcessing | Suitable for recognizing business cards. |
MachineReadableZone | Suitable for extracting data from a machine-readable zone (MRZ). |
EngineeringDrawingsProcessing | Suitable for recognizing technical drawings. It takes into account the large size and complexity of engineering diagrams, as well as the possibility of different text orientations within the image. The profile is intended for converting such images into searchable PDF format. Important! The profile is not intended for converting a document into RTF, DOCX, PDF text only. Use the document conversion profiles for that purpose. |
Version9Compatibility | Provided for compatibility; sets the processing parameters to the default values of ABBYY FineReader Engine 9.0. |
Default | Sets all the processing parameters to the default values. |
The settings provided with these profiles can be loaded using the LoadPredefinedProfile method of the Engine object. After the profile is loaded, newly created objects will have the new default values specified in the profile.
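Because most predefined profiles come in an _Accuracy and a _Speed form, an application may want to build and check the profile name before passing it to LoadPredefinedProfile. The following Python sketch shows one way to do this. The helper predefined_profile_name and the two name sets are hypothetical and are not part of the Engine API; the names themselves are taken from the table above. The Engine call is omitted because it depends on the language binding in use.

```python
# Scenario names from the table of predefined profiles.
# PAIRED scenarios exist as <name>_Accuracy and <name>_Speed.
PAIRED = {
    "DocumentConversion",
    "DocumentArchiving",
    "BookArchiving",
    "TextExtraction",
    "BarcodeRecognition",
}
# SINGLE profiles have no _Accuracy/_Speed suffix.
SINGLE = {
    "FieldLevelRecognition",
    "HighCompressedImageOnlyPdf",
    "BusinessCardsProcessing",
    "MachineReadableZone",
    "EngineeringDrawingsProcessing",
    "Version9Compatibility",
    "Default",
}


def predefined_profile_name(scenario, optimize_for=None):
    """Hypothetical helper: return a predefined profile name or raise ValueError."""
    if scenario in PAIRED:
        if optimize_for not in ("Accuracy", "Speed"):
            raise ValueError(f"{scenario} requires 'Accuracy' or 'Speed'")
        return f"{scenario}_{optimize_for}"
    if scenario in SINGLE:
        if optimize_for is not None:
            raise ValueError(f"{scenario} has no Accuracy/Speed variants")
        return scenario
    raise ValueError(f"unknown predefined profile scenario: {scenario}")


name = predefined_profile_name("DocumentConversion", "Accuracy")
# 'name' is the string an application would pass to LoadPredefinedProfile.
```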
Notes:
- To determine which resource files your application needs by using the FREngineDistribution.csv file, consult the page that corresponds to the scenario you have chosen.
- For the HighCompressedImageOnlyPdf, EngineeringDrawingsProcessing, and Version9Compatibility profiles, select the following values in column 5 (RequiredByModule):
Core
Core.Resources
Opening
Opening, Processing
Processing
Processing.OCR
Processing.OCR, Processing.ICR
Processing.OCR.NaturalLanguages
Processing.OCR.NaturalLanguages, Processing.ICR.NaturalLanguages
Export
Export, Processing
Export.Pdf
Export.Pdf, Opening.Pdf
You also need to specify the interface languages, the recognition languages, and any additional features your application uses (such as Opening.PDF if you need to open PDF files, or Processing.OCR.CJK if you need to recognize texts in CJK languages). See Working with the FREngineDistribution.csv File for further details.
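The selection by column 5 described in the notes above can be automated. The Python sketch below filters rows by the RequiredByModule values listed for these profiles. It rests on assumptions that should be checked against the actual file: the delimiter (a parameter here, defaulting to a comma), quoting of values that contain commas, and the meaning of the other columns, which the sketch does not interpret. The sample rows are invented purely for illustration.

```python
import csv
import io

# RequiredByModule values listed in the notes above.
REQUIRED_MODULES = {
    "Core",
    "Core.Resources",
    "Opening",
    "Opening, Processing",
    "Processing",
    "Processing.OCR",
    "Processing.OCR, Processing.ICR",
    "Processing.OCR.NaturalLanguages",
    "Processing.OCR.NaturalLanguages, Processing.ICR.NaturalLanguages",
    "Export",
    "Export, Processing",
    "Export.Pdf",
    "Export.Pdf, Opening.Pdf",
}


def select_rows(csv_text, wanted=REQUIRED_MODULES, column=5, delimiter=","):
    """Return the rows whose 1-based `column` holds one of the wanted values."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    return [
        row
        for row in reader
        if len(row) >= column and row[column - 1].strip() in wanted
    ]


# Invented sample rows; the real file layout may differ.
SAMPLE = (
    'a1,b1,c1,d1,Core\n'
    'a2,b2,c2,d2,"Opening, Processing"\n'
    'a3,b3,c3,d3,Processing.OCR.CJK\n'
)
selected = select_rows(SAMPLE)
```

Rows for interface languages, recognition languages, and additional features still have to be added separately, as described above.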
User profiles
You can also create your own profile file. The syntax of a profile file is similar to that of *.ini files. Comments can be added by starting a line with a semicolon.
The section names are the names of the objects whose properties are to be re-specified, and the keys are the properties with their new values. The special section called UserData can contain any user-defined keys. The values of Boolean properties are represented by the strings "true" or "false", while the values of enumeration properties are represented by the corresponding constants, for example:
[PrepareImageMode]
DiscardColorImage = true
[PDFExportParams]
TextExportMode = PEM_ImageOnText
; this is a comment
[RecognizerParams]
TextLanguage = English,Russian
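Since the syntax is similar to that of *.ini files, a profile can be sanity-checked before it is handed to the Engine. The sketch below parses the sample profile with Python's standard configparser module. This is only an approximation: the Engine's own parser may accept or reject input that configparser does not, and the function parse_profile is a hypothetical helper, not part of the Engine API.

```python
import configparser

PROFILE_TEXT = """\
[PrepareImageMode]
DiscardColorImage = true
[PDFExportParams]
TextExportMode = PEM_ImageOnText
; this is a comment
[RecognizerParams]
TextLanguage = English,Russian
"""


def parse_profile(text):
    """Parse ini-style profile text into {section: {property: value}}."""
    parser = configparser.ConfigParser(
        interpolation=None,        # treat '%' in values literally
        comment_prefixes=(";",),   # comments start a line with a semicolon
        delimiters=("=",),
    )
    parser.optionxform = str       # keep property names as written
    parser.read_string(text)
    return {name: dict(parser[name]) for name in parser.sections()}


profile = parse_profile(PROFILE_TEXT)

# Boolean properties are written as the strings "true" or "false".
discard_color = profile["PrepareImageMode"]["DiscardColorImage"] == "true"
```

A duplicated section or property name raises an error in configparser, which makes it a convenient early check even though it says nothing about whether the Engine accepts the property values.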
The LoadProfile method of the Engine object allows you to load a user profile file. After this file is loaded, newly created objects will have the new default values specified in the file. Loading parameters from a profile is similar to specifying the corresponding properties in the program code, but it simplifies the logic and data in the application. If an empty string is passed to IEngine::LoadProfile, the standard default values will be used.
The correctness of the new values of the properties and their conformity to the license are checked when the corresponding object is created.
A profile file can be used to re-specify all the properties of the following objects:
1 To set the properties of the PictureExportParams or PaperSizeParams objects, specify the parameters directly in the section of the export parameter object (not in a PictureExportParams or PaperSizeParams section). This allows you to use different settings for different export formats. For example, to specify the gray picture format for RTF files:
[RTFExportParams] GrayPictureFormats = GPF_Png
2 To set the properties of the DocumentContentInfoWritingParams object, specify the parameters directly in the section of its parent object. For the PDF format, this is the PDFExportFeatures object; for other formats, it is the corresponding export parameter object. This way you can specify different content info settings for different export formats. For example, if you do not want to write the document author into output PDF files, insert the following lines into the profile:
[PDFExportFeatures]
WriteAuthor = false
3 To set the properties of the PageMargins object, specify the parameters directly in the section of its parent object. Note that the UseCustomPageMargins property set to TRUE must be specified before the values of page margins, as described in the sample below:
[RTFExportParams]
UseCustomPageMargins = true
PageMargins.Left = 5000
PageMargins.Right = 5000
PageMargins.Top = 5000
PageMargins.Bottom = 5000
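When profile files are generated by a program, the ordering requirement for page margins is easy to get wrong. The following Python sketch always writes UseCustomPageMargins before the margin values. The function page_margins_section is a hypothetical helper, and the margin values are passed through unchanged, in whatever units the export parameter object expects.

```python
def page_margins_section(section, left, right, top, bottom):
    """Build profile text for custom page margins in the given export section.

    UseCustomPageMargins is written first, because it must precede the
    PageMargins.* values in the profile.
    """
    lines = [
        f"[{section}]",
        "UseCustomPageMargins = true",
        f"PageMargins.Left = {left}",
        f"PageMargins.Right = {right}",
        f"PageMargins.Top = {top}",
        f"PageMargins.Bottom = {bottom}",
    ]
    return "\n".join(lines) + "\n"


rtf_margins = page_margins_section("RTFExportParams", 5000, 5000, 5000, 5000)
```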
Using both predefined and user profiles
One predefined profile and one user profile can be loaded simultaneously. The user profile has priority over the predefined profile: if the user profile sets the same parameter as the predefined profile, the value from the user profile is used.
If you load one more predefined profile, this new profile replaces the previous predefined profile. Similarly, a new user profile replaces the previous user profile. Note that loading of a profile cleans the current recognition session (i.e., the IEngine::CleanRecognizerSession method is called automatically).
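The loading rules above can be pictured with a small model. The Python sketch below is a conceptual illustration only, not the Engine's implementation: the class ProfileState is hypothetical, and the settings attributed to the predefined profile are invented for the example (the real contents are given in Predefined Profiles Specification).

```python
class ProfileState:
    """Toy model: one predefined slot, one user slot, user values win."""

    def __init__(self):
        self.predefined = {}
        self.user = {}

    def load_predefined(self, settings):
        # A new predefined profile replaces the previous one.
        self.predefined = {s: dict(p) for s, p in settings.items()}

    def load_user(self, settings):
        # A new user profile replaces the previous one.
        self.user = {s: dict(p) for s, p in settings.items()}

    def effective(self):
        # Apply predefined values first, then let user values override them.
        merged = {}
        for source in (self.predefined, self.user):
            for section, props in source.items():
                merged.setdefault(section, {}).update(props)
        return merged


state = ProfileState()
# Invented values standing in for a predefined profile.
state.load_predefined({
    "RecognizerParams": {"TextLanguage": "English"},
    "PDFExportParams": {"TextExportMode": "PEM_ImageOnText"},
})
state.load_user({"RecognizerParams": {"TextLanguage": "English,Russian"}})
combined = state.effective()

# Loading another predefined profile discards the first one entirely,
# while the user profile stays in effect.
state.load_predefined({"PrepareImageMode": {"DiscardColorImage": "true"}})
after_reload = state.effective()
```

In this model, combined keeps the user's TextLanguage value together with the predefined TextExportMode, and after_reload no longer contains the PDFExportParams section.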
See also
Tuning Parameters of Preprocessing, Analysis, Recognition, and Synthesis
17.09.2024 15:14:40