RecognizerParams Object (IRecognizerParams Interface)

This object allows you to tune recognition parameters. Each text block and table cell in a layout has its own child object of the RecognizerParams type. In addition, this object is passed as a subobject of the PageProcessingParams object to the ABBYY FineReader Engine functions that perform layout analysis and recognition. Recognition functions use the recognition parameters defined by the RecognizerParams child objects of text blocks and table cells.

Whenever a text block or table cell is created during layout analysis, the properties of its RecognizerParams child object are initialized with the property values of the RecognizerParams object that was passed to the analysis function. The properties of the subobject of a block created with the AddNew method of the LayoutBlocks object are set to reasonable defaults. For the default value of a particular property, see its description.

The RecognizerParams object is a persistent object. This means that the object's current state can be written to persistent storage on disk. Later, the object can be re-created by reading its state from persistent storage. The following methods provide persistence of the object: SaveToFile and LoadFromFile.

Properties

Name Type Description
Application Engine, read-only Returns the Engine object.
Main settings
TextLanguage TextLanguage

Specifies the language to be used for recognition. This property can be easily set via the SetPredefinedTextLanguage method.

Note: The property returns a constant object. To change the recognition language, you must first obtain an intermediate TextLanguage object using an appropriate creation method, change the necessary parameters, and then assign the obtained object to the property.

By default, this parameter is initialized with the English language.

LanguageDetectionMode ThreeStatePropertyValueEnum

Manages automatic language detection.

When language autodetection is on, the recognition language is detected for each word in the text. It is selected from the list of languages specified in the TextLanguage property. Autodetection is intended for recognizing documents whose language you do not know.

Important! Language autodetection deals only with the predefined languages (see the full list in Predefined Languages in ABBYY FineReader Engine).

If you know for certain that all the languages you specified are present in the document, autodetection is unnecessary. Turn it off by setting this property to TSPV_No.

You can view the list of languages detected in the recognized document or recognized page using the DetectedLanguages property of the FRDocument or FRPage object.

By default, this property value is TSPV_Auto.

TextTypes int

The value of this property is a bitwise OR combination of the TextTypeEnum enumeration constants, which denote the possible text types used for recognition. For example, if it is set to TT_Normal | TT_Index, ABBYY FineReader Engine will presume that the text contains only common typographic text and digits written in ZIP-code style, ignoring all other variants. See also Using Text Type Autodetection.

Notes:

  • If this property is equal to any combination of TT_Matrix, TT_Typewriter, TT_OCR_A, and TT_OCR_B, italic fonts and superscript/subscript will not be recognized, regardless of the values of the ProhibitItalic, ProhibitSubscript and ProhibitSuperscript properties.
  • If this property is TT_Handprinted, the CorrectOrientation property of the PagePreprocessingParams object cannot be set to TRUE.

By default, this property is set to TT_Normal.
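The flag arithmetic described above can be sketched in Python. This is a stand-alone illustration, not Engine code: the TextType class only mimics TextTypeEnum, its numeric values are assumptions chosen for the example, and styles_ignored is a hypothetical helper that encodes the first note (a value made up solely of TT_Matrix, TT_Typewriter, TT_OCR_A, and TT_OCR_B disables italic and superscript/subscript recognition).

```python
from enum import IntFlag


class TextType(IntFlag):
    """Stand-in for TextTypeEnum. The numeric values are illustrative
    assumptions, not the constants shipped with the Engine."""
    TT_Normal = 0x01
    TT_Typewriter = 0x02
    TT_Matrix = 0x04
    TT_Index = 0x08
    TT_OCR_A = 0x10
    TT_OCR_B = 0x20
    TT_Handprinted = 0x40


# Text types for which italics and superscript/subscript are not recognized.
_NO_STYLE_MASK = int(TextType.TT_Matrix | TextType.TT_Typewriter
                     | TextType.TT_OCR_A | TextType.TT_OCR_B)


def styles_ignored(text_types: TextType) -> bool:
    """Hypothetical helper: True when text_types is a non-empty combination
    of only the four types listed in the first note, so the Prohibit*
    style properties have no effect."""
    value = int(text_types)
    return value != 0 and (value & ~_NO_STYLE_MASK) == 0


# The example from the property description: common text plus ZIP-code digits.
combined = TextType.TT_Normal | TextType.TT_Index
```

Here styles_ignored(TextType.TT_Matrix | TextType.TT_OCR_A) is true, while any combination that also includes TT_Normal is not covered by the note.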

DetectTextTypesIndependently VARIANT_BOOL

Indicates that text type should be determined for each text block separately. This setting is useful when you have comparatively small text blocks with different text types, although it may slightly slow down processing.

The default value of this property is FALSE.

Recognition speed
BalancedMode VARIANT_BOOL
FastMode VARIANT_BOOL

These properties are deprecated and will be removed in future versions. Use the Mode property instead (RM_Normal corresponds to BalancedMode = TRUE and RM_Fast corresponds to FastMode = TRUE).

Mode RecognitionModeEnum

Specifies the recognition mode to be applied to the target documents. Each mode offers its own balance of recognition speed and accuracy, so that you can get satisfactory results on images and documents with different characteristics.

By default, this property is set to RM_Normal.

Important! Built-in patterns are always used for the accurate mode. To disable using the built-in patterns, switch to the normal mode (RM_Normal).
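When migrating code from the deprecated BalancedMode and FastMode properties, the correspondence stated for them can be written as a small lookup. The following Python sketch is only an illustration: mode_from_legacy_flags is a hypothetical helper, the modes are represented by their constant names as strings, and the handling of the two cases this page does not describe (both flags TRUE, or both FALSE) is an assumption.

```python
from typing import Optional


def mode_from_legacy_flags(balanced_mode: bool, fast_mode: bool) -> Optional[str]:
    """Map the deprecated flags to the name of a RecognitionModeEnum constant.

    Only BalancedMode = TRUE -> RM_Normal and FastMode = TRUE -> RM_Fast are
    documented. Everything else here is an assumption of this sketch.
    """
    if balanced_mode and fast_mode:
        # Assumption: treat contradictory flags as a caller error.
        raise ValueError("BalancedMode and FastMode are both TRUE")
    if balanced_mode:
        return "RM_Normal"
    if fast_mode:
        return "RM_Fast"
    # Not covered by the documented mapping; the caller must choose a mode.
    return None
```

Returning None for the undocumented case keeps the choice of mode explicit instead of silently picking one.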

Fine tuning
LowResolutionMode VARIANT_BOOL

Specifies whether text on a low-resolution image is recognized. This property is useful when recognizing faxes, small print, and images with low resolution or poor print quality.

By default, the value of this property is FALSE.

OneLinePerBlock VARIANT_BOOL

When set to TRUE, this property tells ABBYY FineReader Engine to presume that the text in the block to which the current RecognizerParams object belongs contains no more than one line.

By default, this property is FALSE.

OneWordPerLine VARIANT_BOOL

When set to TRUE, this property tells ABBYY FineReader Engine to presume that no text line contains more than one word, so each line of text is recognized as a single word.

By default, this property is FALSE.

ProhibitItalic VARIANT_BOOL

When set to TRUE, this property tells ABBYY FineReader Engine not to recognize letters printed in an italic font. This is useful when the text is not expected to contain italic letters, in which case it may speed up recognition. If the image does contain italic letters while this property is TRUE, those letters will be recognized incorrectly.

By default, this property is FALSE.

ProhibitSubscript VARIANT_BOOL

When set to TRUE, this property tells ABBYY FineReader Engine not to recognize subscript letters. This is useful when the text is not expected to contain subscripts, in which case it may speed up recognition. If the image does contain subscript letters while this property is TRUE, those letters will be recognized incorrectly.

By default, this property is FALSE.

ProhibitSuperscript VARIANT_BOOL

When set to TRUE, this property tells ABBYY FineReader Engine not to recognize superscript letters. This is useful when the text is not expected to contain superscripts, in which case it may speed up recognition. If the image does contain superscript letters while this property is TRUE, those letters will be recognized incorrectly.

By default, this property is FALSE.

ProhibitHyphenation VARIANT_BOOL

When set to TRUE, this property prohibits recognition of words hyphenated from one line to the next. This is useful when the text is not expected to contain hyphenation, in which case it may speed up recognition. If the recognized block does contain hyphenated words while this property is TRUE, those words will be recognized incorrectly.

By default, this property is FALSE.

ProhibitSmallCaps VARIANT_BOOL

This property set to TRUE tells ABBYY FineReader Engine not to recognize small capitals.

By default, this property is FALSE.

ProhibitInterblockHyphenation VARIANT_BOOL

This property set to TRUE tells ABBYY FineReader Engine to presume that text from one block cannot be carried over to the next block.

By default, this property is FALSE.

CaseRecognitionMode CaseRecognitionModeEnum

This property specifies the mode of letter case recognition.

By default, the value of this property is CRM_AutoCase, which corresponds to automatic case recognition.

WritingStyle WritingStyleEnum

Provides additional information about the writing style of handprinted letters.

By default, the value of this property is WS_Auto, which means that the writing style is automatically detected by FineReader Engine.

FieldMarkingType FieldMarkingTypeEnum

This property specifies the type of marking around letters (for example, underline, frame, box, etc.).

Note: For marking types where each letter is in a separate cell, use the CellsCount property to set the number of character cells for the recognized block.

By default, the value of this property is FMT_SimpleText, which means no marking.

CellsCount int

Specifies the number of character cells in the block.

This property is meaningful only for the field marking types (see the FieldMarkingType property) in which every letter is written in a separate cell.

The default value of this property is 1, but you should set the appropriate value for the text to be recognized correctly.

User patterns
UseBuiltInPatterns VARIANT_BOOL

When set to TRUE, this property means that ABBYY FineReader Engine will use its own built-in patterns for recognition. Patterns are files that establish the relationship between a character image and the character itself. You may want to set this property to FALSE when you want character recognition to use only user patterns rather than the standard ABBYY FineReader Engine patterns. This may be useful for recognizing text typed in decorative or non-standard fonts. In this case, it is better to use your own user-defined patterns trained for these fonts instead of the ABBYY FineReader Engine built-in patterns.

The path to the user-defined pattern file is stored in the UserPatternsFile property. If the UserPatternsFile property is empty, the UseBuiltInPatterns property is ignored.

By default, this property is TRUE.

Important! You may set this property to FALSE only when using the normal or fast recognition mode. You cannot prohibit the use of built-in patterns in the accurate mode (see the description of the Mode property).

UserPatternsFile BSTR

Contains the full path to the user pattern file used for recognition. If the value of this property is not empty, information from the user pattern file will be used during recognition.

If the UseBuiltInPatterns property is FALSE, which means that the standard ABBYY FineReader Engine patterns are not used during recognition, this property should contain the path to a user-defined pattern file, because only the information stored in that file will be used.

By default, this property stores an empty string.
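The interaction between Mode, UseBuiltInPatterns, and UserPatternsFile can be summarized as a small decision function. The Python sketch below only restates the rules given in the descriptions of these properties and does not call the Engine. The names Mode, PatternSources, and effective_pattern_sources are hypothetical, the file name in the usage note is made up, and reading "UseBuiltInPatterns is ignored" as "built-in patterns are used" is an interpretation of this page.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    """Hypothetical stand-in for the three recognition modes named on this page."""
    NORMAL = "normal"
    FAST = "fast"
    ACCURATE = "accurate"


@dataclass(frozen=True)
class PatternSources:
    built_in: bool  # built-in Engine patterns take part in recognition
    user: bool      # the user pattern file takes part in recognition


def effective_pattern_sources(mode: Mode,
                              use_built_in_patterns: bool = True,
                              user_patterns_file: str = "") -> PatternSources:
    """Apply the documented rules for which pattern sources are in effect."""
    # A non-empty UserPatternsFile means the user patterns are used.
    user = bool(user_patterns_file)
    built_in = (
        mode is Mode.ACCURATE       # accurate mode always uses built-in patterns
        or not user                 # empty UserPatternsFile: the flag is ignored
        or use_built_in_patterns    # otherwise the flag decides
    )
    return PatternSources(built_in=built_in, user=user)
```

For example, effective_pattern_sources(Mode.NORMAL, False, "fonts.ptn") yields user patterns only, whereas the same settings with Mode.ACCURATE still include the built-in patterns.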

Additional recognition information
HighlightSuspiciousCharacters VARIANT_BOOL

Specifies whether uncertainly recognized characters should have the IsSuspicious property set to TRUE.

The name of the property reflects the fact that ABBYY FineReader highlights suspicious characters in text with background color, which makes manual verification easier for the operator.

By default, this property is TRUE.

ExactConfidenceCalculation VARIANT_BOOL

If this property is TRUE, character and word confidence will be calculated more accurately, but recognition may be slower.

Note: The value of character confidence is stored in the CharConfidence property of the CharacterRecognitionVariant and PlainText objects. The value of word confidence is stored in the WordConfidence property of the WordRecognitionVariant object.

This property is automatically set to TRUE if the SaveCharacterRecognitionVariants or SaveWordRecognitionVariants property is TRUE.

By default, this property is FALSE.

SaveCharacterRegions VARIANT_BOOL

Specifies whether the exact character regions (ICharParams::CharacterRegion) are saved.

The default value is FALSE.

SaveCharacterRecognitionVariants VARIANT_BOOL

Specifies whether character recognition variants are saved.

Note: The ICharParams::CharacterRecognitionVariants property returns a collection of recognition variants for a character. See also Using Voting API.

The default value is FALSE.

SaveWordRecognitionVariants VARIANT_BOOL

Specifies whether word recognition variants are saved.

Note: The IParagraph::GetWordRecognitionVariants method and ICharParams::WordRecognitionVariants property return a collection of recognition variants for a word. See also Using Voting API.

The default value is FALSE.

Methods

Name Description
CopyFrom Initializes properties of the current object with values of similar properties of another object.
LoadFromFile Restores the object contents from a file on disk.
SaveToFile Saves the object contents into a file on disk.
SetPredefinedTextLanguage Sets the language of recognition to be one of the predefined ABBYY FineReader Engine languages.

Related objects

Object Diagram

Output parameter

This object is the output parameter of the CreateRecognizerParams method of the Engine object.

Input parameter

This object is passed as an input parameter to the following methods:

Samples

The object is used in the following code samples: CustomLanguage, CommandLineInterface.

See also

Tuning Parameters of Preprocessing, Analysis, Recognition, and Synthesis

Recognizing Handprinted Texts

PageProcessingParams

TextBlock

Working with Properties

03.07.2024 8:50:25
