RecognizerParams Object (IRecognizerParams Interface)
This object allows you to tune recognition parameters. Each text block and table cell in a layout has its own child object of the RecognizerParams type. In addition, this object is passed as a subobject of the PageProcessingParams object to the ABBYY FineReader Engine layout analysis and recognition functions. Recognition functions use the parameters defined by the RecognizerParams child objects of text blocks and table cells.
Whenever a text block or table cell is created during layout analysis, the properties of its RecognizerParams child object are initialized with the values of the properties of the RecognizerParams object that was passed to the analysis function. The properties of the subobject of a block created with the AddNew method of the LayoutBlocks object are set to reasonable defaults. For the default value of a particular property, see its description.
The RecognizerParams object is a persistent object. This means that the object's current state can be written to persistent storage on disk. Later, the object can be re-created by reading the object's state from persistent storage. The following methods provide persistence of the object: SaveToFile and LoadFromFile.
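The persistence round trip described above can be sketched as follows. This is a minimal sketch under stated assumptions, not documented Engine usage: it assumes the COM objects are reachable from Python (for example through a COM bridge), that `engine` is an already loaded Engine object, and that SaveToFile and LoadFromFile each take a single file path argument. The helper name `round_trip_params` is hypothetical.

```python
def round_trip_params(engine, params, path):
    """Save `params` to `path`, then restore the state into a new object.

    Assumption (not confirmed by the text above): SaveToFile and
    LoadFromFile each accept one file path argument.
    """
    # Write the current state of the object to persistent storage.
    params.SaveToFile(path)

    # CreateRecognizerParams is the Engine method that returns a new
    # RecognizerParams object (see "Output parameter" on this page).
    restored = engine.CreateRecognizerParams()

    # Re-create the saved state in the new object.
    restored.LoadFromFile(path)
    return restored
```

The function returns a separate object, so the original `params` can keep being modified without affecting the restored copy.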
Properties
Name | Type | Description |
---|---|---|
Application | Engine, read-only | Returns the Engine object. |
Main settings | ||
TextLanguage | TextLanguage |
Specifies the language to be used for recognition. This property can be easily set via the SetPredefinedTextLanguage method. Note: The property returns a constant object. To change the recognition language, you must first obtain an intermediate TextLanguage object using an appropriate creation method, change the necessary parameters, and then assign the obtained object to the property. By default, this parameter is initialized with the English language. |
LanguageDetectionMode | ThreeStatePropertyValueEnum |
Manages automatic language detection. When language autodetection is on, the recognition language is detected for each word in the text. It is selected from the list of languages specified in the TextLanguage property. Autodetection is intended for recognizing documents whose language is not known in advance. Important! Language autodetection works only with the predefined languages (see the full list in Predefined Languages in ABBYY FineReader Engine). If you know for certain that all the languages you specified are present in the document, autodetection is unnecessary. Turn it off by setting this property to TSPV_No. You can view the list of languages detected in the recognized document or recognized page using the DetectedLanguages property of the FRDocument or FRPage object. By default, this property value is TSPV_Auto. |
TextTypes | int |
The value of this property is a bitwise OR combination of the TextTypeEnum enumeration constants, which denote the possible text types used for recognition. For example, if it is set to TT_Normal | TT_Index, ABBYY FineReader Engine will presume that the text contains only common typographic text and digits written in ZIP-code style, ignoring all other variants. See also Using Text Type Autodetection. Notes:
By default, this property is set to TT_Normal. |
DetectTextTypesIndependently | VARIANT_BOOL |
Indicates that text type should be determined for each text block separately. This setting is useful when you have comparatively small text blocks with different text types, although it may slightly slow down processing. The default value of this property is FALSE. |
Recognition speed | ||
BalancedMode | VARIANT_BOOL | The BalancedMode and FastMode properties are deprecated and will be removed in future versions. Use the Mode property instead (RM_Normal corresponds to BalancedMode = TRUE and RM_Fast corresponds to FastMode = TRUE). |
FastMode | VARIANT_BOOL | |
Mode | RecognitionModeEnum |
Specifies the recognition mode to be applied to the target documents. Each mode provides its own level of recognition speed and accuracy to get satisfying results on images and documents with various peculiarities. By default, this property is set to RM_Normal. Important! Built-in patterns are always used for the accurate mode. To disable using the built-in patterns, switch to the normal mode (RM_Normal). |
Fine tuning | ||
LowResolutionMode | VARIANT_BOOL |
Specifies whether text on a low-resolution image is recognized. This property is useful when recognizing faxes, small print, and images with low resolution or bad print quality. By default, the value of this property is FALSE. |
OneLinePerBlock | VARIANT_BOOL |
This property set to TRUE tells ABBYY FineReader Engine to presume that the text in the block to which the current RecognizerParams object belongs contains no more than one line. By default, this property is FALSE. |
OneWordPerLine | VARIANT_BOOL |
This property set to TRUE tells ABBYY FineReader Engine to presume that no text line may contain more than one word, so each line of text will be recognized as a single word. By default, this property is FALSE. |
ProhibitItalic | VARIANT_BOOL |
This property set to TRUE tells ABBYY FineReader Engine not to recognize letters printed with italic-styled font. It is useful when a text with presumably no italic letters is recognized, in which case it may speed up the recognition. If there exist any italic letters on the image, and this property is TRUE, these letters will be recognized incorrectly. By default, this property is FALSE. |
ProhibitSubscript | VARIANT_BOOL |
This property set to TRUE tells ABBYY FineReader Engine not to recognize subscript letters. It is useful when a text with presumably no subscripts is recognized, in which case it may speed up the recognition. If there exist any subscript letters on the image, and this property is TRUE, these letters will be recognized incorrectly. By default, this property is FALSE. |
ProhibitSuperscript | VARIANT_BOOL |
This property set to TRUE tells ABBYY FineReader Engine not to recognize superscript letters. It is useful when a text with presumably no superscripts is recognized, in which case it may speed up the recognition. If there exist any superscript letters on the image, and this property is TRUE, these letters will be recognized incorrectly. By default, this property is FALSE. |
ProhibitHyphenation | VARIANT_BOOL |
This property set to TRUE prohibits recognition of hyphenation from line to line. It is useful when a text with presumably no hyphenations is recognized, in which case it may speed up the recognition. If there exist any hyphenations in the recognized block, and this property is TRUE, the hyphenated words will be recognized incorrectly. By default, this property is FALSE. |
ProhibitSmallCaps | VARIANT_BOOL |
This property set to TRUE tells ABBYY FineReader Engine not to recognize small capitals. By default, this property is FALSE. |
ProhibitInterblockHyphenation | VARIANT_BOOL |
This property set to TRUE tells ABBYY FineReader Engine to presume that text from one block cannot be carried over to the next block. By default, this property is FALSE. |
CaseRecognitionMode | CaseRecognitionModeEnum |
This property specifies the mode of letter case recognition. By default, the value of this property is CRM_AutoCase, which corresponds to automatic case recognition. |
WritingStyle | WritingStyleEnum |
Provides additional information about handprinted letters writing style. By default, the value of this property is WS_Auto, which means that the writing style is automatically detected by FineReader Engine. |
FieldMarkingType | FieldMarkingTypeEnum |
This property specifies the type of marking around letters (for example, underline, frame, box, etc.). Note: For marking types where each letter is in a separate cell, use the CellsCount property to set the number of character cells for a recognized block. By default, the value of this property is FMT_SimpleText, which means no marking. |
CellsCount | int |
Specifies the number of character cells in the block. It makes sense only for the field marking types (the FieldMarkingType property) in which every letter is written in a separate cell. The default value of this property is 1, but you should set the appropriate value to recognize the text correctly. |
User patterns | ||
UseBuiltInPatterns | VARIANT_BOOL |
This property set to TRUE means that ABBYY FineReader Engine will use its own built-in patterns for recognition. Patterns are files that establish the relationship between a character image and the character itself. You may want to set this property to FALSE when you want to use only user patterns for character recognition instead of the standard ABBYY FineReader Engine patterns. This may be useful for recognizing text typed in decorative or non-standard fonts. In this case, it is better to use your own user-defined patterns trained for these fonts rather than the ABBYY FineReader Engine built-in patterns. The path to the user-defined pattern file is stored in the UserPatternsFile property. If the UserPatternsFile property is empty, the UseBuiltInPatterns property is ignored. By default, this property is TRUE. Important! You may set this property to FALSE only when using the normal and fast recognition modes. You cannot prohibit using the built-in patterns for the accurate mode (see the description of the Mode property). |
UserPatternsFile | BSTR |
Contains the full path to the user pattern file used for recognition. If the value of this property is not empty, information from the user pattern file will be used during recognition. If the UseBuiltInPatterns property is FALSE, which means that the standard ABBYY FineReader Engine patterns are not used during recognition, this property should contain the path to a user-defined pattern file, as only the information stored in it will be used. By default, this property stores an empty string. |
Additional recognition information | ||
HighlightSuspiciousCharacters | VARIANT_BOOL |
Specifies whether uncertainly recognized characters should have the IsSuspicious property set to TRUE. The name of the property reflects the fact that ABBYY FineReader highlights suspicious characters in text with a background color, which makes manual verification easier for the operator. By default, this property is TRUE. |
ExactConfidenceCalculation | VARIANT_BOOL |
If this property is TRUE, character and word confidence will be calculated more accurately, but recognition may become slower. Note: The value of character confidence is stored in the CharConfidence property of the CharacterRecognitionVariant and PlainText objects. The value of word confidence is stored in the WordConfidence property of the WordRecognitionVariant object. This property is automatically set to TRUE if the SaveCharacterRecognitionVariants or SaveWordRecognitionVariants property is TRUE. By default, this property is FALSE. |
SaveCharacterRegions | VARIANT_BOOL |
Specifies whether the exact character regions (ICharParams::CharacterRegion) are saved. The default value is FALSE. |
SaveCharacterRecognitionVariants | VARIANT_BOOL |
Specifies whether character recognition variants are saved. Note: The ICharParams::CharacterRecognitionVariants property returns a collection of recognition variants for a character. See also Using Voting API. The default value is FALSE. |
SaveWordRecognitionVariants | VARIANT_BOOL |
Specifies whether the variants of recognition of a word are saved. Note: The IParagraph::GetWordRecognitionVariants method and ICharParams::WordRecognitionVariants property return a collection of recognition variants for a word. See also Using Voting API. The default value is FALSE. |
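Two rules from the property table can be illustrated in code: how TextTypes combines several text types into one value, and when built-in patterns take part in recognition. The following is a minimal sketch that models the rules in plain Python and does not call the Engine. The `TextType` class is a stand-in for TextTypeEnum with placeholder numeric values (the real values come from the Engine's type library), TT_Typewriter and RM_Accurate are constant names not mentioned on this page, and `builtin_patterns_in_effect` is a hypothetical helper. It also reads "UseBuiltInPatterns is ignored" as "built-in patterns are used", which is an interpretation.

```python
from enum import IntFlag


class TextType(IntFlag):
    """Stand-in for TextTypeEnum; the numeric values are placeholders."""
    TT_Normal = 1
    TT_Typewriter = 2
    TT_Index = 4


def builtin_patterns_in_effect(mode, use_builtin_patterns, user_patterns_file):
    """Return True if built-in patterns take part in recognition."""
    # The accurate mode always uses built-in patterns.
    if mode == "RM_Accurate":
        return True
    # With no user pattern file, UseBuiltInPatterns is ignored
    # (interpreted here as: built-in patterns are used).
    if not user_patterns_file:
        return True
    return bool(use_builtin_patterns)


# TextTypes holds several text types at once as a bitwise OR of constants.
text_types = TextType.TT_Normal | TextType.TT_Index
print(TextType.TT_Index in text_types)       # True
print(TextType.TT_Typewriter in text_types)  # False
```

Under this reading, setting UseBuiltInPatterns to FALSE only has an effect in the normal or fast mode and only when UserPatternsFile is not empty.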
Methods
Name | Description |
---|---|
CopyFrom | Initializes properties of the current object with values of similar properties of another object. |
LoadFromFile | Restores the object contents from a file on disk. |
SaveToFile | Saves the object contents into a file on disk. |
SetPredefinedTextLanguage | Sets the language of recognition to be one of the predefined ABBYY FineReader Engine languages. |
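As an illustration of how these methods might be combined, the sketch below creates a new object, optionally copies property values from an existing one, and then selects a predefined language. It assumes Python access to the COM objects and an already loaded `engine`. The argument shapes are assumptions not confirmed by this page: SetPredefinedTextLanguage is taken to accept the language name as a string, and CopyFrom to accept the source object. The helper name `make_params` is hypothetical.

```python
def make_params(engine, language_name, template=None):
    """Create a RecognizerParams object and set its recognition language.

    Assumptions: SetPredefinedTextLanguage takes the language name as a
    string, and CopyFrom takes the object to copy property values from.
    """
    # CreateRecognizerParams is the Engine method that returns this object.
    params = engine.CreateRecognizerParams()
    if template is not None:
        # Start from the property values of an existing object.
        params.CopyFrom(template)
    # Select one of the predefined recognition languages by name.
    params.SetPredefinedTextLanguage(language_name)
    return params
```

A call such as `make_params(engine, "English")` would then yield an object with default property values and the default recognition language set explicitly.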
Related objects
Output parameter
This object is the output parameter of the CreateRecognizerParams method of the Engine object.
Input parameter
This object is passed as an input parameter to the following methods:
- Preprocess, PreprocessPages, Analyze, AnalyzePages methods of the FRDocument object
- Preprocess, Analyze, AnalyzeRegion, AnalyzeTable, IsEmpty methods of the FRPage object
Samples
The object is used in the following code samples: CustomLanguage, CommandLineInterface.
See also
Tuning Parameters of Preprocessing, Analysis, Recognition, and Synthesis
7/3/2024 8:50:25 AM