PlainText Object (IPlainText Interface)
This object represents recognized text in a special "plain text" format. It provides only the recognized text symbols, their recognition confidence, and their positions relative to the source image. You can retrieve this information either for a single character or for all characters in the text.
Note: For barcode recognition, the Text property of the PlainText object is empty. To obtain the recognized text of a one-page document that contains a single barcode, use the Text property of the first block in the layout:
document.Pages[0].Layout.Blocks[0].GetAsBarcodeBlock().Text
Properties
Name | Type | Description |
---|---|---|
Application | Engine, read-only | Returns the Engine object. |
Main attributes | ||
Text | BSTR, read-only |
Provides access to the whole recognized text as a Unicode string. This string may contain special characters.
Note: If the image contains tables, the text from the table cells is stored in logical reading order (left to right, top to bottom). |
SymbolsCount | int, read-only | Returns the number of symbols in the text, including the special characters. |
PageNumber | int, read-only | Takes the index of a symbol in the recognized text as the input parameter. Returns the number of the page on which the specified symbol is located. |
Voting API | ||
CharConfidence | int, read-only |
Returns the recognition confidence of the character. This is the confidence of the main (selected) recognition variant.
Confidence estimates the probability that a recognition variant is correct. It should not be treated as a general measure of recognition quality: the only safe use of confidence is to compare recognition variants of the same character. Characters extracted from the source PDF file without recognition have a confidence of 100. See also What is the difference between the CharConfidence and the IsSuspicious properties? To calculate character confidence more accurately, set the IRecognizerParams::ExactConfidenceCalculation property to TRUE. |
IsSuspicious | VARIANT_BOOL, read-only |
Takes the index of a symbol in the recognized text as the input parameter. Returns TRUE if the character was recognized unreliably. In ABBYY FineReader, suspicious characters are highlighted with a background color in the recognized text. See also What is the difference between the CharConfidence and the IsSuspicious properties? |
Character coordinates | ||
Bottom | int, read-only | Takes the index of a symbol in the recognized text as the input parameter. Returns the coordinate of the bottom border of the symbol's rectangle relative to the deskewed black-and-white plane of the source image. |
Left | int, read-only | Takes the index of a symbol in the recognized text as the input parameter. Returns the coordinate of the left border of the symbol's rectangle relative to the deskewed black-and-white plane of the source image. |
Right | int, read-only | Takes the index of a symbol in the recognized text as the input parameter. Returns the coordinate of the right border of the symbol's rectangle relative to the deskewed black-and-white plane of the source image. |
Top | int, read-only | Takes the index of a symbol in the recognized text as the input parameter. Returns the coordinate of the top border of the symbol's rectangle relative to the deskewed black-and-white plane of the source image. |
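The per-symbol coordinates can be combined to locate a word or phrase on the image. The following is a minimal Python sketch that does not call the Engine: it works on stand-in data that you would fill from the Text, Left, Top, Right, and Bottom properties. The Rect type and the functions union_rect and substring_rect are hypothetical helpers, not part of the API. The sketch assumes one rectangle per symbol index, that all matched symbols are on the same page, and that the vertical axis grows downward (Top is less than Bottom).

```python
from typing import NamedTuple, Optional, Sequence


class Rect(NamedTuple):
    """Hypothetical container for one symbol's rectangle."""
    left: int
    top: int
    right: int
    bottom: int


def union_rect(rects: Sequence[Rect]) -> Rect:
    """Return the smallest rectangle that encloses all given rectangles."""
    if not rects:
        raise ValueError("at least one rectangle is required")
    return Rect(
        left=min(r.left for r in rects),
        top=min(r.top for r in rects),        # assumes y grows downward
        right=max(r.right for r in rects),
        bottom=max(r.bottom for r in rects),
    )


def substring_rect(text: str, rects: Sequence[Rect], needle: str) -> Optional[Rect]:
    """Bounding rectangle of the first occurrence of needle, or None."""
    if len(text) != len(rects):
        raise ValueError("expected one rectangle per symbol")
    start = text.find(needle)
    if start < 0 or not needle:
        return None
    return union_rect(rects[start:start + len(needle)])


# Stand-in data; in real use these values would come from the PlainText object.
TEXT = "Hi OCR"
RECTS = [
    Rect(10, 20, 18, 40),  # H
    Rect(20, 22, 24, 40),  # i
    Rect(24, 20, 30, 40),  # space
    Rect(30, 19, 44, 41),  # O
    Rect(46, 20, 58, 40),  # C
    Rect(60, 20, 72, 42),  # R
]

print(substring_rect(TEXT, RECTS, "OCR"))
# Rect(left=30, top=19, right=72, bottom=42)
```

The resulting coordinates are in the same coordinate space as the inputs, that is, relative to the deskewed black-and-white plane of the source image.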
Methods
Name | Description |
---|---|
GetCharacterData | Returns the information about all characters in the text as a set of arrays: the page numbers on which the characters are located, the coordinates of characters' rectangles, and characters' confidences. |
SaveToAsciiXMLFile | Saves the recognized text into an XML file. |
SaveToTextFile | Saves the recognized text into a text file with the specified encoding. |
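When information for all characters is retrieved at once as parallel arrays, each array position corresponds to one symbol of the text. The following minimal Python sketch splits the recognized text into per-page strings. It does not call GetCharacterData itself: the page_numbers list is stand-in data, and split_text_by_page is a hypothetical helper. It assumes the page-number array has exactly one entry per symbol of the text and makes no assumption about whether page numbers start at 0 or 1.

```python
from typing import Dict, List, Sequence


def split_text_by_page(text: str, page_numbers: Sequence[int]) -> Dict[int, str]:
    """Group the symbols of text by the page number stored at the same index."""
    if len(text) != len(page_numbers):
        raise ValueError("expected one page number per symbol")
    pages: Dict[int, List[str]] = {}
    for symbol, page in zip(text, page_numbers):
        # Symbols keep their original order within each page.
        pages.setdefault(page, []).append(symbol)
    return {page: "".join(symbols) for page, symbols in pages.items()}


# Stand-in data: five symbols, the first three on one page, the rest on the next.
sample_text = "AB CD"
sample_pages = [0, 0, 0, 1, 1]
per_page = split_text_by_page(sample_text, sample_pages)
```

The same indexing scheme applies to the coordinate and confidence arrays, so they can be grouped by page in the same pass if needed.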
9/17/2024 3:14:40 PM