PlainText Object (IPlainText Interface)

This object represents recognized text in a special "plain text" format. It provides information only about the recognized characters, their recognition confidence, and their positions relative to the source image. You can retrieve this information either for a single character or for all characters in the text at once.

Note: For barcode recognition, the Text property of the PlainText object is empty. To obtain the recognized text for a one-page document containing a single barcode, use the Text property of the first block in the layout:

document.Pages[0].Layout.Blocks[0].GetAsBarcodeBlock().Text
    

Properties

Name Type Description
Application Engine, read-only Returns the Engine object.
Main attributes
Text BSTR, read-only

Provides access to the whole recognized text as a Unicode string. This string may contain the following special characters:

  • 0x2028 — Line break symbol
  • 0x2029 — Paragraph break symbol
  • 0xFFFC — Object replacement character (denotes an embedded picture inside the text)
  • 0x0009 — Tabulation
  • 0x005E — Circumflex accent (^), used by ABBYY FineReader Engine as a replacement for unrecognized characters
  • 0x00AC — Soft hyphen

Note: If the image contains tables, the text from the table cells is stored in logical reading order (left to right, top to bottom).
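The special characters listed above can be used to restore the paragraph and line structure of the text. The following Python sketch works on an ordinary string that stands in for the value of the Text property; it does not call ABBYY FineReader Engine. The helper names split_plain_text and count_unrecognized and the sample string are hypothetical and serve only as an illustration.

```python
# Special characters that, according to the list above, may occur in Text.
LINE_BREAK = "\u2028"
PARAGRAPH_BREAK = "\u2029"
UNRECOGNIZED = "\u005e"  # circumflex accent (^)


def split_plain_text(text):
    """Return a list of paragraphs, each paragraph being a list of lines.

    Empty paragraphs are dropped, so the result is the same whether the
    paragraph break character separates or terminates paragraphs
    (an assumption; the source does not say which).
    """
    paragraphs = []
    for paragraph in text.split(PARAGRAPH_BREAK):
        if paragraph:
            paragraphs.append(paragraph.split(LINE_BREAK))
    return paragraphs


def count_unrecognized(text):
    """Count circumflex characters, which mark unrecognized symbols.

    A genuine '^' printed in the source document cannot be told apart
    from the replacement character by this check alone.
    """
    return text.count(UNRECOGNIZED)


# Hypothetical sample standing in for the value of the Text property.
sample = "Invoice\u2028No. 12^4\u2029Total\t42\u2029"
print(split_plain_text(sample))
print(count_unrecognized(sample))
```

Tabulation characters are left inside the lines here; whether to split on them depends on how table cells should be handled downstream.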

SymbolsCount int, read-only Returns the number of symbols in the text, including the special characters.
PageNumber int, read-only Takes the index of a symbol in the recognized text as the input parameter and returns the number of the page on which that symbol is located.
Voting API
CharConfidence int, read-only

Returns the recognition confidence of the character. This is the confidence of the main (selected) recognition variant.

Confidence estimates the probability that a recognition variant is correct. It should not be treated as a general measure of recognition quality: the only safe use of confidence is for comparing recognition variants of the same character. Characters extracted from the source PDF file without recognition have their confidence set to 100. See also "What is the difference between the CharConfidence and the IsSuspicious properties?"

To calculate character confidence more accurately, set the IRecognizerParams::ExactConfidenceCalculation property to TRUE.

IsSuspicious VARIANT_BOOL, read-only

Takes the index of a symbol in the recognized text as the input parameter and returns TRUE if the character was recognized unreliably.

In ABBYY FineReader, suspicious characters are highlighted with a background color in the recognized text. See also "What is the difference between the CharConfidence and the IsSuspicious properties?"
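A similar highlighting can be reproduced once the per-character flags have been collected. The sketch below is plain Python and does not call the Engine: it assumes the IsSuspicious values have already been read into a list of booleans, one per symbol, and that symbol indices map one-to-one to positions in the Python string. The function names and the sample data are hypothetical.

```python
def suspicious_runs(flags):
    """Return half-open (start, end) index ranges of consecutive True flags."""
    runs = []
    start = None
    for index, flag in enumerate(flags):
        if flag and start is None:
            start = index                 # a new run begins
        elif not flag and start is not None:
            runs.append((start, index))   # the current run ends
            start = None
    if start is not None:
        runs.append((start, len(flags)))  # run reaches the end of the text
    return runs


def highlight(text, flags):
    """Wrap every run of suspicious characters in square brackets."""
    if len(text) != len(flags):
        raise ValueError("text and flags must have the same length")
    parts = []
    last = 0
    for start, end in suspicious_runs(flags):
        parts.append(text[last:start])
        parts.append("[" + text[start:end] + "]")
        last = end
    parts.append(text[last:])
    return "".join(parts)


# Hypothetical sample: symbols 2 and 7 were flagged as suspicious.
text = "he1lo w0rld"
flags = [False] * len(text)
flags[2] = True
flags[7] = True
print(highlight(text, flags))
```

Grouping adjacent flags into runs keeps the markup compact when several neighboring characters are unreliable.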

Character coordinates
Bottom int, read-only Takes the index of a symbol in the recognized text as the input parameter and returns the coordinate of the bottom border of the symbol's rectangle, relative to the deskewed black-and-white plane of the source image.
Left int, read-only Takes the index of a symbol in the recognized text as the input parameter and returns the coordinate of the left border of the symbol's rectangle, relative to the deskewed black-and-white plane of the source image.
Right int, read-only Takes the index of a symbol in the recognized text as the input parameter and returns the coordinate of the right border of the symbol's rectangle, relative to the deskewed black-and-white plane of the source image.
Top int, read-only Takes the index of a symbol in the recognized text as the input parameter and returns the coordinate of the top border of the symbol's rectangle, relative to the deskewed black-and-white plane of the source image.
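Per-character rectangles are often combined into the rectangle of a word or a line. The sketch below is plain Python operating on four lists of integers that stand in for the Left, Top, Right, and Bottom values of consecutive symbols; it does not call the Engine. It assumes the usual image convention in which the vertical coordinate grows downward (Top is less than Bottom) and that every symbol in the range has a meaningful rectangle, which may not hold for special characters such as line breaks. The function name and the sample values are hypothetical.

```python
def bounding_box(left, top, right, bottom, start, end):
    """Return (left, top, right, bottom) enclosing symbols start..end-1.

    Assumes image coordinates with the vertical axis growing downward,
    so the enclosing top is the minimum and the bottom the maximum.
    """
    if not (len(left) == len(top) == len(right) == len(bottom)):
        raise ValueError("coordinate lists must have the same length")
    if not 0 <= start < end <= len(left):
        raise ValueError("invalid symbol range")
    return (
        min(left[start:end]),
        min(top[start:end]),
        max(right[start:end]),
        max(bottom[start:end]),
    )


# Hypothetical rectangles of three consecutive symbols.
left = [10, 22, 35]
top = [5, 4, 6]
right = [20, 33, 47]
bottom = [25, 26, 24]
print(bounding_box(left, top, right, bottom, 0, 3))
```

Because the coordinates refer to the deskewed black-and-white plane, the resulting rectangle applies to that plane rather than to the original, possibly skewed, image.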

Methods

Name Description
GetCharacterData Returns the information about all characters in the text as a set of arrays: the page numbers on which the characters are located, the coordinates of characters' rectangles, and characters' confidences.
SaveToAsciiXMLFile Saves the recognized text into an XML file.
SaveToTextFile Saves the recognized text into a text file with the specified encoding.
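Retrieving the data for all characters at once makes per-page processing straightforward. The sketch below is plain Python and does not call the Engine: it assumes the text and the array of page numbers described for GetCharacterData have already been copied into a string and a list of equal length. The function name text_by_page, the page numbering in the sample, and the sample data are hypothetical.

```python
def text_by_page(text, page_numbers):
    """Group the characters of text by the page number of each symbol."""
    if len(text) != len(page_numbers):
        raise ValueError("text and page_numbers must have the same length")
    pages = {}
    for char, page in zip(text, page_numbers):
        pages.setdefault(page, []).append(char)
    # Join the collected characters back into one string per page.
    return {page: "".join(chars) for page, chars in pages.items()}


# Hypothetical sample: three symbols on one page, two on the next.
text = "AB\u2029CD"
page_numbers = [0, 0, 0, 1, 1]
print(text_by_page(text, page_numbers))
```

The same pattern extends to the coordinate and confidence arrays, since every array is indexed by symbol position.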

Related objects

Object Diagram

See also

Working with Text

Working with Properties

9/17/2024 3:14:40 PM
