Paragraph Object (IParagraph Interface)
This object exposes methods and properties for working with a single paragraph of the recognized text.
A paragraph in the ABBYY FineReader Engine object model is an elementary text unit. It is through this object that a user can get:
- the recognized text (use the Text property for this purpose)
- different paragraph parameters (ExtendedParams, ListParams, ParagraphStyle properties)
- collections of paragraph lines and words (Lines and Words properties)
- the parameters of a single character (GetCharParams, SetCharParams and GetDropCapCharParams methods)
- bookmarks (Bookmark and UserBookmark properties)
Notes:
- The coordinates of the paragraph borders (Left, Top, Right, Bottom properties) are not available for the paragraphs of barcodes.
- Bookmarks in ABBYY FineReader Engine are either internal (technical) or custom entities whose names are encoded using keywords (prefixes). These keywords and their vocabulary may vary depending on the version of the technologies used.
Properties
Name | Type | Description |
---|---|---|
Application | Engine, read-only | Returns the Engine object. |
Paragraph text, words, lines | ||
Text | BSTR, read-only |
Provides access to the recognized text of the paragraph in the form of a Unicode string. It is through this property that you get the recognized text. This string may contain the following special characters:
Note: If the paragraph has a right-to-left writing direction (as in Hebrew), the text of the paragraph is a string that contains the characters of the paragraph in the order they are read. For example, the Hebrew text will be returned as the string "". Note that the recognized text may differ slightly from the original, because some input symbols can be replaced with a special character. For example, the "..." symbol can be replaced with a tab character. That is why the number of symbols in the recognized text can differ from the original. To access the input word with no replaced symbols, use IWord::Text. |
Words | Words, read-only |
Provides access to the collection of the paragraph words. Note: In contrast to the Text property, if the paragraph has a right-to-left writing direction (as in Hebrew), each word in the paragraph is a string that contains the characters of the word from left to right. For example, the Hebrew word will be returned as the string "". |
Lines | ParagraphLines, read-only | Provides access to the collection of the paragraph lines. The property returns a constant object. |
Additional paragraph elements | ||
BookmarkCount | int, read-only | Returns the number of bookmarks in the paragraph. |
Bookmark | BSTR, read-only | Provides access to the bookmark of any type (technical or user) by its index in the internal collection of the paragraph's bookmarks. The bookmark accessed via this property contains a prefix in its name. |
Hyperlink | Hyperlink, read-only | Returns a reference to the Hyperlink object which describes the hyperlink in the position. If there is no hyperlink, this property is set to 0. |
TabPositions | TabPositions, read-only | Provides access to all tab stops in the paragraph. |
UserBookmark | BSTR, read-only | Provides access to the user bookmark by its index in the internal collection of the paragraph's bookmarks. The bookmark accessed via this property does not contain a prefix in its name. |
UserBookmarkCount | int, read-only | Returns the number of user bookmarks in the paragraph. |
Paragraph attributes | ||
Length | int, read-only |
This property contains the number of characters in the paragraph. This value is the same as the number of characters in the string received through the Text property. Note: The paragraph break symbol at the end of the paragraph is included in the Text property and counted in the Length property. |
ExtendedParams | ParagraphParams | Provides access to the parameters of the Paragraph object exposed by the ParagraphParams object. |
ListParams | ListParams, read-only | Provides access to the parameters of the list to which the paragraph belongs. If the paragraph is not in a list, the IListParams::List property returns NULL. |
ParagraphStyle | ParagraphStyle |
Provides access to the parameters of the paragraph style. These parameters become accessible only after document synthesis. Note: The property returns a constant object. |
DropCapCharsCount | int | Provides access to the number of characters in the dropped capital of a paragraph. The first DropCapCharsCount symbols of the paragraph are treated as the dropped capital. This property is not changed when the paragraph is edited, so it may be greater than the length of the paragraph. |
ColumnNumber | int, read-only | Stores the number of the column to which the character in the position belongs. |
Coordinates | ||
Bottom | int, read-only |
Stores the coordinate of the bottom border of the paragraph as it is positioned on the image. Note: This property is not available for the paragraphs of barcodes. |
Left | int, read-only |
Stores the coordinate of the left border of the paragraph as it is positioned on the image. Note: This property is not available for the paragraphs of barcodes. |
Right | int, read-only |
Stores the coordinate of the right border of the paragraph as it is positioned on the image. Note: This property is not available for the paragraphs of barcodes. |
Top | int, read-only |
Stores the coordinate of the top border of the paragraph as it is positioned on the image. Note: This property is not available for the paragraphs of barcodes. |
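The notes on the Text and Words properties above describe two different character orders for right-to-left text. The following is a minimal Python sketch of that difference, not FineReader Engine code: the helper `to_visual_ltr` is hypothetical, and it assumes a purely right-to-left word with no digits or Latin characters, for which the left-to-right order is simply the reverse of the reading order. Mixed-direction text would need the Unicode Bidirectional Algorithm instead.

```python
# Illustrative model only; no FineReader Engine API is called here.

# Reading (logical) order, as the Text property is described to return it:
# shin, lamed, vav, final mem.
logical = "\u05e9\u05dc\u05d5\u05dd"


def to_visual_ltr(word: str) -> str:
    """Return the characters of a purely right-to-left word in the order
    they appear on the page when scanned from left to right.

    Assumption: the word has no digits or left-to-right characters,
    so a plain reversal is sufficient.
    """
    return word[::-1]


# Left-to-right order, as the Words note describes for a single word.
visual = to_visual_ltr(logical)

# Print code points rather than glyphs so the order is unambiguous
# regardless of how the terminal renders right-to-left text.
print([hex(ord(ch)) for ch in logical])
print([hex(ord(ch)) for ch in visual])
```

When comparing a word from the Words collection against a slice of Text for a right-to-left paragraph, keep this difference in order in mind before testing the strings for equality.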
Methods
Name | Description |
---|---|
DeleteBookmark | Deletes the specified bookmark of any type (technical or user) from the paragraph. |
GetBookmarkRange | Detects the index of the initial character and the length of the string that forms the bookmark by its name. |
GetCharParams | Provides access to parameters of a single character. |
GetDropCapCharParams | Provides access to the parameters of a paragraph's dropped capital. |
GetHyperlinkRange | Analyzes a single hyperlink character and detects the index of the initial character and the length of the string that forms the hyperlink. |
GetWordRecognitionVariants | Returns a collection of variants of a word's recognition in the current position inside the text of a paragraph. |
Insert | Inserts a string into the text of the paragraph. |
InsertParagraphBreak | Divides the paragraph into two parts. |
InsertTab | Inserts a tab stop into the chosen text position. |
InsertText | Inserts the specified text into the text of the paragraph. |
NextGroup | Finds the next character in the paragraph for which the selected parameters differ from the character with which the search begins. This method can be used to find all bold or italic words in the paragraph, all uncertainly recognized characters, etc. |
Range | Returns a substring from the text of the paragraph. |
Remove | Deletes a range from the text of the paragraph. |
SetBookmark | Sets a user bookmark to a string within a paragraph. |
SetCharParams | Sets parameters for a group of characters. |
SetHyperlink | Sets a hyperlink to a string within a paragraph. |
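The NextGroup description above implies a simple scanning loop: start at a position, jump to the next character whose selected parameters differ, and repeat. The sketch below models that loop in Python on plain data. It does not call FineReader Engine: `CharStyle`, `next_group` and `bold_runs` are hypothetical names, the real method's signature is not shown in this section, and returning the text length when no differing character is found is an assumption made for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CharStyle:
    """Hypothetical stand-in for per-character parameters."""
    bold: bool = False
    italic: bool = False


def next_group(styles, start, key):
    """Return the index of the first character after `start` whose selected
    parameter (chosen by `key`) differs from that of the character at `start`.
    Returns len(styles) if no such character exists (an assumption)."""
    reference = key(styles[start])
    pos = start + 1
    while pos < len(styles) and key(styles[pos]) == reference:
        pos += 1
    return pos


def bold_runs(text, styles):
    """Collect every maximal run of bold characters in `text`."""
    runs = []
    pos = 0
    while pos < len(text):
        end = next_group(styles, pos, lambda s: s.bold)
        if styles[pos].bold:
            runs.append(text[pos:end])
        pos = end
    return runs


text = "plain BOLD plain"
# Characters 6..9 ("BOLD") are marked bold; everything else is regular.
styles = [CharStyle(bold=(6 <= i < 10)) for i in range(len(text))]
print(bold_runs(text, styles))
```

The same loop finds italic runs or uncertainly recognized characters by changing the `key` function, which mirrors the idea of selecting which parameters NextGroup compares.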
Related objects
Output parameter
This object is the output parameter of the following methods:
- Item method of the Paragraphs object
Input parameter
This object is the input parameter of the IndexOf method of the Paragraphs object.
Samples
C# code
The object is used in the following code samples: CustomLanguage, RecognizedTextProcessing; and demo tools: Camera OCR, Engine Predefined Processing Profiles, Image Preprocessing.
17.09.2024 15:14:41