TextExportParams Object (ITextExportParams Interface)
This object lets you tune the parameters that the ABBYY FineReader Engine export functions use when saving recognized text in TXT or CSV format. To select the export format, use the ExportFormat property. In CSV export, the following formatting applies:
- Original lines are retained
- Lines containing separator symbols are enclosed in double quotes (" ")
- Quotes inside a quoted line are doubled
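The quoting rules above can be illustrated with a minimal Python sketch. It is not the Engine's implementation: the function name `quote_csv_line` and the comma used as the default separator are assumptions made for illustration, and the sketch follows only the rules listed here (text is quoted when it contains the separator, and embedded quotes are doubled).

```python
def quote_csv_line(text: str, separator: str = ",") -> str:
    """Quote text the way the CSV rules above describe (illustrative only)."""
    # Text without the separator is written unchanged.
    if separator not in text:
        return text
    # Double every embedded quote, then wrap the whole text in quotes.
    return '"' + text.replace('"', '""') + '"'


for sample in ['plain text', 'left, right', 'say "hi", then go']:
    print(quote_csv_line(sample))
```

Generic CSV writers usually also quote text that contains quote characters or line breaks even when no separator is present; whether the Engine does so is not stated here, so the sketch leaves such text untouched.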
A pointer to this object is passed to the export methods as an input parameter and therefore affects the results of export. All properties of a newly created object of this type are set to reasonable defaults. For the default value of a particular property, see the description of that property.
The TextExportParams object is a persistent object. This means that the object's current state can be written to persistent storage on disk. Later, the object can be re-created by reading its state from persistent storage. The following methods provide persistence of the object: SaveToFile and LoadFromFile.
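The save-and-restore cycle described above can be modelled with a small Python stand-in. This is a sketch under stated assumptions, not the Engine API: the class `TextExportSettings`, its snake_case field names, and the JSON file format are hypothetical. Only the default values are taken from the property descriptions on this page, and the format the Engine actually writes is not described here.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass


@dataclass
class TextExportSettings:
    """Hypothetical stand-in for TextExportParams; not an Engine class."""
    export_format: str = "TEF_TXT"
    export_paragraphs_as_one_line: bool = False
    insert_empty_line_between_paragraphs: bool = False
    layout_retention_mode: str = "TXTLRM_Auto"
    use_page_breaks: bool = False
    append_to_end: bool = False
    write_bom_character: bool = True
    write_running_titles: bool = True
    tab_separator: str = "\t"
    encoding_type: str = "TET_Auto"
    code_page: str = "CP_Null"

    def save_to_file(self, path: str) -> None:
        # JSON is an assumed storage format chosen for the sketch.
        with open(path, "w", encoding="utf-8") as f:
            json.dump(asdict(self), f)

    def load_from_file(self, path: str) -> None:
        # Overwrite the current state with the values stored on disk.
        with open(path, encoding="utf-8") as f:
            for name, value in json.load(f).items():
                setattr(self, name, value)


path = os.path.join(tempfile.mkdtemp(), "text_export_settings.json")
saved = TextExportSettings(use_page_breaks=True)
saved.save_to_file(path)

restored = TextExportSettings()  # starts from the defaults
restored.load_from_file(path)
print(restored == saved)
```

Keeping a tuned set of export parameters in a file this way lets the same configuration be reused across runs instead of being set property by property each time.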
Properties
Name | Type | Description |
---|---|---|
Application | Engine, read-only | Returns the Engine object. |
Main settings | ||
ExportFormat | TXTExportFormatEnum | Specifies the export format: TXT, CSV with full layout retained, or CSV with text from tables only. The default value is TEF_TXT, which exports to TXT format. |
Format settings | ||
ExportParagraphsAsOneLine | VARIANT_BOOL | Specifies whether each paragraph of the recognized text is exported as a single line. This property is ignored if the LayoutRetentionMode property is set to a non-default value. This property is FALSE by default. |
InsertEmptyLineBetweenParagraphs | VARIANT_BOOL | Specifies whether an empty line is inserted between paragraphs. This property is ignored if the LayoutRetentionMode property is set to a non-default value. This property is FALSE by default. |
LayoutRetentionMode | TextLayoutRetentionModeEnum | Controls how the original layout is emulated during export to TXT format. The default value of this property is TXTLRM_Auto. |
RetainLayout | VARIANT_BOOL | This property is deprecated and will be removed in future versions. To simulate the original layout using spaces, set the LayoutRetentionMode property to TXTLRM_ExactCopy. This property is FALSE by default. |
UsePageBreaks | VARIANT_BOOL | Specifies whether page break symbols (0x12) are inserted between pages when multiple pages are exported to TXT or CSV format. This property is FALSE by default. |
AppendToEnd | VARIANT_BOOL | Specifies whether the exported text is appended to the end of the output file if that file already exists. This property is FALSE by default. |
WriteBomCharacter | VARIANT_BOOL | Specifies whether a byte order mark (BOM) is written at the start of the text stream when the document is exported to TXT format in UTF-8 encoding. This property is TRUE by default. |
WriteRunningTitles | VARIANT_BOOL | Specifies whether running titles are saved to the output TXT file. This property is TRUE by default. |
TabSeparator | BSTR | Specifies the string that replaces table separators in the exported text. This property is taken into account during export to CSV and TXT formats. The default value is "\t". |
Encoding | ||
EncodingType | TextEncodingTypeEnum | Specifies the encoding type of the output file in TXT or CSV format. This property is TET_Auto by default, which means that the encoding is selected automatically. |
CodePage | CodePageEnum | Specifies the code page to which the recognized text is exported. The value of this property is taken into account only when the EncodingType property is set to TET_Simple (the exported text is not Unicode). If this property does not specify a code page (CP_Null), the code page is selected automatically. By default, this property is CP_Null. |
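The effect of the paragraph and BOM settings on the output can be pictured with a short Python sketch. It only models the behavior described for ExportParagraphsAsOneLine, InsertEmptyLineBetweenParagraphs, and WriteBomCharacter; it does not call the Engine. The function names, the use of "\n" as the line ending, and the single space used to join lines within a paragraph are assumptions.

```python
import codecs


def layout_paragraphs(paragraphs, one_line=False, empty_line_between=False):
    """Join paragraphs (each a list of text lines) into one string."""
    # one_line models ExportParagraphsAsOneLine: lines within a paragraph
    # are merged (joining with a space is an assumption of this sketch).
    line_joiner = " " if one_line else "\n"
    # empty_line_between models InsertEmptyLineBetweenParagraphs.
    paragraph_joiner = "\n\n" if empty_line_between else "\n"
    return paragraph_joiner.join(line_joiner.join(lines) for lines in paragraphs)


def encode_utf8(text, write_bom=True):
    """Encode text as UTF-8, optionally prefixed with the byte order mark."""
    data = text.encode("utf-8")
    return codecs.BOM_UTF8 + data if write_bom else data


paragraphs = [["First line", "second line"], ["Next paragraph"]]
text = layout_paragraphs(paragraphs, one_line=True, empty_line_between=True)
print(repr(text))
print(encode_utf8(text)[:3] == codecs.BOM_UTF8)
```

With both paragraph flags left at their defaults, the sketch writes every original line on its own line and adds no blank lines, which matches the FALSE defaults in the table.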
Methods
Name | Description |
---|---|
CopyFrom | Initializes properties of the current object with values of similar properties of another object. |
LoadFromFile | Restores the object contents from a file on disk. |
SaveToFile | Saves the object contents into a file on disk. |
Output parameter
This object is the output parameter of the CreateTextExportParams method of the Engine object.
Input parameter
This object is passed as the input parameter to the following methods:
- Export, ExportPages, and ExportToMemory methods of the FRDocument object
- Export method of the FRPage object
- RecognizeImageFile method of the Engine object
Samples
The object is used in the following code samples: CommandLineInterface.
7/3/2024 8:50:25 AM