HTMLExportParams Object (IHTMLExportParams Interface)
This object lets you tune the parameters used when recognized text is exported to HTML format through the ABBYY FineReader Engine export functions. A pointer to this object is passed to the export methods as an input parameter and thus affects the export results. All properties of a newly created object of this type are set to reasonable defaults. For the default value of a particular property, see the description of that property.
The HTMLExportParams object is a persistent object. This means that the object's current state can be written to persistent storage: an area in the global memory or a disk file. Later, the object can be re-created by reading the object's state from persistent storage. The following methods provide persistence of the object: SaveToFile, LoadFromFile, SaveToMemory, and LoadFromMemory.
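The persistence round trip described above can be sketched in Python as follows. This is an illustration under assumptions, not a definitive implementation: `engine` and `params` stand for objects already obtained from the Engine (for example through a COM bridge), the helper name `copy_params_via_file` is hypothetical, and the single path argument of SaveToFile and LoadFromFile is an assumption to check against the descriptions of those methods.

```python
def copy_params_via_file(engine, params, state_path):
    """Write the state of `params` to disk and re-create it in a new object."""
    # Assumed call shape: SaveToFile takes the path of the state file.
    params.SaveToFile(state_path)
    # A fresh object starts with default property values.
    restored = engine.CreateHTMLExportParams()
    # Assumed call shape: LoadFromFile takes the same path and
    # overwrites the defaults with the saved state.
    restored.LoadFromFile(state_path)
    return restored
```

The returned object is a separate instance, so later changes to it do not affect the original parameters object.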
Properties
Name | Type | Description |
---|---|---|
Application | Engine, read-only | Returns the Engine object. |
Format settings | ||
HTMLFormatMode | HTMLFormatModeEnum | Specifies the version of HTML used for export. Note: If you set the value of this property to HFM_Format40 or HFM_Format50, the HTMLSynthesisMode and SeparatePages properties cannot be set to HSM_PlainText and TRUE, respectively. The default value is HFM_Format40. |
HTMLSynthesisMode | HTMLSynthesisModeEnum | Specifies the mode used to synthesize HTML code from the recognized text. There are four synthesis modes: retain paragraphs only, retain paragraphs and fonts, retain the whole structure of the document, and retain an exact copy of the document. Note: If you set the value of this property to HSM_PlainText, the value of the HTMLFormatMode property cannot be set to HFM_Format40 or HFM_Format50, and the value of the UseCss property must be set to FALSE. The default value is HSM_FlexibleLayout, which means that the whole structure of the document is retained. |
UseDocumentStructure | VARIANT_BOOL | Specifies whether the logical structure of the document should be used for recreating the structure of the output document. If this property is FALSE, the structure of the output document is recreated on the basis of the layout blocks. Note: This property cannot be set to TRUE if the value of the SeparatePages property is TRUE. This property is TRUE by default. |
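SeparatePages | VARIANT_BOOL | If this property is TRUE and several pages are exported to HTML format, an <HR> tag is inserted between pages, making the browser display a horizontal rule. Note: If this property is TRUE, the HTMLFormatMode property cannot be set to HFM_Format40 or HFM_Format50, and the UseDocumentStructure and UseCss properties cannot be set to TRUE. The default value is FALSE. |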
UseCss | VARIANT_BOOL | Specifies whether a style sheet should be created. It may be written into a separate .css file or into the head of the HTML document. Note: This property cannot be set to TRUE if the value of the SeparatePages property is TRUE. The default value of this property is TRUE. |
SplitDocumentToFiles | HTMLDocumentSplittingModeEnum | Specifies how the output document is split into files. By default, this property is HDSM_None. |
KeepLines | VARIANT_BOOL | Specifies whether the original lines of the recognized text are retained during export. This property is FALSE by default. |
KeepTextAndBackgroundColor | VARIANT_BOOL | Specifies whether the original text and background colors are retained when the recognized text is exported to HTML format. This property is TRUE by default. |
WriteRunningTitles | VARIANT_BOOL | Specifies whether the running titles should be saved to an output HTML file. This property is TRUE by default. |
MetaDataWritingParams | DocumentContentInfoWritingParams, read-only | Specifies if the author, subject, title, and keywords of the document should be written into the output file. These parameters of the document are defined in the DocumentContentInfo subobject of the FRDocument object. |
Encoding | ||
EncodingType | TextEncodingTypeEnum | Specifies the encoding type of the output file in HTML format. Note: If you want to change the value of this property to TET_Simple, you should first set the correct code page (the CodePage property). The default value of this property is TET_Auto. This means that the encoding is selected according to the constant that was used to specify the export format. |
CodePage | CodePageEnum | This property specifies the code page to which the recognized text is exported. The value of this property is taken into account only when the EncodingType property has the value TET_Simple (exported text is not Unicode), and in this case, the property must specify a valid code page (it cannot be CP_Null). Note: First, set the correct code page and only then change the value of the EncodingType property to TET_Simple. By default, this property is CP_Null. |
Picture embedding | ||
WritePictures | VARIANT_BOOL | Specifies whether pictures must be saved along with the file in HTML format. If pictures are not written, references to them in HTML files are also omitted. The default value is TRUE. |
PictureExportParams | PictureExportParams, read-only | Specifies the image format and JPEG quality which should be used for embedded pictures in the output file. |
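The notes in the table above rule out several combinations of property values. A minimal Python sketch of those rules follows, so that a configuration can be checked before it is applied. The class `HtmlExportSettings` and the function `find_conflicts` are hypothetical and are not part of the Engine API; enumeration values are represented as plain strings, and only the constraints stated in the property descriptions are modeled.

```python
from dataclasses import dataclass
from typing import List


# Hypothetical plain-Python mirror of the documented properties.
# Defaults follow the property descriptions in the table.
@dataclass
class HtmlExportSettings:
    html_format_mode: str = "HFM_Format40"
    html_synthesis_mode: str = "HSM_FlexibleLayout"
    use_document_structure: bool = True
    separate_pages: bool = False
    use_css: bool = True
    encoding_type: str = "TET_Auto"
    code_page: str = "CP_Null"


def find_conflicts(s: HtmlExportSettings) -> List[str]:
    """Return one message for every documented rule the settings break."""
    conflicts = []
    is_40_or_50 = s.html_format_mode in ("HFM_Format40", "HFM_Format50")
    plain_text = s.html_synthesis_mode == "HSM_PlainText"

    if is_40_or_50 and plain_text:
        conflicts.append("HSM_PlainText is not allowed with HFM_Format40/HFM_Format50")
    if is_40_or_50 and s.separate_pages:
        conflicts.append("SeparatePages is not allowed with HFM_Format40/HFM_Format50")
    if plain_text and s.use_css:
        conflicts.append("UseCss must be FALSE when HSM_PlainText is used")
    if s.separate_pages and s.use_document_structure:
        conflicts.append("UseDocumentStructure cannot be TRUE when SeparatePages is TRUE")
    if s.separate_pages and s.use_css:
        conflicts.append("UseCss cannot be TRUE when SeparatePages is TRUE")
    if s.encoding_type == "TET_Simple" and s.code_page == "CP_Null":
        conflicts.append("TET_Simple requires a valid CodePage (not CP_Null)")
    return conflicts
```

With the default values the function reports no conflicts. Switching only `separate_pages` to `True` reports three, one for each of HTMLFormatMode, UseDocumentStructure, and UseCss, which shows that those properties have to be changed before SeparatePages can be enabled.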
Methods
Name | Description |
---|---|
CopyFrom | Initializes properties of the current object with values of similar properties of another object. |
LoadFromFile | Restores the object contents from a file on disk. |
LoadFromMemory | Restores the object contents from the global memory. |
SaveToFile | Saves the object contents into a file on disk. |
SaveToMemory | Saves the object contents into the global memory. |
Related objects
Output parameter
This object is the output parameter of the CreateHTMLExportParams method of the Engine object.
Input parameter
This object is passed as the input parameter to the following methods:
- Export, ExportPages, ExportToMemory methods of the FRDocument object
- Export method of the FRPage object
- RecognizeImageFile method of the Engine object
- OnExportPages, OnSendToPages methods of the IDocumentViewerEvents interface
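The flow from creating the object to passing it into an export method might look like the following Python sketch. It is written against duck-typed `engine` and `document` objects (for example, obtained through a COM bridge), and the helper name `export_html_keeping_lines` is hypothetical. The three-argument shape of the Export call (file name, export format constant, parameters object) is an assumption to verify against the description of the FRDocument object, and the format constant is left to the caller for the same reason.

```python
def export_html_keeping_lines(engine, document, path, export_format):
    """Create HTML export parameters, adjust two properties, and export."""
    # CreateHTMLExportParams returns an object with default property values.
    params = engine.CreateHTMLExportParams()
    params.KeepLines = True        # documented default is FALSE
    params.WritePictures = False   # documented default is TRUE
    # Assumed call shape: file name, format constant, parameters object.
    document.Export(path, export_format, params)
    return params
```

Returning the parameters object lets the caller reuse the same settings for further export calls.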
Samples
The object is used in the following code samples: CommandLineInterface.
17.09.2024 15:14:40