HTMLExportParams Object (IHTMLExportParams Interface)

This object provides functionality for tuning the parameters used when recognized text is exported to HTML format by the ABBYY FineReader Engine export functions. A pointer to this object is passed into the export methods as an input parameter and thus affects the results of export. All properties of a newly created object of this type are set to reasonable defaults. For the default value of a particular property, see the description of that property.

The HTMLExportParams object is a persistent object. This means that the object's current state can be written to persistent storage on disk. Later, the object can be re-created by reading its state from persistent storage. The following methods provide persistence of the object: SaveToFile and LoadFromFile.

Properties

Name Type Description
Application Engine, read-only

Returns the Engine object.

Format settings
HTMLFormatMode HTMLFormatModeEnum

Specifies the version of HTML used for export.

Note: If you set the value of this property to HFM_Format40 or HFM_Format50, the HTMLSynthesisMode and SeparatePages properties cannot be set to HSM_PlainText and TRUE, respectively.

The default value is HFM_Format40.

HTMLSynthesisMode HTMLSynthesisModeEnum

Specifies the mode of synthesizing HTML code from the recognized text. There are four synthesis modes: retain paragraphs only; retain paragraphs and fonts; retain the whole structure of the document; retain the exact copy of the document.

Note: If you set the value of this property to HSM_PlainText, the value of the HTMLFormatMode property cannot be set to HFM_Format40 or HFM_Format50, and the value of the UseCss property must be set to FALSE.

The default value is HSM_FlexibleLayout, which means that the whole structure of the document is retained.

UseDocumentStructure VARIANT_BOOL

Specifies whether the logical structure of the document should be used for recreating the structure of the output document. If this property is FALSE, the structure of the output document is recreated on the basis of the layout blocks.

Note: This property cannot be set to TRUE if the value of the SeparatePages property is TRUE.

This property is TRUE by default.

SeparatePages VARIANT_BOOL

If this property is TRUE and several pages are exported to HTML format, an <HR> tag is inserted between pages, making the browser display a horizontal rule.

Note: If this property is TRUE:

  • the logical structure of the document is not retained (the UseDocumentStructure property cannot be set to TRUE).
  • the value of the HTMLFormatMode property cannot be set to HFM_Format40 or HFM_Format50.
  • the value of the UseCss property must be set to FALSE.

The default value is FALSE.

UseCss VARIANT_BOOL

Specifies whether a style sheet should be created. The style sheet may be written into a separate .css file or into the head of the HTML document.

Note: This property cannot be set to TRUE if the value of the SeparatePages property is TRUE.

The default value of this property is TRUE.

SplitDocumentToFiles HTMLDocumentSplittingModeEnum

Specifies the mode of splitting the output document into files.

By default, this property is HDSM_None.

KeepLines VARIANT_BOOL

Specifies whether the original lines of the recognized text are retained during export.

This property is FALSE by default.

KeepTextAndBackgroundColor VARIANT_BOOL

Specifies whether the original colors of the text and background are retained when the recognized text is exported to HTML format.

This property is TRUE by default.

WriteRunningTitles VARIANT_BOOL

Specifies whether the running titles should be saved to an output HTML file.

This property is TRUE by default.

MetaDataWritingParams DocumentContentInfoWritingParams, read-only

Specifies whether the author, subject, title, and keywords of the document should be written into the output file. These values are defined in the DocumentContentInfo subobject of the FRDocument object.
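
The notes on HTMLFormatMode, HTMLSynthesisMode, UseDocumentStructure, SeparatePages, and UseCss describe combinations that exclude one another. As a rough illustration only, the following Python sketch gathers those rules into a single check. The HtmlExportSettings class and the find_conflicts function are hypothetical names invented for this sketch; they hold the documented property values as plain data and do not call ABBYY FineReader Engine.

```python
from dataclasses import dataclass


@dataclass
class HtmlExportSettings:
    """Hypothetical stand-in for the documented properties (not the Engine API).

    The field defaults mirror the default values stated in this topic.
    """
    html_format_mode: str = "HFM_Format40"
    html_synthesis_mode: str = "HSM_FlexibleLayout"
    use_document_structure: bool = True
    separate_pages: bool = False
    use_css: bool = True


def find_conflicts(settings):
    """Return descriptions of the documented rules that the settings break."""
    conflicts = []
    html_40_or_50 = settings.html_format_mode in ("HFM_Format40", "HFM_Format50")
    plain_text = settings.html_synthesis_mode == "HSM_PlainText"

    if html_40_or_50 and plain_text:
        conflicts.append("HSM_PlainText cannot be combined with HFM_Format40 or HFM_Format50")
    if html_40_or_50 and settings.separate_pages:
        conflicts.append("SeparatePages cannot be TRUE with HFM_Format40 or HFM_Format50")
    if plain_text and settings.use_css:
        conflicts.append("UseCss must be FALSE when HTMLSynthesisMode is HSM_PlainText")
    if settings.separate_pages and settings.use_document_structure:
        conflicts.append("UseDocumentStructure cannot be TRUE when SeparatePages is TRUE")
    if settings.separate_pages and settings.use_css:
        conflicts.append("UseCss must be FALSE when SeparatePages is TRUE")
    return conflicts


if __name__ == "__main__":
    # Changing only SeparatePages leaves three defaults in conflict with it.
    for problem in find_conflicts(HtmlExportSettings(separate_pages=True)):
        print(problem)
```

With the default values the check reports nothing. Setting only separate_pages to True reports three conflicts, because the defaults of HTMLFormatMode, UseDocumentStructure, and UseCss all contradict it.
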
Encoding
EncodingType TextEncodingTypeEnum

Specifies the encoding type of the output file in HTML format.

Note: If you want to change the value of this property to TET_Simple, you should first set the correct code page (the CodePage property).

The default value of this property is TET_Auto, which means that the encoding is selected depending on the constant that was used to specify the export format.

CodePage CodePageEnum

This property specifies the code page to which the recognized text is exported. The value of this property is taken into account only when the EncodingType property has the value TET_Simple (exported text is not Unicode), and in this case, the property must specify a valid code page (it cannot be CP_Null).

Note: First, set the correct code page and only then change the value of the EncodingType property to TET_Simple.

By default, this property is CP_Null.
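
The required order (first CodePage, then EncodingType) can be pictured with a small Python sketch. EncodingSettings is a hypothetical class written for this illustration and is not part of ABBYY FineReader Engine; "CP_Example" is a placeholder rather than a real CodePageEnum constant. This topic does not state how the Engine reacts to the wrong order, so raising ValueError here is an assumption of the sketch.

```python
class EncodingSettings:
    """Hypothetical model of the CodePage / EncodingType ordering rule."""

    def __init__(self):
        # Defaults as documented: CP_Null and TET_Auto.
        self.code_page = "CP_Null"
        self._encoding_type = "TET_Auto"

    @property
    def encoding_type(self):
        return self._encoding_type

    @encoding_type.setter
    def encoding_type(self, value):
        # TET_Simple needs a valid code page, so CP_Null is rejected here.
        # (Assumption: the real Engine's reaction is not described in this topic.)
        if value == "TET_Simple" and self.code_page == "CP_Null":
            raise ValueError("set CodePage before switching EncodingType to TET_Simple")
        self._encoding_type = value


if __name__ == "__main__":
    settings = EncodingSettings()
    settings.code_page = "CP_Example"      # placeholder code page name
    settings.encoding_type = "TET_Simple"  # accepted: a code page is already set
    print(settings.code_page, settings.encoding_type)
```
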

Picture embedding
WritePictures VARIANT_BOOL

Specifies whether pictures must be saved along with the file in HTML format. If pictures are not written, references to them in HTML files are also omitted.

The default value is TRUE.

PictureExportParams PictureExportParams, read-only

Specifies the image format and JPEG quality which should be used for embedded pictures in the output file.

Methods

Name Description
CopyFrom Initializes properties of the current object with values of similar properties of another object.
LoadFromFile Restores the object contents from a file on disk.
SaveToFile Saves the object contents into a file on disk.

Related objects

Object Diagram

Output parameter

This object is the output parameter of the CreateHTMLExportParams method of the Engine object.

Input parameter

This object is passed as the input parameter to the following methods:

See also

Tuning Export Parameters

Working with Profiles

Working with Properties

03.07.2024 8:50:10
