SourceContentReuseModeEnum

The SourceContentReuseModeEnum enumeration constants describe the available modes for reusing the contents of source PDF files and Office documents.

typedef enum {
 CRM_Auto,
 CRM_DoNotReuse,
 CRM_ContentOnly,
 CRM_ContentAndPictures
} SourceContentReuseModeEnum;
    

Elements

Name
    Description

CRM_Auto
    ABBYY FineReader Engine automatically selects the appropriate mode for processing PDF files or Office documents. If the results of this mode do not meet expectations, or if the document type and the matching reuse mode are known in advance, select the mode manually.

CRM_ContentAndPictures
    For each part of every page of the source PDF file or Office document, ABBYY FineReader Engine automatically selects whether to use the text or the image. Both the source file contents and the rasterized images are used for processing: if the text from the source file is considered good, it is used; otherwise, the text from the raster of that part is used.

CRM_ContentOnly
    ABBYY FineReader Engine uses both text and images of the source PDF file or Office document.

    We recommend this mode for source files with visible text that is encoded in Unicode, ASCII, or another character encoding standard and has correct font and size settings. For any other type of source file, use CRM_Auto, CRM_ContentAndPictures, or CRM_DoNotReuse.

    Important! This mode is not available when documents are processed in parallel (MultiProcessingParams::MultiProcessingMode = MPM_Parallel) in memory.

CRM_DoNotReuse
    ABBYY FineReader Engine rasterizes the pages of the source PDF file or Office document and processes them. The contents of the source file are ignored.

Note: Use the IsPdfWithTextualContent method to find out if the file contains a text layer.
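The guidance in the Elements table can be condensed into a small decision helper. The following is only a sketch: the enumeration is re-declared locally so that the fragment compiles without the SDK headers, and the SourceFileTraits structure and the ChooseReuseMode function are hypothetical names introduced for illustration, not part of the ABBYY FineReader Engine API. Returning CRM_DoNotReuse for a file without a text layer, and falling back to CRM_ContentAndPictures when CRM_ContentOnly is unavailable, are assumptions rather than documented behavior.

```c
#include <stdbool.h>

/* Local copy of the enumeration (same order as in the definition above),
   declared here only so the sketch compiles without the SDK headers. */
typedef enum {
    CRM_Auto,
    CRM_DoNotReuse,
    CRM_ContentOnly,
    CRM_ContentAndPictures
} SourceContentReuseModeEnum;

/* Hypothetical summary of what the caller knows about a source file. */
typedef struct {
    bool hasTextLayer;      /* e.g. the result of IsPdfWithTextualContent */
    bool textLayerTrusted;  /* visible text, standard encoding, correct fonts and sizes */
    bool parallelInMemory;  /* documents processed with MPM_Parallel in memory */
} SourceFileTraits;

/* Hypothetical helper: picks a reuse mode from the traits above. */
SourceContentReuseModeEnum ChooseReuseMode(const SourceFileTraits *traits)
{
    /* Assumption: with no text layer there is nothing to reuse, so rasterize. */
    if (!traits->hasTextLayer) {
        return CRM_DoNotReuse;
    }
    if (traits->textLayerTrusted) {
        /* CRM_ContentOnly is not available for parallel in-memory processing;
           the fallback to CRM_ContentAndPictures is an assumption. */
        return traits->parallelInMemory ? CRM_ContentAndPictures : CRM_ContentOnly;
    }
    /* Text layer of unknown quality: let the Engine decide. */
    return CRM_Auto;
}
```

In a real application the returned value would then be assigned to IObjectsExtractionParams::SourceContentReuseMode.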

Remarks

The type of word model (see IWord::ModelType) is determined while the document contents are being recognized. Whether it is determined depends on the selected content reuse mode:

  • CRM_DoNotReuse - the word model type is always determined.
  • CRM_Auto, CRM_ContentAndPictures - whether the word model type is determined depends on whether the document contents are recognized.
  • CRM_ContentOnly - the word model type is never determined.
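Code that reads IWord::ModelType may want to check beforehand whether the value can be expected at all. The mapping in the list above could be encoded as follows. This is an illustrative sketch only: the enumeration is re-declared locally so that the fragment compiles without the SDK headers, and the WordModelDetermination type and the WordModelDeterminationFor function are hypothetical names, not part of the ABBYY FineReader Engine API.

```c
/* Local copy of the enumeration, declared here only so the sketch
   compiles without the SDK headers. */
typedef enum {
    CRM_Auto,
    CRM_DoNotReuse,
    CRM_ContentOnly,
    CRM_ContentAndPictures
} SourceContentReuseModeEnum;

/* Hypothetical classification of how the word model type is handled. */
typedef enum {
    WMD_Always,               /* the word model type is always determined */
    WMD_DependsOnRecognition, /* determined only if contents are recognized */
    WMD_Never                 /* the word model type is never determined */
} WordModelDetermination;

/* Hypothetical helper: maps a reuse mode to the behavior listed in Remarks. */
WordModelDetermination WordModelDeterminationFor(SourceContentReuseModeEnum mode)
{
    switch (mode) {
    case CRM_DoNotReuse:
        return WMD_Always;
    case CRM_ContentOnly:
        return WMD_Never;
    case CRM_Auto:
    case CRM_ContentAndPictures:
        return WMD_DependsOnRecognition;
    }
    /* Assumption: unknown values are treated like the automatic modes. */
    return WMD_DependsOnRecognition;
}
```

With a result of WMD_Never or WMD_DependsOnRecognition, the caller should not rely on IWord::ModelType carrying a meaningful value for every word.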

Used in

IObjectsExtractionParams::SourceContentReuseMode

17.09.2024 15:14:40
