SourceContentReuseModeEnum
The SourceContentReuseModeEnum enumeration constants describe the available modes for reusing the contents of source PDF files and Office documents.
typedef enum {
CRM_Auto,
CRM_DoNotReuse,
CRM_ContentOnly,
CRM_ContentAndPictures
} SourceContentReuseModeEnum;
Elements
Name | Description |
---|---|
CRM_Auto | ABBYY FineReader Engine automatically selects the appropriate mode for processing PDF files or Office documents. If the results of this mode do not meet expectations, or if the document type and the corresponding reuse mode are known in advance, the mode may be selected manually. |
CRM_ContentAndPictures | ABBYY FineReader Engine automatically decides, for each part of every page of the source PDF file or Office document, whether to use the source text or the image. Both the source file contents and the rasterized images are used for processing: if the text from the source file is considered good, it is used; otherwise, the text recognized from the rasterized image of that part is used. |
CRM_ContentOnly | ABBYY FineReader Engine uses both text and images of the source PDF file or Office document. We recommend using this mode for source files with visible text that is encoded using Unicode, ASCII, or another character encoding standard and has correct font and size settings. If your source file is of another type, use CRM_Auto, CRM_ContentAndPictures, or CRM_DoNotReuse. Important! This mode is not available when processing documents in memory in parallel (MultiProcessingParams::MultiProcessingMode = MPM_Parallel). |
CRM_DoNotReuse | ABBYY FineReader Engine rasterizes the pages of the source PDF file or Office document and processes them. The contents of the source file are ignored. |
Note: Use the IsPdfWithTextualContent method to find out if the file contains a text layer.
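The guidance in the table can be condensed into a small decision helper. The following is only a minimal sketch and not part of the FineReader Engine API: the enumeration is copied from the declaration above so that the snippet compiles on its own, while SourceTraits and chooseReuseMode are hypothetical names introduced for illustration. Choosing CRM_DoNotReuse for files without a text layer is an assumption of this sketch, not a documented rule.

```cpp
// Enumeration copied from this article so the sketch is self-contained.
// In a real project it comes from the FineReader Engine headers instead.
typedef enum {
    CRM_Auto,
    CRM_DoNotReuse,
    CRM_ContentOnly,
    CRM_ContentAndPictures
} SourceContentReuseModeEnum;

// Hypothetical summary of what the caller knows about a source file.
struct SourceTraits {
    bool knownInAdvance;    // document type is known before processing
    bool hasTextLayer;      // e.g. as reported by IsPdfWithTextualContent
    bool textIsReliable;    // visible text, standard encoding, correct fonts and sizes
    bool parallelInMemory;  // in-memory processing with MPM_Parallel
};

// Hypothetical helper: maps the traits to a reuse mode.
constexpr SourceContentReuseModeEnum chooseReuseMode(SourceTraits t) {
    return !t.knownInAdvance ? CRM_Auto            // let the engine decide
         : !t.hasTextLayer   ? CRM_DoNotReuse      // assumption: nothing to reuse
         : (t.textIsReliable && !t.parallelInMemory)
                             ? CRM_ContentOnly     // unavailable for parallel in-memory processing
                             : CRM_ContentAndPictures;
}
```

For example, a file with a reliable text layer that is processed in memory with MPM_Parallel falls through to CRM_ContentAndPictures, because CRM_ContentOnly is not available in that configuration.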
Remarks
Document contents are recognized together with determining the word model type (see IWord::ModelType). Whether the word model type is determined depends on the selected content reuse mode:
- CRM_DoNotReuse - the word model type is always determined.
- CRM_Auto, CRM_ContentAndPictures - the word model type is determined depending on whether the document contents are recognized.
- CRM_ContentOnly - the word model type is never determined.
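The list above can be expressed as a lookup from reuse mode to expected behavior, which may be handy when deciding whether IWord::ModelType is worth inspecting. This is an illustrative sketch only: the enumeration is repeated so the snippet stands alone, and WordModelDetermination and wordModelDetermination are hypothetical names that do not exist in the FineReader Engine API.

```cpp
// Enumeration copied from this article so the sketch is self-contained.
typedef enum {
    CRM_Auto,
    CRM_DoNotReuse,
    CRM_ContentOnly,
    CRM_ContentAndPictures
} SourceContentReuseModeEnum;

// Hypothetical tri-state describing whether a word model type is determined.
enum WordModelDetermination {
    WMD_Always,                // every word gets a model type
    WMD_DependsOnRecognition,  // only where document contents are recognized
    WMD_Never                  // no model type is determined
};

// Hypothetical helper mirroring the three cases listed in the Remarks.
constexpr WordModelDetermination wordModelDetermination(SourceContentReuseModeEnum mode) {
    return mode == CRM_DoNotReuse  ? WMD_Always
         : mode == CRM_ContentOnly ? WMD_Never
         : WMD_DependsOnRecognition;  // CRM_Auto and CRM_ContentAndPictures
}
```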
Used in
03.07.2024 8:50:25