2. Process

Dialog Box: Workflow Properties, Tab: Process

The 2. Process tab contains recognition options.

Option name Option description
Document languages Specifies recognition languages. Languages are sorted alphabetically and are divided into two groups: the first group includes languages with full dictionary support, while the second group includes languages without dictionary support. For more information, see List of Recognition Languages.
Select language automatically

The document language will be detected automatically from among the languages selected in the Document languages list.

Note. Selecting this option may slow down ABBYY FineReader Server when processing texts in non-European languages.

Optimize OCR for Specifies whether recognition should be optimized for quality or for speed.
Use custom dictionary Specifies the path to the custom dictionary that should be used during recognition. A custom dictionary is a UTF-16 text file where each line of text represents a separate word.
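As an illustration of the file format described above, the following minimal sketch writes a word list as a UTF-16 text file with one word per line. The function name write_custom_dictionary and the file name are hypothetical. The source does not state whether a byte-order mark or a particular byte order is expected; this sketch assumes that the default output of Python's "utf-16" codec (which starts with a BOM) is acceptable.

```python
from pathlib import Path


def write_custom_dictionary(words, path):
    """Write words to a UTF-16 text file, one word per line.

    Hypothetical helper, not part of ABBYY FineReader Server.
    """
    # Strip surrounding whitespace, then drop blank entries and
    # duplicates while keeping the original order.
    unique = list(dict.fromkeys(w.strip() for w in words if w.strip()))
    # Assumption: a BOM-prefixed UTF-16 file is accepted; Python's
    # "utf-16" codec writes the BOM automatically.
    Path(path).write_text("\n".join(unique) + "\n", encoding="utf-16")
    return unique


if __name__ == "__main__":
    write_custom_dictionary(["FineReader", "OCR", "FineReader"], "custom_dictionary.txt")
```

The resulting file path could then be entered in the Use custom dictionary option.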
Processing mode

Specifies the recognition mode:

  • All Text (Extract Text from Pictures)
    The program will recognize all text it finds in the document, including any text in the picture areas.
  • Text and Pictures (Don't Extract Text from Pictures)
    The program will recognize all text except the text found in the picture areas.
  • Barcodes Only
    Select this mode if you want to extract only the barcode values from documents.
    • Note. When this mode is selected, pages without barcodes are treated as blank pages.
    • Note. Barcodes of type "Code 39 without an asterisk" cannot be recognized in this mode.

Advanced Settings... (button)

Opens the Advanced Processing Settings dialog box.
PDF processing mode
  • Auto
    Input PDF documents are analyzed. Documents without a text layer and documents with scanned or recognized text will be subjected to OCR, while documents with a text layer obtained from an Office file will be exported "as is."
  • Always use OCR
    Input PDF documents will be subjected to OCR regardless of the availability of a text layer.
  • Always use text in PDF file
    Input PDF documents without a text layer will be subjected to OCR. Documents with a text layer will be exported "as is."
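The three PDF processing modes above can be read as a small decision rule. The sketch below restates that rule in Python purely for illustration. The names PdfMode and needs_ocr and the two boolean inputs are hypothetical, and the code says nothing about how the program actually detects a text layer or its origin.

```python
from enum import Enum


class PdfMode(Enum):
    AUTO = "Auto"
    ALWAYS_OCR = "Always use OCR"
    ALWAYS_TEXT = "Always use text in PDF file"


def needs_ocr(mode, has_text_layer, text_from_office_file=False):
    """Return True if a PDF would be subjected to OCR under the given mode.

    Illustrative restatement of the option descriptions, not ABBYY code.
    """
    if mode is PdfMode.ALWAYS_OCR:
        # OCR regardless of whether a text layer is available.
        return True
    if not has_text_layer:
        # Both remaining modes run OCR on PDFs without a text layer.
        return True
    if mode is PdfMode.ALWAYS_TEXT:
        # An existing text layer is exported "as is".
        return False
    # Auto: only a text layer obtained from an Office file is kept;
    # scanned or previously recognized text is subjected to OCR again.
    return not text_from_office_file
```

For example, needs_ocr(PdfMode.AUTO, has_text_layer=True, text_from_office_file=False) evaluates to True, which matches the Auto description for documents with scanned or recognized text.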
Keep original pictures and comments in scanned PDF

The original image layer, notes, and comments will be preserved in the output files.

Note. The original image layer can only be preserved for JPEG files.

Don't modify PDF files with digital signatures The text areas in PDF documents will be subjected to OCR, but the original documents will remain intact and their digital signatures will be preserved.
Detect bad encoding in PDF files

If an input PDF document has a text layer, the program will check its encoding. If bad encoding is detected, the program will perform OCR on the document.

Note. Enabling this option will slow down document processing.

Office documents processing mode From the drop-down list, you can select a Microsoft Office or LibreOffice application to be used for processing Office documents (i.e. *.doc, *.docx, *.odt, *.html, *.htm, *.txt, *.rtf; *.xls, *.xlsx, *.ods; *.ppt, *.pptx, and *.odp files).
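When preparing input folders, it can help to check which files fall under the Office documents setting. The sketch below groups the extensions listed above. The group labels ("text", "spreadsheet", "presentation") are an assumption based on the semicolons in the list, and office_kind is a hypothetical helper, not part of the product.

```python
from pathlib import Path

# Extensions taken from the option description; group names are assumed.
OFFICE_EXTENSIONS = {
    "text": {".doc", ".docx", ".odt", ".html", ".htm", ".txt", ".rtf"},
    "spreadsheet": {".xls", ".xlsx", ".ods"},
    "presentation": {".ppt", ".pptx", ".odp"},
}


def office_kind(filename):
    """Return the assumed group for an Office file name, or None."""
    ext = Path(filename).suffix.lower()  # compare case-insensitively
    for kind, extensions in OFFICE_EXTENSIONS.items():
        if ext in extensions:
            return kind
    return None
```

A file such as scan.pdf yields None here, because PDF input is governed by the PDF processing mode option instead.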

See also

Workflow Properties Dialog Box

29.08.2023 11:55:30
