Image pre-processing options
Images are pre-processed when they are added to a batch. More specifically, they are separated into documents, despeckled, and rotated. If needed, you can configure more complex pre-processing, such as automatic cropping or removal of color marks. For low-quality images, you can use special image enhancement profiles.
Images are edited to correct errors that may occur when processing photos of documents (including photos taken with mobile devices) and scans of documents with complex backgrounds that impede recognition and data extraction (for example, certificates and passports). This stage improves the quality of recognition results for low-quality images.
You can select pre-processing options on the Image Processing tab. The pre-processing options can be selected for:
- A project.
In this case, the selected options will be used by default when new images are added manually into project batches. Select Project → Project Properties... in the main window and then click the Image Processing tab.
- An import profile.
These options will be used when this import profile is used to add new images. When creating a new import profile, select image pre-processing options at the Image Processing stage.
- Images added manually.
When adding images from a file, click the Image Processing Settings... button in the Load Images dialog box.
Note: If images are scanned manually, no image pre-processing is done. Scanned images are only pre-processed if an import profile is used.
- A batch type.
In this case, the selected options will be used when processing images from batches of the given type. When creating a batch type, select image pre-processing options on the Image Processing tab. The options specified for a batch type also apply to:
- Images received from the Scanning Station or Web Scanning Station, if they were not assembled into documents on the corresponding station. Note that the options specified in the Image processing section apply to received images whether or not they were assembled into documents.
- Images added by Web Service API methods (if the ExcludeFromAutomaticAssembling flag is set to false). For details, please refer to the Web Services API help file.
- Images added from a Hot Folder (if the Use image processing options from batch type option is selected). See Image import profiles for more details.
- Images added manually from a file. In this case, the options specified for the batch type are used by default.
Users can select any of the following image pre-processing options:
- Delete empty pages. Select this option to exclude blank pages from processing.
- Options that determine the principles of adding images into documents:
- Automatically when Document Definition is applied. Select this option if you want images to be separated into documents during matching (during recognition). In this case, images are not separated into documents as soon as they are added. Instead, documents are assembled only after the pages have been analyzed and recognized, based on the structure described in the appropriate Document Definition. For details, see Creating Document Definitions for multipage documents and Assembling pages into documents.
- For each image file. If this option is selected, a new document is created for each file (an image file can contain multiple pages).
- For images separated by. A new document is created each time the program reaches a separator sheet. To use blank pages as separators, select blank pages from the drop-down list. In the Blank Page Detection dialog box that opens, specify the parameters used to detect empty pages. To use pages with barcodes as separators, select pages with barcode from the drop-down list. You can set additional parameters by clicking the Settings... button. Note that if you specify a barcode value, the document identifier will be changed to that value. If you do not want separator pages to be added to the batch, select Delete separator pages. If this option is not selected, separator pages will be added to the batch and will become the first page of each document.
Note: You do not have to use separator pages to separate pages into documents. Separator pages are indispensable in only one case: when the appearance of a page makes it impossible to determine whether it belongs to the previous document or to the next one. For example, this happens where a document may contain any number of identical pages, and the batch may contain more than one such document.
Note: For documents scanned or added on the Scanning Station, batch and document separation options are specified on the Scanning Station in the batch type settings.
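The separator-based option described above can be illustrated with a minimal sketch. This is not FlexiCapture code: the function `split_into_documents`, its parameters, and the `"SEP"` page label are hypothetical names chosen for illustration, and pages are represented as plain strings. The sketch only shows the general idea of starting a new document at each separator page and either dropping the separator or keeping it as the first page of the next document.

```python
from typing import Callable, List, Sequence


def split_into_documents(
    pages: Sequence[str],
    is_separator: Callable[[str], bool],
    delete_separators: bool = True,
) -> List[List[str]]:
    """Group pages into documents, starting a new one at each separator page.

    All names here are hypothetical; this is an illustration, not product code.
    """
    documents: List[List[str]] = []
    current: List[str] = []
    for page in pages:
        if is_separator(page):
            # A separator closes the document collected so far.
            if current:
                documents.append(current)
            # A kept separator becomes the first page of the next document.
            current = [] if delete_separators else [page]
        else:
            current.append(page)
    if current:
        documents.append(current)
    return documents


pages = ["a1", "a2", "SEP", "b1", "b2"]
is_sep = lambda p: p == "SEP"

print(split_into_documents(pages, is_sep, delete_separators=True))
# [['a1', 'a2'], ['b1', 'b2']]
print(split_into_documents(pages, is_sep, delete_separators=False))
# [['a1', 'a2'], ['SEP', 'b1', 'b2']]
```

In a real project the separator test would be blank-page or barcode detection performed by the program; here it is reduced to a simple predicate.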
- Options used to process incoming images:
- Basic image processing. Makes basic changes to images. This option is recommended for images of acceptable quality that do not require more complex processing.
- Rotate images by
Rotates images in the specified direction.
- Convert color and gray images to black-and-white
Converts images to black-and-white.
- Despeckle images
Removes noise from images.
Note: If the text in the images is very light or set in a thin typeface, and the size of the elements to be despeckled is specified by the user, make sure that significant elements such as punctuation marks and thin letter strokes are not deleted when the Despeckle images option is enabled.
- Remove all color marks
Removes all color marks from an image.
Recommended for photos
These options are recommended when importing photos of documents.
- Correct resolution of photos
Automatically selects the optimal resolution for photos.
The program automatically detects page borders on an image, crops fields that contain data to be extracted, and corrects skews and distortions.
- Reduce ISO noise
Reduces digital noise on an image.
- Whiten background
Whitens the background of an image.
Deskew (will be applied to scans only)
By default, these options are selected, and the program deskews newly added images based on black squares and/or on vertical and horizontal separators and text. Turning off the deskewing options is not recommended unless deskewing is done incorrectly (e.g. if the program interprets a stamp or a graph as a separator and attempts to deskew the image based on this faulty data).
- Use black separators to correct skew
If this option is selected, images are deskewed based on separator lines.
- Use black squares to correct skew
If this option is selected, images are deskewed based on black squares.
- Use text to correct skew
If this option is selected, images are deskewed based on the text on the image.
- Rotate images by
- Use image enhancement profile. Applies a special image enhancement profile. This option is recommended for specific images, fed in a uniform or mixed flow, that require more complex editing tools.
If you intend to process photos that arrive in a mixed flow of images and require a set of tools different from the basic one, select the Use special profile for photos option and select a second profile.
- Store original image during processing. This option saves original images in the file storage. With this option, you can revert to the original image if significant data was deleted during automatic processing.
Note: Saving original images increases the storage space used by a project and slows down processing. For this reason, it is recommended to enable this option only when you really need to be able to revert to the original image, e.g. when features such as image cropping or stamp removal are enabled in the pre-processing options.
- PDF processing options:
- Auto (FlexiCapture will choose between PDF text layer and OCR) – The optimal processing type will be chosen automatically depending on the availability and the quality of a text layer.
- Prefer PDF text layer if available – The text in the text layer will be used where available.
- Use OCR only – OCR will be performed on all documents, even on those with a text layer.
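The three PDF processing options can be read as a small decision rule, sketched below. This is an assumption-laden illustration, not the product's actual logic: the `PdfMode` enum, the `choose_text_source` function, the numeric `text_layer_quality` score, and the `min_quality` threshold are all hypothetical. In particular, the documentation does not state how the Auto mode measures text layer quality, so a score between 0 and 1 is assumed here.

```python
from enum import Enum
from typing import Optional


class PdfMode(Enum):
    """Hypothetical names for the three PDF processing options."""
    AUTO = "auto"
    PREFER_TEXT_LAYER = "prefer_text_layer"
    OCR_ONLY = "ocr_only"


def choose_text_source(
    mode: PdfMode,
    text_layer_quality: Optional[float],
    min_quality: float = 0.8,  # assumed threshold, not a documented value
) -> str:
    """Return "text_layer" or "ocr" for one PDF.

    text_layer_quality is None when the PDF has no text layer; otherwise it
    is an assumed quality score in the range 0..1.
    """
    has_text_layer = text_layer_quality is not None
    # Without a text layer, or in OCR-only mode, OCR is the only choice.
    if mode is PdfMode.OCR_ONLY or not has_text_layer:
        return "ocr"
    # Prefer mode uses the text layer whenever one exists.
    if mode is PdfMode.PREFER_TEXT_LAYER:
        return "text_layer"
    # Auto mode: use the text layer only if it looks good enough.
    return "text_layer" if text_layer_quality >= min_quality else "ocr"


print(choose_text_source(PdfMode.AUTO, 0.95))              # text_layer
print(choose_text_source(PdfMode.AUTO, 0.40))              # ocr
print(choose_text_source(PdfMode.PREFER_TEXT_LAYER, 0.40)) # text_layer
print(choose_text_source(PdfMode.OCR_ONLY, 0.95))          # ocr
```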
- Pre-processing options for files in office formats. In addition to processing imported document image files, you can also import documents in office formats and convert them to PDF using a built-in conversion module or third-party software.
To use third-party software for conversion to PDF, select either or both of the following options:
- Allow the use of LibreOffice® (supports LibreOffice 4.2, 4.3, 4.4, and 5)
- Allow the use of Microsoft® Office (supports Microsoft Office 2010, 2013, 2016, and 2019)
You can import document images and office files at the same time (see Supported input formats).
- If you import from e-mail, message bodies can be used as documents.
- If both office file pre-processing options are selected, ABBYY FlexiCapture will select the most suitable office application automatically. The selected application will be indicated in the respective task log.
- Conversion to PDF using a third-party application will only work if the respective application is installed on the processing station that is used to import documents.
- If you are using Microsoft Office to convert office files:
- Your copy of Microsoft Office must be activated.
- You must run the program as an administrator (click the Authentication... button to open the authentication settings dialog box).