Image Preprocessing

This scenario can be used to prepare images for further processing or to improve their visual quality (e.g., after scanning or prior to recognition).

This scenario may be used as part of other scenarios in the first stage of document processing, i.e., to prepare documents for recognition. Usage examples include creating uneditable document copies for archiving, getting editable versions of documents, and extracting meaningful data from documents.

In this scenario, image files are subjected to additional processing, such as:

  • Auto-detection of page orientation

This is very important for bulk image input, when the orientation of the scanned pages is unknown and may vary from page to page.

  • Automated image de-skewing

It is applied to scanned documents that require compensation for image skew. ABBYY FineReader Engine provides several modes for de-skewing images: by pairs of black squares, by lines, or by lines of text.

  • Image despeckling

When scanning poor to medium-quality documents, you may get very noisy images with lots of dots or speckles on them. These speckles, when they appear close to the letters or numbers, may affect the quality of OCR. The size of the speckles to be removed may be specified by the user. Despeckling can be applied to an image as well as to any individual zone of the image.

  • Splitting facing pages of scanned books into two separate images

It is used when books are scanned as double-page spreads, with the left and right pages in one image. Recognition quality is higher if the image is split into two, each corresponding to a single book page.

  • Splitting a scanned page with multiple business cards into separate images

It is applied to multiple business cards scanned on one page. Each business card can then be processed and saved separately.

  • Lines straightening

When capturing text from scanned or photographed books, the text lines may be uneven and difficult to OCR. For accurate text recognition, skew correction and text line straightening should be performed.

  • Texture filtering

Texture filtering technology helps to filter out background "noise" such as color and texture, increasing accuracy for difficult-to-read documents such as newsprint, color documents, faxes, and copies.

  • Removing motion blur and ISO noise from digital photos

The system automatically identifies defects commonly found in digital images, such as glare and ISO noise.

  • Clipping page margins

When there is a need to improve the appearance of the images, you may want to clip some image areas, e.g., excess margins on digital photos.

Once preprocessed, the images are saved in user-defined formats or forwarded to further processing.
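
To make two of the operations listed above more concrete, here is a minimal sketch of despeckling and margin clipping on a tiny binary image. It is not ABBYY FineReader Engine code and does not reflect how the Engine implements these operations. The function names `despeckle` and `clip_margins`, the list-of-lists image representation (1 = ink, 0 = background), and the 4-connectivity rule are all assumptions made for illustration.

```python
from collections import deque


def despeckle(img, max_size):
    """Return a copy of img with small 4-connected ink groups removed.

    A group is removed when it contains at most max_size pixels
    (illustrative rule; not the Engine's algorithm).
    """
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    seen = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if img[y][x] != 1 or seen[y][x]:
                continue
            # Collect the connected component that contains (y, x).
            component, queue = [], deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                component.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and img[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Erase the component if it is small enough to count as a speckle.
            if len(component) <= max_size:
                for cy, cx in component:
                    out[cy][cx] = 0
    return out


def clip_margins(img):
    """Crop img to the bounding box of its ink pixels (copy if blank)."""
    rows = [y for y, row in enumerate(img) if any(row)]
    cols = [x for x in range(len(img[0])) if any(row[x] for row in img)]
    if not rows:
        return [row[:] for row in img]
    return [row[cols[0]:cols[-1] + 1] for row in img[rows[0]:rows[-1] + 1]]


page = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],  # isolated one-pixel speckle
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
cleaned = despeckle(page, max_size=1)
cropped = clip_margins(cleaned)
print(cropped)  # [[1, 1], [1, 1]]
```

The speckle is removed first so that it does not widen the bounding box used for clipping. The `max_size` parameter plays the role of the user-specified speckle size mentioned above.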

Implementing the scenario

The recommended method of using ABBYY FineReader Engine 12 in this scenario is described below.

Step 1. Loading ABBYY FineReader Engine

Step 2. Preprocessing images during opening

Step 3. Preprocessing images that are already opened

Step 4. Unloading ABBYY FineReader Engine

Required resources

You can use the FREngineDistribution.csv file to automatically create a list of files required for your application to function. For processing with this scenario, select the following values in column 5 (RequiredByModule):

Core

Core.Resources

Opening

Opening, Processing

If you modify the standard scenario, change the required modules accordingly. You also need to specify the interface languages, recognition languages, and any additional features that your application uses (for example, Opening.PDF if you need to open PDF files, or Processing.OCR.CJK if you need to recognize texts in CJK languages). See Working with the FREngineDistribution.csv File for further details.
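
The selection described above can be scripted. The following is a minimal sketch, not an official tool. It assumes that the file is comma-delimited with a header row, that fields containing commas are quoted, and that the file path is in the first column. Only the position of the RequiredByModule column (column 5) comes from this page. The function name `required_files` and the sample rows are made up for illustration.

```python
import csv
import io

# Module values listed for this scenario.
REQUIRED = {"Core", "Core.Resources", "Opening", "Opening, Processing"}


def required_files(csv_text, modules=REQUIRED, module_col=4, file_col=0, delimiter=","):
    """Return the file_col value of every row whose module column is in modules.

    module_col=4 is column 5 (RequiredByModule), zero-based.
    file_col and delimiter are assumptions about the file layout.
    """
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    next(reader, None)  # skip the header row (assumed to be present)
    return [
        row[file_col]
        for row in reader
        if len(row) > module_col and row[module_col].strip() in modules
    ]


# Made-up sample data; the real file has different content and column names.
sample = (
    "File,Col2,Col3,Col4,RequiredByModule\n"
    "file_a,,,,Core\n"
    'file_b,,,,"Opening, Processing"\n'
    "file_c,,,,Processing.OCR.CJK\n"
)
print(required_files(sample))  # ['file_a', 'file_b']
```

To process the real file, read its text with `open(path, newline="")` and pass the contents to `required_files`, adjusting `delimiter` and `file_col` to match the actual layout. Extend `modules` with any extra values your application needs, such as Opening.PDF.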

Additional optimization

These are the sections of the help file where you can find additional information about setting up the parameters for the various processing stages:

  • Image preprocessing
    • Working with Images
      Working with images in ABBYY FineReader Engine and setting up image opening and preprocessing parameters.
    • PrepareImageMode Object
      The parameters of this object affect image opening and preprocessing: skew correction, image inversion, mirroring, prepared image compression, resolution, rotation.
    • ImageDocument Object
      The main object which provides access to images. This object provides a number of image preprocessing methods applied to an open image: cropping, double page splitting, photo preprocessing, visual enhancements.
    • ImageModification Object
      Use this object for additional processing of source images (e.g., replacing some regions of an image with color).
    • Tips for Taking Photos
      Getting quality images from photo devices.
  • Saving images

See also

Basic Usage Scenarios Implementation

7/3/2024 8:50:10 AM
