Tips for Document Separation

When you process large numbers of images, for example images arriving from the scanning queue of a high-speed scanner, the following techniques for separating the image flow into multi-page documents may be useful:

  1. Use blank pages for separation.

When scanning your documents, make sure that a blank page is placed between every two consecutive documents.

  2. Use barcode pages for separation.

Print barcode pages that mark the beginning of a new document and insert these pages between documents when scanning. A delimiter page can be identified by the type or by the value of the barcode printed on it. For an example of a barcode separator page, see the SampleImages folder (located in the Code Samples folder).

  3. If all the documents you are processing have the same number of pages (e.g. multi-page questionnaires), you can separate the image flow into documents by page count.

Setting up the image grouping parameters

The separation methods described above are configured through the properties of the ImageGroupingParams object. You can create this object using the CreateImageGroupingParams method of the Engine object. Configure its properties as follows:

  1. To use blank pages as delimiters, set the GroupingRule property to PG_GroupPagesByEmptyPageDelimiters. If the blank pages must be deleted afterwards, set the DeleteDelimiters property to TRUE.
  2. To use barcode pages for document separation, set the GroupingRule property to PG_GroupPagesByBarcode. Set the BarcodeType property to the type of barcode you printed on the delimiter pages and/or the BarcodeValue property to the value of the barcodes you use. See below for tips on improving the quality of barcode recognition. If the barcode pages must be deleted afterwards, set the DeleteDelimiters property to TRUE.
  3. To separate documents by page count, set the GroupingRule property to PG_GroupPagesByPagesCount. In the PagesCount property, specify the number of pages after which a new document must be started.
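
The effect of these grouping rules can be illustrated with a minimal, self-contained Python sketch. It is not the FlexiCapture Engine API: the function names and parameters (group_by_delimiters, group_by_page_count, is_delimiter, delete_delimiters, pages_count) are hypothetical and only echo the properties described above. The sketch also assumes that a delimiter page which is not deleted becomes the first page of the document that follows it; the Engine's actual behavior may differ.

```python
from typing import Callable, List, Sequence, TypeVar

P = TypeVar("P")


def group_by_delimiters(
    pages: Sequence[P],
    is_delimiter: Callable[[P], bool],
    delete_delimiters: bool = True,
) -> List[List[P]]:
    """Split a page flow into documents at delimiter pages (blank or barcode).

    Assumption: a delimiter that is kept starts the next document.
    """
    documents: List[List[P]] = []
    current: List[P] = []
    for page in pages:
        if is_delimiter(page):
            # Close the document collected so far, if any.
            if current:
                documents.append(current)
            # Start a new document, with or without the delimiter page.
            current = [] if delete_delimiters else [page]
        else:
            current.append(page)
    if current:
        documents.append(current)
    return documents


def group_by_page_count(pages: Sequence[P], pages_count: int) -> List[List[P]]:
    """Start a new document after every `pages_count` pages."""
    if pages_count < 1:
        raise ValueError("pages_count must be at least 1")
    return [
        list(pages[i:i + pages_count])
        for i in range(0, len(pages), pages_count)
    ]


# Illustrative page flow: an empty string stands in for a blank page.
flow = ["a1", "a2", "", "b1", "", "c1", "c2"]
by_blank = group_by_delimiters(flow, lambda p: p == "")
by_count = group_by_page_count(list(range(7)), 3)
```

In this sketch, by_blank contains three documents with the blank pages dropped, while by_count places any leftover pages in a shorter last document.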

This object can be used when working with a preconfigured FlexiCapture project:

  • If you are importing images into an existing batch using the IBatch::AddImages method, pass the configured ImageGroupingParams object as an input parameter.
  • If you are importing images from a hot folder or a scanner, get the ImageImportParams property of your hot folder profile and assign the configured ImageGroupingParams object to its corresponding property (IImageImportParams::ImageGroupingParams).

You can then also import images using the ImportImages method of the Project object, which takes a reference to an ImageImportParams object as an input parameter. Pass a reference to the ImageImportParams subobject of your hot folder profile, and the selected hot folder will be used for image import with the grouping rule you specified.

Recognizing barcodes

If you choose the second method of image grouping (barcode pages), follow these recommendations to improve barcode recognition:

  • A barcode must be separated from other text by a fairly wide white gap.
  • Barcode size and the width of its separate bars or dots must meet the following requirements:
    • The optimal barcode height is more than 10 millimeters. The barcode should be smaller than an A4 page.
    • Barcode height must be greater than twice the height of a text line.
    • For non-square barcodes, length must be greater than height.
    • For 1D barcodes, the thinnest bar in the barcode must be at least 3-5 pixels wide in the image.
    • For 2D barcodes, the cells should be at least 2x2 pixels; the recommended size is 4x4 pixels or more. In addition, for all 2D barcodes except PDF417, the cells should be square, because a stretched 2D barcode will most likely be recognized incorrectly.
  • We do not recommend compressing barcode images with JPEG compression, as it makes barcode edges fuzzy.
  • We do not recommend skewing barcodes, i.e. the angle of the barcode should be a multiple of 90 degrees relative to the horizontal axis.
  • The grayscale scanning mode is the best for OCR purposes. When scanning in black-and-white, adjust the brightness setting. If the barcode is “torn” or very light, lower the brightness to make the image darker. If the barcode is distorted or its parts have merged together, increase the brightness to make the image brighter.
  • Avoid printing barcodes in frames.
  • Avoid printing barcodes over text or a picture.
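
The size recommendations above are given partly in millimeters and partly in image pixels, so it can help to convert a printed barcode's dimensions to pixels at the planned scanning resolution. The following Python sketch is one way to do that check. All names in it (mm_to_pixels, check_1d_barcode, rate_2d_cell) are hypothetical helpers rather than part of any ABBYY API, and it assumes the lower bound of the "3-5 pixels" range (3 pixels) as the minimum bar width.

```python
from typing import List

MM_PER_INCH = 25.4


def mm_to_pixels(length_mm: float, dpi: int) -> float:
    """Convert a printed length in millimeters to image pixels at `dpi`."""
    return length_mm / MM_PER_INCH * dpi


def check_1d_barcode(
    height_mm: float,
    length_mm: float,
    narrow_bar_mm: float,
    text_line_height_mm: float,
    dpi: int,
) -> List[str]:
    """Return the size recommendations that a 1D barcode does not meet."""
    problems: List[str] = []
    if height_mm <= 10:
        problems.append("height should be more than 10 mm")
    if height_mm <= 2 * text_line_height_mm:
        problems.append("height should exceed twice the text line height")
    if length_mm <= height_mm:
        problems.append("length should be greater than height")
    # Assumption: 3 pixels (lower end of the 3-5 range) is the hard minimum.
    if mm_to_pixels(narrow_bar_mm, dpi) < 3:
        problems.append("thinnest bar is narrower than 3 pixels")
    return problems


def rate_2d_cell(cell_mm: float, dpi: int) -> str:
    """Rate the cell size of a 2D barcode against the 2x2 and 4x4 pixel marks."""
    cell_px = mm_to_pixels(cell_mm, dpi)
    if cell_px < 2:
        return "too small"
    if cell_px < 4:
        return "acceptable"
    return "recommended"


# Illustrative values: a 0.25 mm bar is about 2.95 pixels wide at 300 dpi,
# so the same printed barcode fails at 300 dpi but passes at 600 dpi.
at_300 = check_1d_barcode(15, 50, 0.25, 4, dpi=300)
at_600 = check_1d_barcode(15, 50, 0.25, 4, dpi=600)
```

A check like this only covers geometry; white space around the barcode, skew, and compression artifacts still need to be verified on the scanned image itself.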

See also

Scanning

15.08.2023 13:19:30
