Machine-Readable Zone Capture

The official travel and identity documents of many countries contain a machine-readable zone (MRZ), which makes automated processing of the document data more accurate. The MRZ consists of 2 or 3 lines of text printed in the OCR-B font in accordance with ICAO Document 9303 (see the specifications on the ICAO website).

This scenario is used to extract data from the machine-readable zone of ID documents during customer onboarding or verification. The system locates the MRZ on the document image and extracts the data from it. The extracted data consists of several fields with information about the document and its holder (the document type and expiry date, the first and last names of the holder, etc.). You can search through the fields, verify the data, and save it to an external file for further processing.
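To make the kinds of fields mentioned above concrete, here is a minimal Python sketch that splits a two-line, 44-character MRZ (the TD3 passport layout defined in ICAO Doc 9303) into fields by character position. It does not use the FineReader Engine API and does not reflect how the Engine represents its results; the `Td3Fields` class and `parse_td3` function are hypothetical names introduced only for illustration, and the sample lines are the specimen from the ICAO specification.

```python
from dataclasses import dataclass


@dataclass
class Td3Fields:
    """Illustrative container for a few TD3 fields (hypothetical, not an Engine type)."""
    document_type: str
    issuing_state: str
    surname: str
    given_names: str
    document_number: str
    nationality: str
    birth_date: str   # YYMMDD, as printed in the MRZ
    sex: str
    expiry_date: str  # YYMMDD, as printed in the MRZ


def parse_td3(line1: str, line2: str) -> Td3Fields:
    """Split the two 44-character lines of a TD3 MRZ into fields by position."""
    if len(line1) != 44 or len(line2) != 44:
        raise ValueError("a TD3 MRZ has two lines of 44 characters each")

    def clean(value: str) -> str:
        # '<' is the filler character; inside a field it separates words.
        return value.replace("<", " ").strip()

    # Positions 6-44 of line 1 hold the name: SURNAME<<GIVEN<NAMES<<<...
    surname, _, given = line1[5:].partition("<<")

    return Td3Fields(
        document_type=clean(line1[0:2]),
        issuing_state=clean(line1[2:5]),
        surname=clean(surname),
        given_names=clean(given),
        document_number=clean(line2[0:9]),   # position 10 is its check digit
        nationality=clean(line2[10:13]),
        birth_date=line2[13:19],             # position 20 is its check digit
        sex=line2[20],
        expiry_date=line2[21:27],            # position 28 is its check digit
    )


line1 = "P<UTOERIKSSON<<ANNA<MARIA".ljust(44, "<")
line2 = "L898902C36UTO7408122F1204159ZE184226B<<<<<10"
fields = parse_td3(line1, line2)
print(fields.surname, "/", fields.given_names, "/", fields.expiry_date)
# prints: ERIKSSON / ANNA MARIA / 120415
```

The sketch handles only the TD3 layout; the three-line and shorter two-line layouts use different field positions and would need their own offsets.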

To extract the data from the MRZ, image files obtained by scanning or already saved in electronic form typically go through several processing stages, each with its own specifics:

  1. Preprocessing of scanned images or photos

You either scan or photograph the identity page of an ID document that contains the MRZ. Photos taken with mobile device cameras may have low resolution and quality. In either case, images may require some preprocessing prior to recognition.

  2. Extracting data from MRZ

No more than one MRZ can be captured from each image. The text of each of the 2 or 3 lines is recognized and parsed to extract the data fields. Some of the fields, and the MRZ as a whole, have checksums that help you verify the data.

  3. Export to an external file

You can also save the extracted data to an external file: XML and JSON formats are supported.
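The checksums mentioned in the extraction stage follow the check digit scheme of ICAO Doc 9303: each character is converted to a number (digits keep their value, A to Z map to 10 to 35, the filler `<` counts as 0), multiplied by the repeating weights 7, 3, 1, and the sum is taken modulo 10. The following Python sketch shows the calculation on its own, independently of the FineReader Engine API; the function names `mrz_check_digit` and `field_is_valid` are hypothetical and introduced only for illustration.

```python
def mrz_check_digit(field: str) -> int:
    """Compute the ICAO Doc 9303 check digit for an MRZ field."""
    weights = (7, 3, 1)
    total = 0
    for index, char in enumerate(field):
        if "0" <= char <= "9":
            value = int(char)
        elif "A" <= char <= "Z":
            value = ord(char) - ord("A") + 10
        elif char == "<":
            value = 0  # filler character
        else:
            raise ValueError(f"character {char!r} is not allowed in an MRZ")
        total += value * weights[index % 3]
    return total % 10


def field_is_valid(field: str, check_char: str) -> bool:
    """Return True if check_char is the correct check digit for field."""
    return check_char.isdigit() and mrz_check_digit(field) == int(check_char)


# Document number and its check digit from the ICAO specimen passport.
print(mrz_check_digit("L898902C3"))        # prints 6
print(field_is_valid("740812", "2"))       # prints True
```

A mismatch between the computed and the printed digit usually indicates a recognition error in that field, so it is a useful signal for deciding which fields need manual review.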

The procedure described below is implemented in the MRZExtraction code sample.

Implementing the scenario

Below is the detailed description of the recommended method of using ABBYY FineReader Engine 12 in this scenario. The proposed method uses processing settings that are most suitable for this scenario.

Step 1. Loading ABBYY FineReader Engine

Step 2. Loading settings for the scenario

Step 3. Loading and preprocessing the document images

Step 4. Extracting data from MRZ

Step 5. Working with the extracted data

Step 6. Exporting the extracted data

Step 7. Unloading ABBYY FineReader Engine

Required resources

You can use the FREngineDistribution.csv file to automatically create a list of the files your application requires. For processing with this scenario, select the following values in column 5 (RequiredByModule):

Core

Core.Resources

Opening

Opening, Processing

Processing

Processing.OCR

Processing.OCR, Processing.ICR

Processing.OCR.NaturalLanguages

Processing.OCR.NaturalLanguages, Processing.ICR.NaturalLanguages

Export

Export, Processing

If you modify the standard scenario, change the required modules accordingly. You also need to specify the interface languages, recognition languages, and any additional features that your application uses (for example, Opening.PDF if you need to open PDF files). See Working with the FREngineDistribution.csv File for further details.
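The selection described above can be automated with a short script. The Python sketch below keeps the rows whose fifth column (RequiredByModule) matches one of the values listed for this scenario. Only the column number and the module values come from this section; the delimiter, the quoting of values that contain commas, and the meaning of the other columns are assumptions, so check them against the actual file and against Working with the FREngineDistribution.csv File before relying on the result. The name `select_rows` is hypothetical.

```python
import csv

# Values of column 5 (RequiredByModule) listed for this scenario.
REQUIRED_MODULES = {
    "Core",
    "Core.Resources",
    "Opening",
    "Opening, Processing",
    "Processing",
    "Processing.OCR",
    "Processing.OCR, Processing.ICR",
    "Processing.OCR.NaturalLanguages",
    "Processing.OCR.NaturalLanguages, Processing.ICR.NaturalLanguages",
    "Export",
    "Export, Processing",
}

MODULE_COLUMN = 4  # column 5, counted from zero


def select_rows(lines, modules=REQUIRED_MODULES, delimiter=","):
    """Return the CSV rows whose RequiredByModule value is in `modules`.

    `lines` is any iterable of text lines, such as an open file.
    The delimiter is an assumption; pass another one if the file differs.
    """
    reader = csv.reader(lines, delimiter=delimiter)
    return [
        row
        for row in reader
        if len(row) > MODULE_COLUMN and row[MODULE_COLUMN].strip() in modules
    ]
```

A typical call would open the file with `open("FREngineDistribution.csv", newline="")` and pass the file object to `select_rows`. Extend the `modules` set with any additional values your application needs, such as Opening.PDF.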

Additional optimization

These are the sections of the Help file where you can find additional information about setting up the parameters for the various processing stages:

See also

Basic Usage Scenarios Implementation

03.07.2024 8:50:25
