Recognizing Checkmarks
ABBYY FineReader Engine 12 supports two block types for checkmarks: checkmark and checkmark group. A checkmark group block is a collection of checkmark blocks. These block types correspond to the BT_Checkmark and BT_CheckmarkGroup constants of the BlockTypeEnum enumeration. The CheckmarkBlock and CheckmarkGroup objects provide access to blocks of these types. To obtain these objects, use the corresponding methods of the Block object.
Important! To recognize checkmarks, you must have an ABBYY FineReader Engine license that supports the OMR module.
You can recognize single checkmarks as well as checkmark groups.
One check box corresponds to one CheckmarkBlock object. A check box can have one of three states: checked, not checked, or corrected. These states correspond to the constants of the CheckmarkCheckStateEnum enumeration. A corrected checkmark is one that was put in the check box and then crossed out by the user.
For a checkmark group, you can specify a minimum and a maximum number of checked check boxes in the group (MinimumCheckedInGroup and MaximumCheckedInGroup, respectively). These values can be set through the CheckmarkGroup object and will be used during recognition.
All the checkmarks within a checkmark group must have the same values for the IsCorrectionEnabled and CheckmarkType properties.
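The group rules above can be checked before recognition with a small helper. The following is a minimal sketch in plain Python that does not call FineReader Engine: the CheckmarkSettings and GroupSettings classes and the validate_group function are hypothetical stand-ins for the properties named above, and the requirement that the minimum be non-negative and not exceed the maximum is a sanity assumption, not a documented engine rule.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CheckmarkSettings:
    """Hypothetical stand-in for the per-checkmark properties."""
    checkmark_type: str            # e.g. "CMT_Square", "CMT_Circle", "CMT_Custom"
    is_correction_enabled: bool


@dataclass
class GroupSettings:
    """Hypothetical stand-in for a checkmark group and its limits."""
    minimum_checked: int
    maximum_checked: int
    checkmarks: List[CheckmarkSettings] = field(default_factory=list)


def validate_group(group: GroupSettings) -> List[str]:
    """Return a list of problems; an empty list means the settings agree."""
    problems = []
    # All checkmarks in a group must share the same checkmark type.
    if len({c.checkmark_type for c in group.checkmarks}) > 1:
        problems.append("CheckmarkType differs within the group")
    # All checkmarks in a group must share the same correction setting.
    if len({c.is_correction_enabled for c in group.checkmarks}) > 1:
        problems.append("IsCorrectionEnabled differs within the group")
    # Assumed sanity rule: 0 <= minimum <= maximum.
    if not 0 <= group.minimum_checked <= group.maximum_checked:
        problems.append("minimum/maximum checked range is inconsistent")
    return problems
```

For example, a group of two "CMT_Square" checkmarks with correction enabled and limits of 1 and 1 yields an empty list, while mixing types and correction settings with limits of 2 and 1 yields three problems.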
The state of a checkmark is determined from the percentage of black pixels in its region. This percentage is smallest for an unchecked checkmark, greater for a checked checkmark, and largest for a corrected checkmark. For correct recognition results, it is therefore essential to:
- set the checkmark type correctly, because checkmarks of the CMT_Circle and CMT_Square types have a black frame that must be taken into account when calculating the percentage;
- specify the exact region of the checkmark, because the percentage of black is calculated across the whole region, and any irrelevant areas included in the region make the estimate less accurate.
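To see why the region matters, the idea of a black-pixel percentage can be sketched in a few lines of plain Python. This is an illustration only, not the engine's algorithm: the image is a toy grid of 0 and 1 values, the function names are hypothetical, the two cut-off values are invented for the example, and frame handling for the CMT_Circle and CMT_Square types is left out.

```python
from typing import List, Tuple

Image = List[List[int]]            # 1 = black pixel, 0 = white pixel
Rect = Tuple[int, int, int, int]   # left, top, right, bottom (right/bottom exclusive)


def black_percentage(image: Image, rect: Rect) -> float:
    """Percentage of black pixels inside the rectangle."""
    left, top, right, bottom = rect
    total = (right - left) * (bottom - top)
    black = sum(image[y][x] for y in range(top, bottom) for x in range(left, right))
    return 100.0 * black / total


def classify(percentage: float, checked_from: float, corrected_from: float) -> str:
    """Map a black percentage to a state using two illustrative cut-offs."""
    if percentage >= corrected_from:
        return "corrected"
    if percentage >= checked_from:
        return "checked"
    return "not checked"


def make_sample() -> Image:
    """8x8 white image with a diagonal cross drawn inside a 4x4 box."""
    image = [[0] * 8 for _ in range(8)]
    for i in range(4):
        image[2 + i][2 + i] = 1   # first diagonal of the cross
        image[2 + i][5 - i] = 1   # second diagonal of the cross
    return image
```

With the tight region (2, 2, 6, 6) the cross covers 8 of 16 pixels, so black_percentage returns 50.0 and classify(50.0, 25.0, 75.0) gives "checked". With the loose region (0, 0, 8, 8) the same cross covers 8 of 64 pixels, the percentage drops to 12.5, and the same cut-offs give "not checked".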
Recognizing a group of checkmarks
- Create an FRDocument object from an image with a checkmark group. For example, you can use the CreateFRDocumentFromImage method of the Engine object.
- Obtain the page with the image of the checkmarks from the collection of pages of the document (IFRDocument::Pages) — use the properties and methods of the FRPages collection.
- Obtain the Layout object which corresponds to this page via the IFRPage::Layout property.
- For each checkmark group:
  - Create a Region object using the IEngine::CreateRegion method and add rectangles to it using the IRegion::AddRect method.
  - Create a Block object of the checkmark group type and add it to the collection of layout blocks (ILayout::Blocks) using the ILayoutBlocks::AddNew method. Pass the BT_CheckmarkGroup constant and the created Region object as input parameters; the method also requires the block index in the layout as the third input parameter.
  - Obtain the CheckmarkGroup object (use the IBlock::GetAsCheckmarkGroup method).
  - For each checkmark in the group:
    - Create a Region object using the IEngine::CreateRegion method and add rectangles to it using the IRegion::AddRect method.
    - Create a new checkmark block in the group using the ICheckmarkGroup::AddNew method (use the created Region object as an input parameter).
    - Obtain the CheckmarkBlock object (use the IBlock::GetAsCheckmarkBlock method) and set the required parameters (CheckmarkType, IsCorrectionEnabled).
    Important! For correct recognition results, it is essential to set the type and the region of each checkmark correctly.
  - Set the necessary parameters of the checkmark group (MinimumCheckedInGroup, MaximumCheckedInGroup).
- To recognize the checkmarks, use any of the available methods that perform recognition, such as IFRPage::Recognize, IFRPage::RecognizeBlocks, IFRDocument::Recognize, IFRDocument::RecognizePages, etc.
Recognizing a single checkmark
- Create an FRDocument object from an image with a checkmark. For example, you can use the CreateFRDocumentFromImage method of the Engine object.
- Obtain the page with the image of the checkmarks from the collection of pages of the document (IFRDocument::Pages) — use the properties and methods of the FRPages collection.
- Obtain the Layout object which corresponds to this page via the IFRPage::Layout property.
- Create the Region object using the IEngine::CreateRegion method and add rectangles to it using the IRegion::AddRect method.
- Create a Block object of the checkmark type and add it to the collection of layout blocks (ILayout::Blocks) using the ILayoutBlocks::AddNew method (use the BT_Checkmark constant and the created Region object as input parameters).
- Obtain the CheckmarkBlock object (use the IBlock::GetAsCheckmarkBlock method) and set the required parameters (CheckmarkType, IsCorrectionEnabled).
Important! For correct recognition results, it is essential to set the type and the region of the checkmark correctly.
- To recognize the checkmark, use any of the available recognition methods, such as IFRPage::Recognize, IFRPage::RecognizeBlocks, IFRDocument::Recognize, IFRDocument::RecognizePages, etc.
Recognizing a checkmark of custom type
ABBYY FineReader Engine can recognize checkmarks of the standard forms: checkmarks in squares, checkmarks against an empty background, and checkmarks in circles (see the CheckmarkTypeEnum constants). The enumeration also contains one more checkmark type, CMT_Custom, which is intended for checkmarks of a non-standard type. If the images you are going to recognize contain such checkmarks, you can train FineReader Engine to recognize them.
To recognize checkmarks of a non-standard type:
1. Find an image with some unchecked checkmarks of the type you want to recognize. It can be an image of an empty form that contains the checkmarks.
2. Create an FRDocument object from this image. For example, you can use the CreateFRDocumentFromImage method of the Engine object.
3. Obtain the page with the image of the checkmarks from the collection of pages of the document (IFRDocument::Pages), using the properties and methods of the FRPages collection.
4. Obtain the Layout object which corresponds to this page via the IFRPage::Layout property.
5. Specify the region and type of each checkmark block on the page:
   - Create a Region object using the IEngine::CreateRegion method and add the rectangles of the checkmark region to it using the IRegion::AddRect method.
   - Create a Block object of the checkmark type and add it to the collection of layout blocks (ILayout::Blocks) using the ILayoutBlocks::AddNew method (use the BT_Checkmark constant and the created Region object as input parameters).
   - Obtain the CheckmarkBlock object (use the IBlock::GetAsCheckmarkBlock method) and set its CheckmarkType property to CMT_Custom.
   Important! For correct recognition results, it is essential to set the type and the region of the checkmark correctly.
6. Train FineReader Engine to recognize this type of checkmark: call the LearnCheckmarks method of the FRPage object.
7. As a result, the TrainingData property of the CheckmarkBlock object which you created before training will contain information on the custom checkmark type. This information can now be used to recognize other checkmarks of the same type. You can save it to file or memory using the SaveToFile method of the CheckmarkTrainingData object.
8. Create FRDocument objects from the images that contain checkmarks of this type, specify the checkmark blocks on the pages, and set the checkmark type to CMT_Custom. The procedure is described in steps 2 through 5.
9. Initialize the TrainingData property of each CheckmarkBlock object with the CheckmarkTrainingData object obtained during training. For example, you can copy the object using the CopyFrom method or load it from file or memory using the LoadFromFile method of the CheckmarkTrainingData object.
10. Call any of the recognition methods of the FRDocument or FRPage object, e.g., the IFRDocument::Recognize method.
11. If you are not satisfied with the recognition results, you can tune the settings further with the BlackThreshold and SuspiciousDistance properties of the CheckmarkBlock object. After training, the default values of these properties are replaced with values that can be expected to work in most cases. When you load the CheckmarkTrainingData object for a checkmark block, the values of these properties are also loaded. You can experiment with changing these values and re-recognizing the checkmarks (repeating step 10). When you have found the best configuration, save the CheckmarkTrainingData object again and use the new object to recognize checkmarks of your custom type.
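The experiment of changing a value and re-recognizing can be organized as a simple sweep over candidate values. The sketch below is plain Python and does not call FineReader Engine. It assumes, purely for illustration, that a threshold acts as a single cut-off on the black-pixel percentage of a region; the actual meaning of BlackThreshold and SuspiciousDistance is defined by the engine's property reference. The Sample type and both functions are hypothetical, and the sample values are invented.

```python
from typing import List, Tuple

Sample = Tuple[float, bool]   # (black percentage of the region, truly checked?)


def accuracy(samples: List[Sample], threshold: float) -> float:
    """Share of samples that a single cut-off classifies correctly."""
    hits = sum((pct >= threshold) == checked for pct, checked in samples)
    return hits / len(samples)


def best_threshold(samples: List[Sample], candidates: List[float]) -> float:
    """Return the candidate with the highest accuracy (first one wins ties)."""
    return max(candidates, key=lambda t: accuracy(samples, t))


# Invented verification set: three empty boxes and three marked ones.
SAMPLES: List[Sample] = [
    (5.0, False), (8.0, False), (12.0, False),
    (18.0, True), (30.0, True), (45.0, True),
]
CANDIDATES: List[float] = [10.0, 15.0, 20.0, 25.0]
```

On this invented data, best_threshold(SAMPLES, CANDIDATES) selects 15.0, the only candidate that separates all six samples. In a real workflow each candidate would instead be written to the checkmark block, recognition repeated, and the results compared against manually verified states.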
7/3/2024 8:50:10 AM