How to Recognize Text on Photos
This guide explains how to use Mobile Capture SDK as a general-purpose OCR solution that recognizes text on existing images.
How it Works
Mobile Capture SDK provides access to single-image processing functions, which enable generic OCR functionality. This scenario works with any image file you can load into memory. It does not require access to the camera on the device.
Implementation
Note: Before you begin, see How to Add the Library to Your Android Studio Project.
To implement the image recognition scenario, follow these steps:
- Begin by implementing the TextRecognitionCallback interface. Its methods are used to get status information and to control the recognition process. Here are brief recommendations on what the methods should do:
  - The onProgress method reports recognition status. It also allows you to interrupt the recognition process.
  - The onTextOrientationDetected method provides information about the normal orientation of the image, which may be used to rotate the image.
  - The onError method handles processing errors.
- Call the Engine.load method to create an engine object, through which all other objects are created. Reuse this object for every new operation; do not create it again in the same activity.
- Use the createRecognitionCoreAPI method of the Engine object to create a recognizer object (implementing the IRecognitionCoreAPI interface). Use this object on the thread on which it was created; you may also create several objects on different threads and use them concurrently. All IRecognitionCoreAPI method calls are synchronous (they do not return until the operation is completed), so the recognizer should not be used on the UI thread.
- If you want to change recognition settings, use IRecognitionCoreAPI.getTextRecognitionSettings to get a TextRecognitionSettings object, then use its methods to set the recognition area and text language.
- If you are using a recognition language other than English, specify it using the TextRecognitionSettings.setRecognitionLanguage method. Multiple languages are also supported, although setting too many languages may decrease recognition performance.
- It is also recommended to call the TextRecognitionSettings.setAreaOfInterest method to specify the rectangular area of the image in which to search for text. For example, your application may provide controls that allow the user to select a smaller part of the image for recognition. For best results, leave a margin of at least half the size of a typical printed character between the boundary of the area of interest and the text.
- You can also set the number of processing threads using the object returned by IRecognitionCoreAPI.getProcessingSettings (ProcessingSettings interface).
- To start recognition, call the recognizeText method of the IRecognitionCoreAPI interface. Its required input parameters are the bitmap to process and an instance of your TextRecognitionCallback implementation. The recognizer will start several worker threads and continue interacting with your application via the TextRecognitionCallback interface.
- When finished, the recognizeText method returns an array of TextBlock objects, which contain the recognition results for the text areas found on the image. Each TextBlock contains one or more text lines represented by TextLine objects. Each TextLine contains the bounding quadrangle of a single line of text and the recognized text as a string. Work with the results on your side.
- When you have obtained a suitable result, and also when the application is paused or closed, call the IRecognitionCoreAPI.close method to release resources.
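The control flow of the steps above can be sketched in plain Java. This is a minimal illustration of the pattern only, not the SDK's API: the TextRecognitionCallback and Recognizer interfaces below are simplified, hypothetical stand-ins (the real types and signatures are in the API Reference), FakeRecognizer simulates recognition, and the convention that onProgress returns true to request an interruption is an assumption of this sketch. It shows a blocking recognizeText call being run on a worker thread, a callback that can interrupt processing, and the recognizer being closed on the thread that created it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical, simplified stand-in for the SDK callback interface.
interface TextRecognitionCallback {
    // Assumption of this sketch: returning true asks the recognizer to stop.
    boolean onProgress(int percentage);
    void onTextOrientationDetected(int degrees);
    void onError(Exception e);
}

// Hypothetical stand-in for a recognizer with a blocking recognizeText call.
interface Recognizer {
    List<String> recognizeText(int[] pixels, TextRecognitionCallback callback);
    void close();
}

// Fake recognizer that "finds" one text line per 25% of progress.
final class FakeRecognizer implements Recognizer {
    @Override
    public List<String> recognizeText(int[] pixels, TextRecognitionCallback callback) {
        List<String> lines = new ArrayList<>();
        callback.onTextOrientationDetected(0);
        for (int percentage = 25; percentage <= 100; percentage += 25) {
            if (callback.onProgress(percentage)) {
                return lines; // the caller asked to interrupt
            }
            lines.add("text line " + lines.size());
        }
        return lines;
    }

    @Override
    public void close() {
        // A real recognizer would release its resources here.
    }
}

final class RecognitionWorkflowSketch {
    // Runs the blocking call on a worker thread and stops once progress
    // reaches stopAtPercent (pass a value above 100 to never stop).
    static List<String> recognizeOnWorker(int[] pixels, int stopAtPercent) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            Future<List<String>> pending = worker.submit(() -> {
                // Create, use, and close the recognizer on the same thread.
                Recognizer recognizer = new FakeRecognizer();
                try {
                    return recognizer.recognizeText(pixels, new TextRecognitionCallback() {
                        @Override
                        public boolean onProgress(int percentage) {
                            return percentage >= stopAtPercent;
                        }

                        @Override
                        public void onTextOrientationDetected(int degrees) {
                            // An app could rotate the image by this angle.
                        }

                        @Override
                        public void onError(Exception e) {
                            System.err.println("Recognition failed: " + e);
                        }
                    });
                } finally {
                    recognizer.close();
                }
            });
            return pending.get(); // a real app would not block the UI thread here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Recognition task failed", e.getCause());
        } finally {
            worker.shutdown();
        }
    }
}
```

With these stand-ins, recognizeOnWorker(pixels, 101) runs to completion and returns four lines, while recognizeOnWorker(pixels, 50) is interrupted early and returns one line. In an Android application the result would typically be posted back to the UI thread rather than awaited with a blocking get.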
See the description of classes and methods in the API Reference section.
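The recommendation to leave about half a character of space around the text when choosing the area of interest can be illustrated with a small helper. This is a sketch under stated assumptions: Box is a hypothetical stand-in for android.graphics.Rect so that the example runs on a plain JVM, withMargin is not an SDK method, and the character height is an estimate that the application would have to supply itself.

```java
// Hypothetical stand-in for android.graphics.Rect (same field names).
final class Box {
    final int left;
    final int top;
    final int right;
    final int bottom;

    Box(int left, int top, int right, int bottom) {
        this.left = left;
        this.top = top;
        this.right = right;
        this.bottom = bottom;
    }
}

final class AreaOfInterestSketch {
    // Expands a tight text rectangle by half the estimated character height
    // (rounded up) on every side and clamps the result to the image bounds.
    static Box withMargin(Box text, int charHeightPx, int imageWidth, int imageHeight) {
        int margin = (charHeightPx + 1) / 2;
        return new Box(
                Math.max(0, text.left - margin),
                Math.max(0, text.top - margin),
                Math.min(imageWidth, text.right + margin),
                Math.min(imageHeight, text.bottom + margin));
    }
}
```

For example, a selection of (100, 200, 500, 260) with an estimated character height of 40 pixels on a 1000 by 800 image becomes (80, 180, 520, 280). In an Android application the resulting coordinates would be used to build the rectangle passed to TextRecognitionSettings.setAreaOfInterest.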
02.03.2022 12:59:15