Training User Patterns

If the IRecognizerParams::TrainUserPatterns property is set to TRUE, recognition runs in pattern training mode: whenever an unknown character is encountered, the Pattern Training dialog opens and displays the character image.

Note: You can also use the IEngine::TrainUserPattern method to perform pattern training without showing the dialog. This method takes as input parameters the TrainingImagesCollection object, which stores a collection of character images, and the character itself.

Training to recognize a character

The frame in the top dialog window must enclose exactly one character, and the whole of that character. If the frame encloses only part of the character or more than one character, click the frame borders and drag them until both requirements are met. The frame-adjustment buttons in the dialog move the frame border as well (and are useful for training italic symbols). Once you have positioned the frame correctly, type in the character and click the Train button.

Notes:

  • You may only train the system to read characters included in the alphabet. If you wish to train ABBYY FineReader Engine to read a character that cannot be entered from the keyboard, either use a combination of two characters to denote it or copy the required character from the Character Table (opened with the corresponding button in the Pattern Training dialog).
  • If you wish to train the system to retain character formatting, select the corresponding Italic or Bold item in the Pattern Training dialog before clicking the Train button.
  • Match the case of the character you type to the case of the character in the image: enter uppercase characters for uppercase images and lowercase characters for lowercase images.

If you make a mistake during training, click the Back button to return the frame to its previous position. The last "image — character" pair to be entered will automatically be removed from the pattern. Note that this "undo" function is limited to the last word trained.

Training to recognize ligatures

A ligature is a combination of two or three characters that are joined together in print, for example, fi, fl, ffi. Because the characters are joined as part of the printing process, they are difficult to separate, and better results can be obtained by treating them as single compound characters.

Training ligatures is no different from training separate characters:

  1. Make sure the frame in the top dialog window encloses the entire ligature. You can move the frame border using the mouse or by clicking the frame-adjustment buttons in the dialog.
  2. Type the necessary character combination and click the Train button.

Each pattern may contain up to 1000 new characters. However, you should not create too many ligatures, as this may adversely affect the recognition quality.
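The bookkeeping described above (pairs of image and character, ligatures stored as multi-character labels, the Back button, and the 1000-character limit) can be illustrated with a small model. This is only a sketch: UserPatternModel and its methods are hypothetical names, not ABBYY FineReader Engine types, and it assumes that the limit counts distinct labels, which the text does not specify.

```python
from dataclasses import dataclass, field

# Limit stated in the text; assumed here to count distinct labels.
MAX_PATTERN_CHARACTERS = 1000


@dataclass
class UserPatternModel:
    """Hypothetical in-memory model of a user pattern (not an Engine type)."""

    # Each entry is an (image_id, label) pair; a label may be a single
    # character or a multi-character combination such as the ligature "fi".
    pairs: list = field(default_factory=list)

    def labels(self):
        """Return the set of distinct labels trained so far."""
        return {label for _, label in self.pairs}

    def train(self, image_id, label):
        """Add one image/label pair, enforcing the assumed size limit."""
        if not label:
            raise ValueError("label must not be empty")
        known = self.labels()
        if label not in known and len(known) >= MAX_PATTERN_CHARACTERS:
            raise OverflowError("pattern already holds the maximum number of characters")
        self.pairs.append((image_id, label))

    def back(self):
        """Remove and return the last pair, loosely mimicking the Back button."""
        return self.pairs.pop() if self.pairs else None

    def multi_char_labels(self):
        """Return labels longer than one character (ligatures or two-character codes)."""
        return {label for label in self.labels() if len(label) > 1}
```

A count of multi_char_labels() gives a rough way to notice when a pattern is accumulating many ligatures, which the text warns may reduce recognition quality.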

Training limitations

You should also take the following limitations into account when you train ABBYY FineReader Engine:

  • ABBYY FineReader Engine does not differentiate between certain characters, which are usually considered different. Such images are recognized as one and the same character. For example, the straight ('), right (’), and left (‘) apostrophes are kept in the pattern as one character — the straight apostrophe. Thus, you will never see the right and left apostrophes in the recognized text, even if you try to train them.
  • In some cases, how an image is recognized depends on its environment, so the same image may be recognized as different characters in different contexts.
  • Pattern training is not supported for CJK languages.
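One practical consequence of the apostrophe limitation is that reference text containing typographic apostrophes will not match the recognized text character for character. The following sketch shows one way to normalize a reference string before comparing it with recognition output. The function normalize_apostrophes is a hypothetical helper written for this illustration and is not part of the ABBYY FineReader Engine API.

```python
# Map the right (U+2019) and left (U+2018) apostrophes to the straight
# apostrophe, mirroring how the text says they are kept in the pattern.
_APOSTROPHES = str.maketrans({"\u2019": "'", "\u2018": "'"})


def normalize_apostrophes(text: str) -> str:
    """Return text with typographic apostrophes replaced by the straight one."""
    return text.translate(_APOSTROPHES)


def matches_recognized(reference: str, recognized: str) -> bool:
    """Compare a reference string with recognized text, ignoring apostrophe style."""
    return normalize_apostrophes(reference) == normalize_apostrophes(recognized)
```

For example, matches_recognized("don\u2019t", "don't") treats the two spellings as equal, whereas a plain string comparison would not.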

See also

Recognizing with Training

RecognizerParams

17.09.2024 15:14:40
