If the Program Fails to Recognize Some of the Characters

ABBYY FineReader uses data about the document language when recognizing text. The program may fail to recognize some characters in documents with uncommon elements (e.g. code numbers) because the alphabet of the document language might not include these characters. To recognize such documents, you can create a user language that contains all of the necessary characters. You can also combine several languages into a language group and use the group for recognition.
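
Before creating a user language, it can help to find out which characters in your documents fall outside the alphabet of the base language. The following Python sketch illustrates such a check. It is not part of FineReader: the function name, the `BASE_ALPHABET` value, and the sample text are hypothetical, and the real alphabet should be taken from the language you intend to use as the base.

```python
def missing_characters(sample_text: str, base_alphabet: str) -> list[str]:
    """Return the characters of sample_text that are absent from base_alphabet.

    Whitespace is skipped. The result has no duplicates and is sorted
    by code point.
    """
    known = set(base_alphabet)
    missing = {ch for ch in sample_text if not ch.isspace() and ch not in known}
    return sorted(missing)


# Hypothetical alphabet: basic Latin letters and digits only.
BASE_ALPHABET = (
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789"
)

if __name__ == "__main__":
    # Hypothetical code numbers typed in from a sample page.
    sample = "Part No. AB-1234/7 #A5_X"
    print(missing_characters(sample, BASE_ALPHABET))
```

Any characters reported by such a check are candidates for the alphabet of the new language.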

Creating a user language

  1. On the Tools menu, click Language Editor…
  2. In the Language Editor dialog box, click New…
  3. In the dialog box that opens, select the Create a new language based on an existing one option, then select the language that will be used as the base for your new language and click OK.
  4. The Language Properties dialog box will open. In this dialog box:
    1. Type the name of the new language.
    2. The base language you selected earlier is displayed in the Source language drop-down list. You can change the source language.
    3. The Alphabet field contains the alphabet of the base language. Click the button next to it if you want to edit the alphabet.
    4. There are several options concerning the Dictionary that will be used by the program when recognizing text and checking the result:
      • None

The language will not have a dictionary.

      • Built-in dictionary

The program's built-in dictionary will be used.

      • User dictionary

Click the Edit… button to specify dictionary terms or import an existing custom dictionary or a text file with Windows-1252 encoding (terms must be separated by spaces or other characters that are not in the alphabet).

Note: Words from the user dictionary will not be marked as wrong when spelling in the recognized text is checked. They may be written in all lower-case or all upper-case letters, or may begin with an upper-case letter.

Word in the dictionary | Words that will not be considered wrong during a spelling check
abc                    | abc, Abc, ABC
Abc                    | abc, Abc, ABC
ABC                    | abc, Abc, ABC
aBc                    | aBc, abc, Abc, ABC
      • Regular expression

You can create a custom language dictionary using regular expressions.

For details, see "Regular Expressions."

    5. Languages can have several additional properties. To change these properties, click the Advanced… button.

The Advanced Language Properties dialog box will open. Here you can specify:

  • Characters that can be in the beginning or end of a word
  • Non-letter characters that appear separately from words
  • Characters that may appear in the middle of words and should be ignored
  • Characters that cannot appear in text that is recognized using this dictionary (prohibited characters)
  • All recognizable characters from the language
  • You can also enable the Text may contain Arabic numerals, Roman numerals, and abbreviations option
  5. You can now select the newly created language when choosing document languages.

See "Document Features to Consider Prior to OCR" for more information about document languages.
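
The capitalization rule described in the note on user dictionaries can be summarized in a few lines of Python. This is only a sketch inferred from the examples in that note, not FineReader's actual implementation, and the function names are hypothetical: a dictionary word is accepted as written, in all lower-case letters, with an initial upper-case letter, and in all upper-case letters.

```python
def accepted_spellings(dictionary_word: str) -> set[str]:
    """Return the spellings that a spelling check would not flag,
    according to the rule inferred from the note on user dictionaries."""
    return {
        dictionary_word,               # exactly as entered, e.g. "aBc"
        dictionary_word.lower(),       # all lower-case, e.g. "abc"
        dictionary_word.capitalize(),  # initial upper-case, e.g. "Abc"
        dictionary_word.upper(),       # all upper-case, e.g. "ABC"
    }


def is_flagged(word: str, dictionary_word: str) -> bool:
    """True if word would be marked as wrong for this dictionary entry."""
    return word not in accepted_spellings(dictionary_word)
```

Under this rule, a mixed-case form such as aBc is accepted only if the dictionary contains it in exactly that form.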

By default, the user language is saved in the FineReader document folder. You can also save all user languages and user patterns as a single file. To do so, on the Tools menu, click Options… to open the Options dialog box, click the Read tab, and then click the Save to File… button.
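
If you maintain your dictionary terms outside of FineReader, a text file suitable for import into a user dictionary can be prepared with a short script. The Python sketch below assumes the requirements stated for user dictionaries above: Windows-1252 encoding and terms separated by spaces. The function name, the alphabet, the file name, and the terms are hypothetical. The check that each term uses only alphabet characters is an inference from the separator rule, since a character outside the alphabet would act as a separator and split the term.

```python
import tempfile
from pathlib import Path


def write_user_dictionary(terms: list[str], alphabet: str, path: str) -> None:
    """Write terms to path as a space-separated Windows-1252 text file."""
    allowed = set(alphabet)
    if " " in allowed:
        # A space only works as a separator if it is not in the alphabet.
        raise ValueError("the alphabet must not contain the space character")
    for term in terms:
        outside = sorted(set(term) - allowed)
        if outside:
            raise ValueError(
                f"{term!r} contains characters outside the alphabet: {outside}"
            )
    # "cp1252" is Python's codec name for Windows-1252. Characters that
    # cannot be encoded raise UnicodeEncodeError.
    Path(path).write_text(" ".join(terms), encoding="cp1252")


if __name__ == "__main__":
    # Hypothetical alphabet and terms for documents with code numbers.
    alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-/"
    target = Path(tempfile.mkdtemp()) / "code_numbers.txt"
    write_user_dictionary(["AB-1234", "XK/77", "ZZ-0001"], alphabet, str(target))
    print(target.read_text(encoding="cp1252"))
```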

Creating a language group

If you are going to use a particular language combination regularly, you may wish to group the languages together for convenience.

  1. On the Tools menu, click Language Editor….
  2. In the Language Editor dialog box, click New….
  3. In the New Language or Group dialog box, select Create a new group of languages and click OK.
  4. In the Language Group Properties dialog box, type a name for your new group and select the desired languages.

Note: If you know that your text will not contain certain characters, you may wish to explicitly specify these so-called prohibited characters. Specifying prohibited characters can increase both recognition speed and quality. To specify prohibited characters, click the Advanced… button in the Language Group Properties dialog box. In the Advanced Language Group Properties dialog box, type the prohibited characters in the Prohibited characters field.

  5. Click OK.

The newly created group will be added to the Document Languages drop-down list on the main toolbar.

By default, user language groups are saved in the FineReader document folder. You can also save all user languages and user patterns as a single file. To do so, on the Tools menu, click Options… to open the Options dialog box, click the Read tab, and then click the Save to File… button.
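
One way to picture the effect of prohibited characters in a language group is as set arithmetic over alphabets. The Python sketch below is a conceptual model only and makes an assumption about how the program combines languages: the group allows the union of its members' alphabets, minus the prohibited characters. The class, the language names, and the alphabets are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LanguageGroup:
    """Conceptual model of a language group (not a FineReader API)."""

    name: str
    alphabets: dict[str, str]  # language name -> alphabet (hypothetical values)
    prohibited: str = ""       # characters that must never appear in the text

    def allowed_characters(self) -> set[str]:
        """Union of all member alphabets, minus the prohibited characters."""
        combined: set[str] = set()
        for alphabet in self.alphabets.values():
            combined.update(alphabet)
        return combined - set(self.prohibited)


if __name__ == "__main__":
    group = LanguageGroup(
        name="Invoices",
        alphabets={"LangA": "ABCO0123", "LangB": "XYZ0123"},
        prohibited="O",  # the letter O never occurs, only the digit 0
    )
    print(sorted(group.allowed_characters()))
```

In this model, prohibiting a character such as the letter O leaves fewer candidates to weigh against the digit 0, which matches the note's point that prohibited characters can improve speed and quality.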

Tip: If you need a particular language combination for a document, you can also select the desired languages directly, without creating a group.

  1. From the Document Languages drop-down list, select More languages….
  2. In the Language Editor dialog box, select Specify languages manually.
  3. Select the desired languages and click OK.

14.01.2020 17:26:19
