
User Dictionaries

A User Dictionary is an auxiliary dictionary created by the user which contains words not included in the built-in dictionaries. Built-in dictionaries can be supplemented with a user dictionary to improve the quality of data capture. Typically, a user dictionary will contain specialized terms, abbreviations, company names, etc.

You can set up and enable user dictionaries on the Languages tab of the Pre-recognition Properties dialog. To open this dialog, open the FlexiLayout or Classifier menu, click Properties, and then click the Advanced pre-recognition properties... button.

Existing user dictionaries are listed in the User Dictionaries field along with the following information:

  • Enabled
    Indicates whether the User Dictionary is being used.
  • Name
    The name of the custom dictionary.
  • Language
    The language of the custom dictionary.

Creating and editing a custom dictionary

To add a custom dictionary, click the Add... button on the Languages tab of the Pre-recognition properties dialog. Then, in the Add New Dictionary dialog box that opens, specify the following properties:

  • Dictionary name
    The name of the dictionary.
  • Definition method
    The dictionary's type:
    • Dictionary file
      This type of dictionary is based on a DIC file.
    • Regular expression
      This type of dictionary is based on a regular expression.

Note. A dictionary's type cannot be changed after the dictionary has been created.

  • Is language-neutral
    If this option is enabled, the custom dictionary will be used alongside the built-in dictionaries of all enabled recognition languages. The Language parameter will be set to Neutral and can no longer be changed. This option is only available for user dictionaries of the Regular expression type.
  • Language (locale)
    The dictionary's language can be selected from this drop-down list. A dictionary's language cannot be changed after the dictionary has been created.

When you click the OK button, a dialog box with a list of all editable and uneditable parameters will appear.

To view a dictionary's properties, select a dictionary and then click the Edit button. The General tab of a dictionary's properties dialog contains its general properties: the dictionary's name, type, language, and any user comments. The name of a dictionary can be changed after the dictionary has been created.
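
The rules above about which properties can change after creation can be summarized in a small model. This is only an illustrative sketch in Python, not part of the product: the class and attribute names are hypothetical, and the product itself enforces these rules through its dialogs.

```python
from enum import Enum


class DefinitionMethod(Enum):
    DICTIONARY_FILE = "Dictionary file"
    REGULAR_EXPRESSION = "Regular expression"


NEUTRAL = "Neutral"


class UserDictionary:
    """Hypothetical model of a user dictionary's general properties."""

    def __init__(self, name, method, language, language_neutral=False):
        # Language-neutral dictionaries are only allowed for the
        # Regular expression type.
        if language_neutral and method is not DefinitionMethod.REGULAR_EXPRESSION:
            raise ValueError("Only regular expression dictionaries can be language-neutral")
        self.name = name          # the name may be changed later
        self._method = method     # fixed once the dictionary is created
        self._language = NEUTRAL if language_neutral else language  # fixed as well

    @property
    def method(self):
        return self._method

    @property
    def language(self):
        return self._language


terms = UserDictionary("Phone numbers", DefinitionMethod.REGULAR_EXPRESSION,
                       "French", language_neutral=True)
terms.name = "French phone numbers"  # allowed: renaming
print(terms.language)                # Neutral
```

Because `method` and `language` are read-only properties, trying to assign to them raises `AttributeError`, which mirrors the fact that these two settings cannot be edited after creation.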

The Dictionary tab of a dictionary's properties dialog contains settings that depend on the dictionary's type and user alphabet settings.

  • If you selected the Dictionary file type when creating the dictionary, click the button and specify the path to a DIC file, or create a new dictionary by clicking Create new dictionary.... If you decide to create a new dictionary, specify its name in the Create New Dictionary File dialog box, click OK, and then specify its settings in the editor.

    Commands of the dictionary editor

When a dictionary is created, it is saved as a DIC file in the project folder.

  • If you selected the Regular expression dictionary type, specify the expression. You can click the button to open a menu that will help you create the expression. For details, see Alphabet used in regular expressions.

Note. Words from a user dictionary have a higher priority than words from a built-in dictionary. Enabling the Prefer words from dictionary option further increases the priority of words from a dictionary.

User Alphabets

You can create a user alphabet in the Alphabet group on the Dictionary tab of a dictionary's properties dialog. A user alphabet is a set of characters, separators, prefixes and suffixes that may be used in a user dictionary. If a user alphabet is used together with a user dictionary, dictionary words that contain characters which are not in the user alphabet are considered to be non-dictionary words. In other words, a user alphabet can be used to limit the set of characters that are permitted for custom dictionaries.
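
The effect of a user alphabet on dictionary lookups can be illustrated with a short Python sketch. It is not the program's actual implementation: the function name, the sample words and the sample alphabet are all assumptions made for illustration.

```python
def is_dictionary_word(word, dictionary, alphabet=None):
    """Return True if `word` counts as a dictionary word.

    When a user alphabet is given, a word containing any character
    outside that alphabet is treated as a non-dictionary word, even
    if it is listed in the dictionary.
    """
    if alphabet is not None and any(ch not in alphabet for ch in word):
        return False
    return word in dictionary


# Hypothetical sample data.
user_dictionary = {"invoice", "OCR-2000", "Ltd"}
user_alphabet = set("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz")

print(is_dictionary_word("invoice", user_dictionary, user_alphabet))   # True
print(is_dictionary_word("OCR-2000", user_dictionary, user_alphabet))  # False
print(is_dictionary_word("OCR-2000", user_dictionary))                 # True
```

In the second call, "OCR-2000" is listed in the dictionary but contains a hyphen and digits that are not in the alphabet, so it is rejected.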

To specify permitted characters, enable the Use custom alphabet option and then specify the alphabet's symbols in the text box or by using the editor. You can open the editor by clicking the Edit button.

To specify permitted separators, prefixes, suffixes and ignored characters (collectively referred to as punctuation marks in this section for simplicity's sake), click the Advanced parameters button. In the Advanced custom alphabet parameters dialog box, specify the punctuation marks.

  • Punctuation marks adjoining the beginning of word
    Punctuation marks that may adjoin the beginning of a word, such as the underscore in "_unknown".
  • Punctuation marks adjoining the end of word
    Punctuation marks that may adjoin the end of a word, such as the ampersand in "user&".
  • Standalone punctuation marks
    Punctuation marks that may occur separately, such as the vertical bar in "January |".
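
The three kinds of punctuation marks can be pictured with the following Python sketch. It only illustrates the idea under simplifying assumptions (tokens separated by whitespace, hypothetical function and parameter names) and does not describe how the program processes text internally.

```python
def strip_adjoining(token, leading, trailing, standalone):
    """Return the bare word left after removing permitted punctuation.

    Returns None when the token is itself a standalone punctuation mark.
    """
    if token in standalone:
        return None
    start, end = 0, len(token)
    # Drop permitted marks adjoining the beginning of the word.
    while start < end and token[start] in leading:
        start += 1
    # Drop permitted marks adjoining the end of the word.
    while end > start and token[end - 1] in trailing:
        end -= 1
    return token[start:end]


leading, trailing, standalone = {"_"}, {"&"}, {"|"}

print(strip_adjoining("_unknown", leading, trailing, standalone))  # unknown
print(strip_adjoining("user&", leading, trailing, standalone))     # user
print(strip_adjoining("|", leading, trailing, standalone))         # None
```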

In the Exclusion characters group of options, you can specify symbols that the program should ignore when checking words against the user dictionary. For example, if you specify the regular expression "+33NNNNNNNN" and the hyphen (-) as an ignored character, the program will consider "+33-11111111", "+33-111-11-111" and "+33-111-111-11" as matches for the regular expression.
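
The example above can be reproduced with a minimal Python sketch. It assumes that each "N" in the expression stands for a single digit and translates the expression into Python's own `re` syntax, which differs from the syntax used in the program. The function and constant names are hypothetical.

```python
import re

# Assumption: "+33NNNNNNNN" means "+33" followed by eight digits.
PATTERN = re.compile(r"\+33\d{8}")
IGNORED = "-"


def matches_with_ignored(text, pattern=PATTERN, ignored=IGNORED):
    """Remove ignored characters, then require the whole text to match."""
    stripped = text.translate({ord(ch): None for ch in ignored})
    return pattern.fullmatch(stripped) is not None


for candidate in ("+33-11111111", "+33-111-11-111", "+33-111-111-11", "+33-1111"):
    print(candidate, matches_with_ignored(candidate))
```

The first three candidates each reduce to "+33" followed by eight digits once the hyphens are removed, so they match. The last one has too few digits and does not.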

25.05.2023 7:55:03
