Standard recognition options
For the Standard recognition mode, specify the following:
- ICR - Enable this option if the field contains handwritten or hand-printed static text. As the style of writing can vary significantly from country to country, you need to specify the appropriate country.
Sample writing styles for digits: Russian "2"; Regular "5" (can be found in any style); Japanese "5"; Russian "1"; German "1"; Regular "9" (can be found in any style); Czech "9" (with an articulated circle).
- OCR (printed) - Enable this option if the field contains machine-printed static text. Select the print style from the drop-down list (typographic, dot-matrix printer, typewriter, etc.). See also Supported text types.
- Advanced - Use this option to select several text types or to specify a custom text type. This option also allows you to load a pattern file in PTN or FBT format. To load a pattern, click Modify.... In the dialog box that opens, select Use pattern and specify the path to the file.
A pattern is a set of pairs, each linking a character image to the corresponding keyboard character, created through pattern training. A user pattern gives the program additional information for recognition. Pattern training is useful for:
- texts set in decorative fonts
- texts containing unusual characters (e.g. mathematical symbols)
Note: When using a pattern file in FBT format:
- Only uncertainly recognized characters will be replaced with values from the pattern file when a field is recognized.
- All characters will be replaced with values from the pattern file during full-text recognition.
Important! Pattern training is not supported for Asian languages.
- Marking type - Select the marking type; to simplify your choice, select samples from the drop-down list. If the marking disappears during scanning, your marking type is monospaced (Grey boxes). If the marking does not disappear during scanning and is divided into cells for character spaces, you must enter the total number of cells. When the program detects fields with such marking automatically, the number of cells is also determined automatically.
Note: For some marking types (Grey boxes, Simple, and Underlined), words that have been divided at the end of a line are automatically joined back together. If the selected marking type is either Simple or Underlined, words are joined only if a hyphen was used to split them. If the Grey boxes marking type is used, split words are detected and joined even if no hyphen was used. In all cases, a word is joined back together if the joined word is found in the dictionary.
- Letter case - Select the case of letters in the field. If both lowercase and uppercase letters are possible, leave the Auto option enabled.
- Orientation - Specify the text orientation.
- Direction of CJK text - Select the reading direction to be used for OCR of texts in Chinese, Japanese, or Korean. Possible options are Auto, Horizontal Script, or Vertical Script. Auto is selected by default and is the recommended option for fields that do not contain any CJK text.
- One line - Select this option for fields that will always consist of a single line. Using this option ensures that text in the field will never be interpreted as multi-line text because of poorly recognized writing or characters of varying height.
Note: Disable this option for multi-line fields.
- One word - Enable this option for fields whose value will always consist of a single word. You can also enable this option if you want to apply a regular expression to the entire field irrespective of the number of words in it.
Note: With the One word option enabled, it is not recommended to have expressions in the custom dictionary that contain the space character.
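The word-joining rules from the Marking type note above can be illustrated with a minimal sketch. This is not the program's implementation: the function name `join_split_words`, the plain-string marking type values, and the use of a Python set as the dictionary are all assumptions made for illustration.

```python
# Hypothetical sketch of the word-joining rules; not the product's actual code.
HYPHENS = ("-", "\u2010")  # ASCII hyphen-minus and Unicode hyphen

JOINING_TYPES = ("Grey boxes", "Simple", "Underlined")


def join_split_words(lines, marking_type, dictionary):
    """Join words split across consecutive lines of one field.

    lines        -- recognized text lines, top to bottom
    marking_type -- marking type name; only JOINING_TYPES trigger joining
    dictionary   -- set of known lowercase words (assumed lookup structure)
    """
    if marking_type not in JOINING_TYPES or not lines:
        return " ".join(lines)

    words = lines[0].split()
    for line in lines[1:]:
        next_words = line.split()
        if words and next_words:
            head, tail = words[-1], next_words[0]
            has_hyphen = head.endswith(HYPHENS)
            # Simple and Underlined require a hyphen; Grey boxes does not.
            if has_hyphen or marking_type == "Grey boxes":
                candidate = head.rstrip("".join(HYPHENS)) + tail
                # Join only if the joined word is found in the dictionary.
                if candidate.lower() in dictionary:
                    words[-1] = candidate
                    next_words = next_words[1:]
        words.extend(next_words)
    return " ".join(words)
```

For example, with a dictionary containing "recognition", the lines "text recog-" and "nition options" are joined for the Simple marking type, whereas "text recog" and "nition options" (no hyphen) are joined only for Grey boxes.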
Specify image pre-processing settings:
- Invert inverts image colors and brightness during recognition. This inversion is temporary and only affects recognition; the original image colors are retained in the output file. Select one of the following values:
  - Autodetect automatically detects the text color and background color and inverts them if necessary. This is the recommended setting for documents that contain both light text on a dark background and dark text on a light background.
  - Invert inverts images completely.
  - Don't invert keeps the original colors (this option is enabled by default).
- Remove texture - Enable this option to remove background texture from the image.
- Despeckle - Enable this option to remove garbage (specks and other small noise) from the image.
- Clear the garbage of specified size only - Enable this option if you want to remove only garbage of a specified size, and then specify the garbage size. If this option is disabled and the Despeckle option is enabled, the garbage size is selected automatically.
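As a rough illustration of what the Autodetect and Despeckle settings do conceptually, the sketch below works on a tiny binary image (a list of rows, where 1 is a dark pixel and 0 is a light pixel). Everything in it is an assumption made for illustration: the function names, the "dark pixels dominate" heuristic for detecting light-on-dark text, and the reading of the specified garbage size as an upper bound in pixels. The program's actual algorithms are not described in this section.

```python
from collections import deque


def autodetect_invert(image):
    """Invert the image if dark pixels dominate (assumed heuristic for
    light text on a dark background); otherwise return an unchanged copy."""
    total = sum(len(row) for row in image)
    dark = sum(sum(row) for row in image)
    if total and dark * 2 > total:
        return [[1 - pixel for pixel in row] for row in image]
    return [row[:] for row in image]


def despeckle(image, max_speck_size):
    """Clear every 4-connected group of dark pixels that contains at most
    max_speck_size pixels. Assumes a rectangular image."""
    height = len(image)
    width = len(image[0]) if image else 0
    result = [row[:] for row in image]
    seen = [[False] * width for _ in range(height)]

    for y in range(height):
        for x in range(width):
            if image[y][x] != 1 or seen[y][x]:
                continue
            # Collect one connected component with a breadth-first search.
            component = []
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                component.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and image[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Small components are treated as garbage and erased.
            if len(component) <= max_speck_size:
                for cy, cx in component:
                    result[cy][cx] = 0
    return result
```

Calling `despeckle(page, 1)` on an image with a six-pixel character stroke and one isolated dark pixel would erase only the isolated pixel; a larger size limit risks removing small marks that belong to the text, such as dots and punctuation.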