Data types of the text entry field
Data type settings are among the most important text field parameters: they affect recognition quality, preliminary checks, and the modification of field data.
ABBYY FlexiCapture uses data types when recognizing text in fields. Specifying the correct data type improves recognition quality and increases the confidence percentage of recognized symbols, thereby reducing the workload of the Data Verification Operator.
ABBYY FlexiCapture's data type creation features are very flexible. You can select data types from a list of existing types or create custom types using the regular expression alphabet or a user dictionary that is suitable for the task at hand.
You can specify data types of text fields on the Data tab of a field's property dialog box. This article covers settings that are common to all fields and settings of specific field types.
General settings for text fields
- Data type (drop-down list)
Use this drop-down list to specify the field's data type (i.e. the type of data you expect to see in the field), such as an address, time, date, a sum of money, a name, a code, text or a number.
- Default value
The value automatically assigned to the text field if it turns out to be empty after the Document Definition is applied or the document is assembled.
The default value is also assigned to a field if the field's region is deleted. If the field was left empty by a user (for example, at the verification stage), the default value will not be assigned to the field.
You can change the default value. If the default value is empty, fields will remain empty until a value is specified.
Note: This option is available for text fields and text columns of tables.
- AutoCorrect options
These options determine the way the program automatically corrects the values of fields. Such corrections can include the replacement or deletion of extra characters or spaces and changing lower-case characters to the upper case and vice versa.
Automatic correction can be used to format data, decreasing the workload of Operators that prepare data for export. You can use the standard AutoCorrect options available in ABBYY FlexiCapture or a custom script, which can be particularly useful for performing complex formatting operations. For details, see Processing entered values.
- Validation
This group of options contains settings that determine how values are checked, and set limits on values, such as limits on the number of characters, the minimum and maximum values for fields that contain numbers, or a range of permitted dates.
The Cannot be blank option in this group determines whether this field is required.
External sources of permitted values
The range of permitted values of a field can be limited using an external database, text file or data set. Limitation and checking parameters depend on the field's data type. If recognized data does not satisfy the checking parameters, formatting errors will be reported in the document window during verification.
Using an external source of permitted values will let you:
- Improve the recognition quality of a field by specifying permitted words in addition to permitted characters. Values from this list will be used by the program during recognition, and output values will be either identical to an item from the list or as close to a list item as possible. If the program cannot find recognized data in the dictionary, only the alphabet will be used during recognition. Use content type settings of fields to set up custom dictionaries.
- Check a field that has already been recognized against a local or external list of permitted values. If the recognized value cannot be found in the list, a formatting error warning will be displayed. You can load and set up a list of permitted values using the validation options of fields.
- Use normalization
Use this option to specify the desired format of text fields of the Time, Date, Amount of Money and Number types. This option is disabled by default, and recognized data is displayed in the form and exported in its original format.
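The check against a list of permitted values described above can be illustrated with a short Python sketch. This is not ABBYY FlexiCapture code: the function name `check_against_permitted`, the sample list and the similarity threshold are assumptions made for the example, and the closest-match search uses Python's standard `difflib` module rather than the program's own recognition logic.

```python
from difflib import get_close_matches


def check_against_permitted(value, permitted, cutoff=0.8):
    """Return (is_permitted, closest_item_or_None) for a recognized value."""
    if value in permitted:
        return True, value
    # Look for the single closest list item; the cutoff is an assumed threshold.
    close = get_close_matches(value, permitted, n=1, cutoff=cutoff)
    return False, (close[0] if close else None)


# Hypothetical list of permitted values for a "city" field.
cities = ["Berlin", "Boston", "Bristol"]
for recognized in ("Boston", "Bostan", "Xyz"):
    print(recognized, check_against_permitted(recognized, cities))
```

A value that is in the list passes the check, a near miss is reported together with the closest list item, and a value with no close match would be flagged without a suggestion.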
Settings of text fields of specific types
Address
Text fields of the Address type have the following settings:
- Details
Options in this group let you change the recognition language, specify a custom dictionary and select more specific types of data, such as e-mail addresses and postal addresses. This group of options is displayed when you click the Edit... button to the right of the text field with a summary of the selected field's properties.
- Language
The recognition language of the field.
- Content settings: General
In this case we recommend connecting a custom dictionary in order to improve recognition quality. To do this, enable the Use custom dictionary option and click Edit... to edit the dictionary. You can create a custom dictionary by adding words to it using the Add... button or by importing words from a TXT or DIC file. To import words from a file, click the Load... button and specify the path to the file.
- Content settings: Special
Use this option to select a data type from a list of predefined data types. The descriptions in the bottom part of the dialog will help you make the right choice. You can also create a custom data type if none of the predefined ones suit your needs. For details, see Creating custom data types.
- Default value
The value automatically assigned to the text field if it turns out to be empty after the Document Definition is applied or the document is assembled. See above for details.
- AutoCorrect options
Options from this group determine the way the program automatically corrects the values of fields. Such corrections can include the replacement or deletion of extra spaces, changing lower-case characters to the upper case and vice versa, and replacing specific characters or fragments of text. This group of options is displayed when you click the Edit... button to the right of the AutoCorrect options field. For details, see Processing entered values.
- Validation
Use the options in this group to place constraints on the contents of fields, such as a limit on the number of characters they can contain.
For details, see Value range constraints for text data type.
The Cannot be blank option in this group determines whether this field is required.
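As an illustration of the kinds of corrections listed under AutoCorrect options (removing extra spaces, changing case, replacing fragments of text), here is a minimal Python sketch. It does not use any ABBYY FlexiCapture API; the function `autocorrect` and its parameters are hypothetical, and the order in which the corrections are applied is an assumption.

```python
import re


def autocorrect(value, *, collapse_spaces=True, case=None, replacements=None):
    """Apply simple text corrections to a field value (illustrative only)."""
    # Replace specific characters or fragments of text first.
    for old, new in (replacements or {}).items():
        value = value.replace(old, new)
    # Collapse runs of whitespace and trim the ends.
    if collapse_spaces:
        value = re.sub(r"\s+", " ", value).strip()
    # Optionally change the case of the whole value.
    if case == "upper":
        value = value.upper()
    elif case == "lower":
        value = value.lower()
    return value


print(autocorrect("  12  main   st. ", case="upper", replacements={"st.": "Street"}))
```

Applying the fragment replacement before the case change keeps the replacement table case-sensitive, which is one possible design choice among several.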
Amount of Money
Text fields of the Amount of Money type have the following settings:
- Process value as text
Enable this option if you want the value of a field to be processed as regular text. This can be useful when a sum of money cannot be brought to the standard format, or when you do not need to process it as a number. A custom dictionary can be used to check the value, and value range constraints can be specified in the same way as for text. For details, see Value range constraints for text data type.
- Details
Use the settings in this group to change the recognition language, specify additional parameters for whole numbers and fractions, and select more specific data types such as two-digit numbers and Roman numerals. To change these options, click the Edit... button to the right of the field with a description of the selected field's parameters.
- Language
The field's recognition language.
- Content settings: General
This group contains additional settings for whole numbers and fractions.
For negative numbers, you can allow the minus sign to be on the right of the number (example: 54-), and allow negative numbers to be written as numbers in parentheses, making (54) equivalent to -54.
The following options are available for fractions: Must have fractional part, Allow more than two digits in fractional part: 54.3679 and Allow space as decimal mark: 54 36.
If the Only integer values option is selected, separators between the whole number and the fraction will be ignored. These include decimal points, commas, hyphens, equal signs and spaces.
Note: The Allow space as decimal mark: 54 36 option will work only if the One word option on the Recognition tab is disabled.
- Currencies from Document Definition properties
Contains a predefined list of major currency codes and symbols. To modify this list, click the Edit... button next to the Currencies from Document Definition properties option. In the Currency Symbols dialog box that opens, you can add or remove currency codes or symbols. For more information about extracting amounts of money, see Document Definition properties.
- Content settings: Special
Use this option to select a data type from a list of predefined data types. The descriptions in the bottom part of the dialog will help you make the right choice. You can also create a custom data type if none of the predefined ones suit your needs. For details, see Creating custom data types.
- Default value
The value automatically assigned to the text field if it turns out to be empty after the Document Definition is applied or the document is assembled. See above for details.
- AutoCorrect options
Options from this group determine the way the program automatically corrects the values of fields. Such corrections can include the removal of extra spaces, changing lower-case characters to the upper case and vice versa, and replacing specific characters or fragments of text. This group of options is displayed when you click the Edit... button to the right of the AutoCorrect options field. For details, see Processing entered values.
- Validation
Checks whether a value falls within a specified range. To specify this range, click the Edit... button to the right of the Validation field. For details, see Value range constraints for numbers.
The Cannot be blank option determines whether this field is required.
- Use normalization
Use options in this group to specify a custom format for values of the Amount of Money type. Click the Edit Normalization Settings... button to open the Currency Normalization Settings dialog that contains these options. The top part of this dialog contains an example of how a positive number and a negative number will look after the settings are applied. These settings are described below:
- Minimum number of digits in integer part
Possible values range from 0 to 9.
- Digit grouping
Separates groups of three digits in numbers with spaces.
- Digit grouping symbol
The symbol that will be used to separate groups of digits. Possible values are space, comma, and period.
- Decimal symbol
The symbol used to separate the integer part of a decimal number from the fractional part. Possible values are a period or a comma.
- Minimum number of digits after decimal marker
The minimum number of digits in the fractional part of a decimal number. Possible values: 0 to 4.
- Maximum number of digits after decimal marker
The maximum number of digits in the fractional part of a decimal number. Possible values: 0 to 4.
If this parameter is set to 0, the decimal fraction will be rounded to the nearest integer.
- Use brackets for negative values
Enable this option if you want negative numbers to be enclosed in parentheses instead of preceded by a minus sign.
- Currency format
Select the desired currency format from this drop-down list.
Note: Normalized numbers have to be available for fields. If settings from the Use normalization group contradict settings from the Details group, an error will occur when the template is checked.
Important! Number data types shouldn't be used for fields that only contain digits but are more than just a number, such as card numbers and passport numbers. Using number data types for these fields can cause problems like the removal of zeroes from the beginning of values that start with zeroes. In such cases we recommend either creating a custom data type with an alphabet that includes all digits (from 0 to 9) or using a regular expression. If you still want to use a number data type, you should enable the Process value as text option.
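The notations for negative numbers and fractions listed under Content settings: General can be sketched in Python as follows. This only illustrates how such notations might be interpreted and is not how ABBYY FlexiCapture implements them. The function `parse_amount` and its flags are hypothetical, a minus sign on the right is assumed to look like `54-`, and digit grouping symbols are assumed to be absent.

```python
from decimal import Decimal


def parse_amount(text, *, trailing_minus=False, parentheses_negative=False,
                 space_as_decimal=False):
    """Interpret a recognized amount as a Decimal (illustrative only)."""
    s = text.strip()
    negative = False
    if parentheses_negative and s.startswith("(") and s.endswith(")"):
        # (54) is treated as -54.
        negative, s = True, s[1:-1].strip()
    elif trailing_minus and s.endswith("-"):
        # Minus sign written to the right of the number.
        negative, s = True, s[:-1].strip()
    elif s.startswith("-"):
        negative, s = True, s[1:].strip()
    if space_as_decimal and " " in s:
        # The last space separates the integer part from the fraction.
        whole, _, frac = s.rpartition(" ")
        s = f"{whole.replace(' ', '')}.{frac}"
    else:
        # Assumes a comma, if present, is the decimal mark.
        s = s.replace(",", ".")
    value = Decimal(s)
    return -value if negative else value


print(parse_amount("(54)", parentheses_negative=True))
print(parse_amount("54 36", space_as_decimal=True))
```

Returning a `Decimal` rather than a `float` avoids binary rounding surprises when the value is later formatted as money.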
Code
Text fields of the Code (codes include letters and digits) type have the following settings:
- Details
Use the settings in this group to change the recognition language, specify a custom dictionary, and select more specific data types such as a social security number. To change these options, click the Edit... button to the right of the field with a description of the selected field's parameters.
- Language
The recognition language for the field.
- Content settings: General
In this case we recommend connecting a custom dictionary in order to improve recognition quality. To do this, enable the Use custom dictionary option and click Edit... to edit the dictionary. You can create a custom dictionary by adding words to it using the Add... button or by importing words from a TXT or DIC file. To import words from a file, click the Load... button and specify the path to the file.
- Content settings: Special
Use this option to select a data type from a list of predefined data types. The descriptions in the bottom part of the dialog will help you make the right choice. You can also create a custom data type if none of the predefined ones suit your needs. For details, see Creating custom data types.
- Default value
The value automatically assigned to the text field if it turns out to be empty after the Document Definition is applied or the document is assembled. See above for details.
- AutoCorrect options
Options from this group determine the way the program automatically corrects the values of fields. Such corrections can include the removal of extra spaces, changing lower-case characters to the upper case and vice versa, and replacing specific characters or fragments of text. This group of options is displayed when you click the Edit... button to the right of the AutoCorrect options field. For details, see Processing entered values.
- Validation
Options in this group can be used to specify constraints on a field's contents, including the number of characters. For details, see Value range constraints for text data type.
The Cannot be blank option determines whether this field is required.
Date
Text fields of the Date type have the following settings:
- Process value as text
Enable this option if you want the value of a field to be processed as regular text. This can be useful when a date cannot be brought to the standard format, or when you do not need to process it as a date. A custom dictionary can be used to check the value, and value range constraints can be specified in the same way as for text. For details, see Value range constraints for text data type.
- Details
This group of options can be used to change the recognition language, change the order of different parts of the date, add the time or day of the week, and select a more specific data type, such as a month in Roman numerals. To change these options, click the Edit... button to the right of the field with a summary of the selected field's properties.
- Language
The recognition language for the field.
- Content settings: General
Settings for standard date types include: May include month in words (permits the month to be written in letters in addition to digits), May include day of week (permits the date field to include the day of the week), and May include time (permits the time to be included in the date field).
- Content settings: Special
Use this option to select a data type from a list of predefined data types. The descriptions in the bottom part of the dialog will help you make the right choice. You can also create a custom data type if none of the predefined ones suit your needs. For details, see Creating custom data types.
- Order of date components
Select one or more possible date types. If several date types are selected and the Use first acceptable date format option is not enabled, the suitable date format will be selected by the program.
- Default value
The value automatically assigned to the text field if it turns out to be empty after the Document Definition is applied or the document is assembled. See above for details.
- AutoCorrect options
Options from this group determine the way the program automatically corrects the values of fields. Such corrections can include the removal of extra spaces, changing lower-case characters to the upper case and vice versa, and replacing specific characters or fragments of text. This group of options is displayed when you click the Edit... button to the right of the AutoCorrect options field. For details, see Processing entered values.
- Validation
Checks if the date falls within a specified range. To specify this range, click the Edit... button to the right of the Validation field. For details, see Values range constraints for date.
The Cannot be blank option determines whether this field is required.
- Use normalization
Options in this group can be used to specify a custom output format of values of the Date type. To change these settings, click the Edit Normalization Settings... button. The top part of the Date Normalization Settings dialog contains examples of entries as they will appear after all settings are applied. These settings are described below:
- Date format
The output format for dates.
- Date delimiter
The character that will be used to separate different parts of the date.
- Has day of week
Indicates that the day of the week is the first part of the date. If no day of the week can be found during recognition, the program will determine the day of the week based on the date.
- Has time
Indicates that the date is followed by the time. If no time can be found during recognition, the value 00:00 will be added to the date. The format of the time is determined by the two following options.
- Time format
The output format of time values.
- Time delimiter
The character used to separate parts of the time value.
Important! Normalized numbers have to be available for fields. If settings from the Use normalization group contradict settings from the Details group, an error will occur when the template is checked.
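The effect of the date normalization settings can be sketched in Python. The sketch below is only an approximation under stated assumptions: the function `normalize_date`, its parameter names and the English weekday names are hypothetical and are not part of ABBYY FlexiCapture. It shows the two fallbacks described above: the day of the week is derived from the date, and 00:00 is used when no time was recognized.

```python
from datetime import date, time

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def normalize_date(d, t=None, *, order="DMY", date_delimiter=".",
                   has_day_of_week=False, has_time=False, time_delimiter=":"):
    """Format a recognized date according to normalization-like settings."""
    parts = {"D": f"{d.day:02d}", "M": f"{d.month:02d}", "Y": f"{d.year:04d}"}
    out = date_delimiter.join(parts[c] for c in order)
    if has_day_of_week:
        # The day of the week comes first and is computed from the date.
        out = f"{WEEKDAYS[d.weekday()]} {out}"
    if has_time:
        # If no time was recognized, 00:00 is appended.
        if t is None:
            t = time(0, 0)
        out = f"{out} {t.hour:02d}{time_delimiter}{t.minute:02d}"
    return out


print(normalize_date(date(2023, 5, 25), has_day_of_week=True, has_time=True))
```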
Important! The program includes an intelligent data detection mechanism.
For example, if the program encounters an ambiguous date (say 01/02/02), it will offer a list of possible values based on the settings specified.
If the date contains a day of the week that does not match the date, a formatting error will be issued during verification.
Values with spelling or recognition errors will be automatically replaced with the correct ones (e.g. "Wensday" will be changed to "Wednesday" and "Septembere" will be changed to "September").
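To see why a date such as 01/02/02 is ambiguous, consider the following Python sketch. It enumerates the readings that are valid under several component orders and checks whether a weekday name agrees with a date. The functions `candidate_dates` and `weekday_matches` are hypothetical, and mapping two-digit years to the 2000s is an assumption; the program's own detection mechanism is not documented here.

```python
import re
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def candidate_dates(text, orders=("DMY", "MDY", "YMD"), century=2000):
    """List every valid reading of a three-part numeric date."""
    parts = [int(p) for p in re.split(r"[./-]", text.strip())]
    if len(parts) != 3:
        raise ValueError("expected three date components")
    candidates = []
    for order in orders:
        fields = dict(zip(order, parts))
        # Assumption: two-digit years belong to the given century.
        year = fields["Y"] + century if fields["Y"] < 100 else fields["Y"]
        try:
            d = date(year, fields["M"], fields["D"])
        except ValueError:
            continue  # this order does not produce a real calendar date
        if d not in candidates:
            candidates.append(d)
    return candidates


def weekday_matches(name, d):
    """Return True if the weekday name agrees with the date."""
    return WEEKDAYS[d.weekday()].lower() == name.strip().lower()


print(candidate_dates("01/02/02"))
print(weekday_matches("Friday", date(2023, 5, 25)))
```

With all three orders allowed, 01/02/02 has three valid readings, which is why restricting the permitted orders in the field's settings reduces the number of values offered.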
Name
Text fields of the Name type have the following settings:
- Details
Use the settings in this group to change the recognition language, specify a custom dictionary, and select more specific data types such as a surname. To change these options, click the Edit... button to the right of the field with a description of the selected field's parameters.
- Language
The recognition language for the field.
- Content settings: General
In this case we recommend connecting a custom dictionary in order to improve recognition quality. To do this, enable the Use custom dictionary option and click Edit... to edit the dictionary. You can create a custom dictionary by adding words to it using the Add... button or by importing words from a TXT or DIC file. To import words from a file, click the Load... button and specify the path to the file.
- Content settings: Special
Use this option to select a data type from a list of predefined data types. The descriptions in the bottom part of the dialog will help you make the right choice. You can also create a custom data type if none of the predefined ones suit your needs. For details, see Creating custom data types.
- Default value
The value automatically assigned to the text field if it turns out to be empty after the Document Definition is applied or the document is assembled. See above for details.
- AutoCorrect options
Options from this group determine the way the program automatically corrects the values of fields. Such corrections can include the removal of extra spaces, changing lower-case characters to the upper case and vice versa, and replacing specific characters or fragments of text. This group of options is displayed when you click the Edit... button to the right of the AutoCorrect options field. For details, see Processing entered values.
- Validation
Options in this group can be used to specify constraints on a field's contents, including the number of characters. For details, see Value range constraints for text data type.
The Cannot be blank option determines whether this field is required.
Number
Text fields of the Number type have the following settings:
- Process value as text
Enable this option if you want the value of a field to be processed as regular text. This can be useful when a number cannot be brought to the standard format, or when you do not need to process it as a number. A custom dictionary can be used to check the value, and value range constraints can be specified in the same way as for text. For details, see Value range constraints for text data type.
- Details
Use the settings in this group to change the recognition language, specify additional parameters for whole numbers and fractions, and select more specific data types such as two-digit numbers and Roman numerals. To change these options, click the Edit... button to the right of the field with a description of the selected field's parameters.
- Language
The field's recognition language.
- Content settings: General
This group contains additional settings for whole numbers and fractions.
For negative numbers, you can allow the minus sign to be on the right of the number (example: 54-), and allow negative numbers to be written as numbers in parentheses, making (54) equivalent to -54.
The following options are available for fractions: Must have fractional part, Allow more than two digits in fractional part and Allow space as decimal marker.
If the Only integer values option is selected, separators between the whole number and the fraction will be ignored. These include decimal points, commas, hyphens, equal signs and spaces.
Note: The Allow space as decimal marker option will work only if the One word option on the Recognition tab is disabled.
- Content settings: Special
Use this option to select a data type from a list of predefined data types. The descriptions in the bottom part of the dialog will help you make the right choice. You can also create a custom data type if none of the predefined ones suit your needs. For details, see Creating custom data types.
- Default value
The value automatically assigned to the text field if it turns out to be empty after the Document Definition is applied or the document is assembled. See above for details.
- AutoCorrect options
Options from this group determine the way the program automatically corrects the values of fields. Such corrections can include the removal of extra spaces, changing lower-case characters to the upper case and vice versa, and replacing specific characters or fragments of text. This group of options is displayed when you click the Edit... button to the right of the AutoCorrect options field. For details, see Processing entered values.
- Validation
Checks whether a value falls within a specified range. To specify this range, click the Edit... button to the right of the Validation field. For details, see Value range constraints for numbers.
The Cannot be blank option determines whether this field is required.
- Use normalization
Use options in this group to specify a custom format for values of the Number type. Click the Edit Normalization Settings... button to open the Number Normalization Settings dialog that contains these options. The top part of this dialog contains an example of how a positive number and a negative number will look after the settings are applied. These settings are described below:
- Minimum number of digits in integer part
Possible values range from 0 to 9.
- Digit grouping
Separates groups of three digits in numbers with spaces.
- Digit grouping symbol
The symbol that will be used to separate groups of digits. Possible values are space, comma, and period.
- Decimal symbol
The symbol used to separate the integer part of a decimal number from the fractional part. Possible values are a period or a comma.
- Minimum number of digits after decimal marker
The minimum number of digits in the fractional part of a decimal number. Possible values: 0 to 4.
- Maximum number of digits after decimal marker
The maximum number of digits in the fractional part of a decimal number. Possible values: 0 to 4.
If this parameter is set to 0, the decimal fraction will be rounded to the nearest integer.
- Use brackets for negative values
Enable this option if you want negative numbers to be enclosed in parentheses instead of preceded by a minus sign.
Important! Normalized numbers have to be available for fields. If settings from the Use normalization group contradict settings from the Details group, an error will occur when the template is checked.
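The number normalization settings described above can be approximated with the following Python sketch. It is not ABBYY FlexiCapture code: the function `normalize_number` and its parameter names are hypothetical, and the rounding mode (half up) is an assumption, since the text only says that values are rounded.

```python
from decimal import Decimal, ROUND_HALF_UP


def normalize_number(value, *, min_int_digits=1, grouping=False,
                     group_symbol=" ", decimal_symbol=".",
                     min_frac=0, max_frac=2, brackets_for_negative=False):
    """Format a number according to normalization-like settings."""
    value = Decimal(value)
    # Round to at most max_frac digits after the decimal marker.
    quantum = Decimal(1).scaleb(-max_frac)
    rounded = abs(value).quantize(quantum, rounding=ROUND_HALF_UP)
    negative = value < 0 and rounded != 0
    int_part, _, frac_part = f"{rounded:f}".partition(".")
    # Drop trailing zeros, then pad back up to the minimum fraction length.
    frac_part = frac_part.rstrip("0").ljust(min_frac, "0")
    int_part = int_part.zfill(min_int_digits)
    if grouping:
        groups = []
        while len(int_part) > 3:
            groups.insert(0, int_part[-3:])
            int_part = int_part[:-3]
        groups.insert(0, int_part)
        int_part = group_symbol.join(groups)
    out = int_part + (decimal_symbol + frac_part if frac_part else "")
    if negative:
        out = f"({out})" if brackets_for_negative else f"-{out}"
    return out


print(normalize_number("-1234567.5", grouping=True, group_symbol=",",
                       max_frac=0, brackets_for_negative=True))
print(normalize_number("54.3679", decimal_symbol=",", min_frac=2, max_frac=2))
```

Setting `max_frac=0` reproduces the rounding to the nearest integer mentioned for the maximum number of digits after the decimal marker.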
Important! Number data types shouldn't be used for fields that only contain digits but are more than just a number, such as card numbers and passport numbers. Using number data types for these fields can cause problems like the removal of zeroes from the beginning of values that start with zeroes. In such cases we recommend either creating a custom data type with an alphabet that includes all digits (from 0 to 9) or using a regular expression. If you still want to use a number data type, you should enable the Process value as text option.
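The leading-zero problem is easy to reproduce outside the program. The Python sketch below shows how treating a digit string as a number discards its leading zeroes, and how a digits-only pattern keeps the value intact. The pattern uses Python's regular expression syntax, which may differ from the syntax used for custom data types in ABBYY FlexiCapture, and the function `is_digit_code` is hypothetical.

```python
import re


def is_digit_code(value, length):
    """Check that a value consists of exactly `length` digits (0 to 9)."""
    return re.fullmatch(rf"[0-9]{{{length}}}", value) is not None


raw = "0004512"             # e.g. part of a card or passport number
as_number = str(int(raw))   # numeric processing drops the leading zeroes

print(as_number)
print(is_digit_code(raw, 7), is_digit_code(as_number, 7))
```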
Text
Fields of the Text type have the following settings:
- Details
Use the settings in this group to change the recognition language, specify a custom dictionary, and select more specific data types such as a surname. To change these options, click the Edit... button to the right of the field with a description of the selected field's parameters.
- Language
The recognition language for the field.
- Content settings: General
In this case we recommend connecting a built-in or custom dictionary. The value of the field will be checked against this dictionary. To use a built-in dictionary, select the Use built-in dictionary option. To use a custom dictionary, enable the Use custom dictionary option and click Edit... to edit the dictionary. You can create a custom dictionary by adding words to it using the Add... button or by importing words from a TXT or DIC file. To import words from a file, click the Load... button and specify the path to the file.
Important! DIC files are basically text files containing a list of words, with one word per line. This file format originated in Microsoft Office.
- Content settings: Special
Use this option to select a data type from a list of predefined data types. The descriptions in the bottom part of the dialog will help you make the right choice. You can also create a custom data type if none of the predefined ones suit your needs. For details, see Creating custom data types.
- Default value
The value automatically assigned to the text field if it turns out to be empty after the Document Definition is applied or the document is assembled. See above for details.
- AutoCorrect options
Options from this group determine the way the program automatically corrects the values of fields. Such corrections can include the removal of extra spaces, changing lower-case characters to the upper case and vice versa, and replacing specific characters or fragments of text. This group of options is displayed when you click the Edit... button to the right of the AutoCorrect options field. For details, see Processing entered values.
- Validation
Options in this group can be used to specify constraints on a field's contents, including the number of characters. For details, see Value range constraints for text data type.
The Cannot be blank option determines whether this field is required.
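Because a DIC or TXT word list holds one word per line, it is simple to prepare or inspect such a file with a script before loading it. The Python sketch below reads a word list and checks a field value against it. It is an illustration only: the functions `load_word_list` and `in_dictionary` are hypothetical, UTF-8 encoding and case-insensitive matching are assumptions, and the program's own dictionary check may behave differently.

```python
from pathlib import Path


def load_word_list(path, encoding="utf-8"):
    """Read a one-word-per-line file into a set, skipping blank lines."""
    words = set()
    for line in Path(path).read_text(encoding=encoding).splitlines():
        word = line.strip()
        if word:
            # Case-insensitive matching is an assumption of this sketch.
            words.add(word.casefold())
    return words


def in_dictionary(value, words):
    """Return True if every space-separated token of the value is listed."""
    tokens = value.split()
    return bool(tokens) and all(t.casefold() in words for t in tokens)
```

For example, `in_dictionary("Smith Jones", load_word_list("names.dic"))` would return `True` only if both words appear in the hypothetical file `names.dic`.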
Time
The following settings are available for fields of the Time type:
- Process value as text
Enable this option if you want the value of a field to be processed as regular text. This can be useful when a time value cannot be brought to the standard format, or when you do not need to process it as a time. A custom dictionary can be used to check the value, and value range constraints can be specified in the same way as for text. For details, see Value range constraints for text data type.
- Details
Use the settings in this group to change the recognition language, change the formatting of times and select more specific data types such as time in the 12-hour format. To change these options, click the Edit... button to the right of the field with a description of the selected field's parameters.
- Language
The recognition language for the field.
- Content settings: General
Used for standard time notations.
- Content settings: Special
Use this option to select a data type from a list of predefined data types. The descriptions in the bottom part of the dialog will help you make the right choice. You can also create a custom data type if none of the predefined ones suit your needs. For details, see Creating custom data types.
- Default value
The value automatically assigned to the text field if it turns out to be empty after the Document Definition is applied or the document is assembled. See above for details.
- AutoCorrect options
Options from this group determine the way the program automatically corrects the values of fields. Such corrections can include the removal of extra spaces, changing lower-case characters to the upper case and vice versa, and replacing specific characters or fragments of text. This group of options is displayed when you click the Edit... button to the right of the AutoCorrect options field. For details, see Processing entered values.
- Validation
The Cannot be blank option in this group determines whether the field is required.
- Use normalization
Use options in this group to specify a custom format for values of the Time type. Click the Edit Normalization Settings... button to open the Time Normalization Settings dialog and specify the format of times and the symbol used to separate parts of times (e.g. separate minutes from hours with a colon).
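A time format combined with a delimiter can be pictured with the following Python sketch. The function `normalize_time` and its parameters are hypothetical, and the set of formats actually offered in the Time Normalization Settings dialog is not listed here, so the 12-hour and 24-hour variants below are assumptions.

```python
from datetime import time


def normalize_time(t, *, twelve_hour=False, delimiter=":", with_seconds=False):
    """Format a time value with a chosen notation and delimiter."""
    hour = t.hour
    suffix = ""
    if twelve_hour:
        suffix = " AM" if hour < 12 else " PM"
        hour = hour % 12 or 12  # 0 and 12 are both shown as 12
    parts = [f"{hour:02d}", f"{t.minute:02d}"]
    if with_seconds:
        parts.append(f"{t.second:02d}")
    return delimiter.join(parts) + suffix


print(normalize_time(time(7, 55, 2)))
print(normalize_time(time(19, 5), twelve_hour=True, delimiter="."))
```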
25.05.2023 7:55:02