Using vendor and business unit databases
When processing an invoice, ABBYY FlexiCapture will first look for the vendor (i.e. the company that issued the invoice) and the business unit (i.e. the company or department that received the invoice).
The vendor in most cases determines the language of the invoice and the formatting of its numbers, dates, and amounts. The business unit determines the tax rates and applicable regulatory requirements. ABBYY FlexiCapture will use all of this information when capturing invoice data and applying validation rules.
To help the program locate the vendor and business unit fields, you can use vendor and business unit databases. These must be connected to the data sets in your Document Definition. (A data set is a local copy of a table from an external database that stores information about vendors or business units. The values in a data set can be typified and normalized. A data set may be periodically updated to synchronize it with its external database. See Preparing vendor and business unit databases for more information.)
Vendor and business unit databases can also be used for automatic and manual data checks.
You do not have to use vendor and business unit databases, but note the following:
- If vendor and business unit databases are used, ABBYY FlexiCapture can more reliably detect companies by looking for additional company information on invoice images. It can also make sure that a detected vendor is a "known" vendor. Additionally, if a vendor database is used, you will be able to train fields.
- If no vendor or business unit databases are used, ABBYY FlexiCapture will use neural networks to look for vendor and business unit names and addresses on invoice images (see Detecting the main fields for details). Note also that you will not be able to train fields, as ABBYY FlexiCapture cannot reliably identify vendors without a vendor database.
- If you have no vendor databases but still want to use field training, you can accumulate company information by adding records to your data sets while capturing invoices. For more information, see Looking up vendors and business units in the database.
Vendor and business unit databases are connected in the same way:
- Open the Project menu and click Document Definitions....
- Click the Edit... button in the dialog box that opens.
- Open the Document Definition menu and click Document Definition Properties....
- Open the Data Sets tab in the dialog box that opens.
- Double click the BusinessUnits (Vendors) row or click the Set Up... button. The Data Set dialog box will open.
A data set is essentially a table with data. It contains the necessary columns and their types. These columns and types cannot be changed by the user, who needs to match each column in the data set with a column from an external database. If a database does not contain a table with the necessary data, you will have to create a view with this data and specify this view as the table.
- Select Database from the Source drop-down list.
Note: Only select the Script source type if you plan to use a script to update the data set. For details, see Updating data sets.
- Specify the database connection string.
You can select one of the existing environment variables from the project or define a new string by clicking the Specify... button. Using an environmental variable from the project allows moving the project and connecting to databases with the same structure without editing the Document Definition.
- Select Data schema and Database table.
- Match the columns from the data set with their corresponding columns in the external database.
For each column in the data set, you can specify the corresponding column in the external database or specify the value.
The Match fields list contains:
- The names of columns in the data set
- The corresponding columns from the external database
By default, each field in the External Database column column of the data set contains the value. This means that no column from an external database has been specified.
- The Found column, which contains a checkmark if a match between the data set field and a database column has been established
- The Normalization column, which shows the type of normalization used for the value
Normalization makes it possible to treat several insubstantially different values as identical (See Normalization for details).
Select a column in the table and click the Edit... button.
- In the Data Set Column Mapping dialog box, select the matching database column from the Database columns list.
Invoices from the same vendor can contain different names of the vendor's company and different addresses. In cases when several values correspond to the same vendor (the same VATID), they must be placed in different columns of the table (or view) of the external database, such as Name1, Name2, Address1,..., AddressN.
You can then select Several columns... from the list when specifying matching columns and specify any number of columns from the database as matches for the data set column.
Note: The external database may contain, for example, only one value for the company name and street. However, these fields may have several common variants when they occur on invoices. To achieve the maximum quality of detection, the text in these fields printed on an invoice should be as close as possible to the corresponding text in the data set. See Complex columns in a data set.
To create more than one column, specify the number of permitted columns for a field in the Max variant count box. Operators will be able to add values to these columns later.
Note: If you specify a number that is less than the actual number of columns (the number of different ways the value of a field can be written), this number will be increased automatically when the data set is updated. If you specify a number greater than the actual number of columns, empty columns will be added.
If your database has no matching column for a data set column, please refer to the Vendors data set and BusinessUnits data set sections.
- Select an entry from the Normalization list that best fits the type of data in the column (See Normalization for details). If necessary, enable the Store normalized value option.
- Enable the Cannot be empty option if necessary.
Note: When enabled, the Cannot be empty option tells the program that a column has to contain values. If it turns out to contain no values, the record will be marked as invalid.
Important! If your Document Definition contains a Vendors data set that uses and external database as its source, and the value of the Vendor field in your Document Definition depends on the value of the Business Unit field, the Cannot be empty will be enabled for the BusinessUnitId column by default. If the Vendors data set uses another source, this option must be enabled manually.
- Save the changes you made in the Data Set Column Mapping dialog box.
- You can also use the Add... button to add custom columns to a data set and specify corresponding columns in an external database.
Note: You can use the Edit... and Delete buttons to manage complex columns. Predefined columns (indicated by the lock icon) cannot be deleted.
- Specify a database update period in the data set if automatic updates are required.
- To validate records using a script, enable the Check records using scripts option. For details, see Data set validation scripts.
- Save your changes.
The Data Sets tab
All data sets are listed on the Data Sets tab. This list includes some additional information:
- The name of the data set in the Name column
- The type of the selected source in the Source column
- The total number of records in the Record count column
Note: The total number of records will only be displayed if the Cache data option is enabled in the data set's properties.
- The status of the data set, i.e. whether it is enabled in the Connected column.
You can manage the data sets using the buttons to the right of the list:
- Set Up... opens the properties dialog box of the selected data set.
- Update updates the selected data set.
The Vendors and BusinessUnits data sets are stored within the project and must be synchronized with an external database, whose data may change over time. Synchronization occurs when the project is opened, but only if the time elapsed since the previous synchronization exceeds the specified update period. For more on updates, See Updating data sets.
- Check... checks the data set (data will be displayed in a separate window)
- Add... creates a new data set.
- Rename... renames the selected data set.
- Delete deletes the selected data set.
- View... opens a dialog box that lets you view and edit data from the selected data set.
- Clean Up deletes all records in the data set.
To view or edit a data set, select it on the Data Sets tab of the Document Definition properties dialog box and click the View... button.
All records belong to one of three record groups. You can select the record group you want to view from the Record group drop-down list.
- General contains only valid records.
- Deleted contains records that were deleted by a user. These records are not used during recognition and will not be displayed as results for a database search. If the data set is updated, these records will not be restored.
- Invalid contains invalid records.
Click the Set... button to sort and filter records by their values in one of the list's columns. The Reset button disables the filter.
Different operations are available for each of the record groups:
|Edit...*||Edits a record. Edited records remain in the list.||...||Edits a record. Edited records remain in the list.|
|Add...*||Adds a record to the list.||...||...|
|Delete||Moves the record to Deleted.||...||Moves a record to Deleted.|
|Restore||...||Restores the record (moves it to General).||Validates the record (moves it to General).|
* - adding and editing records is only possible if the Primary key option is enabled. Records in the PurchaseOrders and PurchaseOrderItems default data sets cannot be added or edited.
Note: The number of values you can specify for a field when adding or editing records in a data set cannot exceed the number you specified in the Max variant count box in the Data Set Column Mapping dialog box of the data set. You can use the and buttons in the Add Record and Edit Record dialog boxes to manage records.
Note: To apply sorting and filtering to newly added or edited records, set the filter again.
Database Check rules
The Invoice Document Definition contains rules that are used to check data in the Vendor and Business Unit groups of fields against the database. These are standard rules of the Database Check type with the names (Rejected) Vendor should be found in the database and (Rejected) Business Unit should be found in the database. See Rules to learn how to find rules in the list.
You can view the properties of these rules to find out which fields from the Vendor and Business Unit groups have to be exact matches of records in the data set (and, by extension, in the database), which ones do not have to match perfectly, and which fields are simply copied from the database. You can also specify a field to be used as a primary key for selecting a record from the data set.
You can read more about database check rules in See Database Check rule.
Database Check rules are applied automatically when the invoice is recognized, and after that each time an Operator changes fields in the group during verification.
Performing database lookups while processing invoices
If a vendor or a business unit has not been determined automatically, the Operator can select the appropriate record from the database (i.e. the data set).
Modifying vendor and business unit records and adding new ones
You can allow Operators to add new vendor and business unit records (if they are not found in the database) to data sets during verification and edit existing entries. This data will not be stored in external vendor and business unit databases, but it will be used during invoice processing and rule checks. For details, See Enabling additional program features for Operators.