Keywords
Keywords for the Vendor / Business Unit and Invoice Header Data groups of fields
ABBYY FlexiCapture for Invoices uses keywords to search for fields on an image: if a keyword is detected, the program will look for a field in its vicinity. Common examples of keywords include field titles and captions.
For example, let us say an invoice contains the following text: Invoice Date: <value>. In this case "Invoice Date" will be the keyword used to find the InvoiceDate field.
Keywords can be specified using Located elements. Each Located element can only be attributed to one field in the Document Definition, but each field can have more than one Located element, each of which describes a separate model of the relationship between the keyword and the field on the invoice.
- Keywords are specific to languages and countries. When a Document Definition is applied, the program takes keywords from the vendor's country, the business unit's country, and from the languages of these countries.
- Normalization is applied to keywords: tabs, spaces, line break characters, and diacritical marks are ignored, and letter case is ignored (the program makes no distinction between lower-case and upper-case letters). For example, for keyword purposes Tax point is the same as taxpoint.
- The program allows for 2 or 3 recognition errors in keywords (depending on the element).
- Sometimes a keyword appears in more than one language, for example Quantity/Anzahl. We recommend adding the whole keyword (Quantity/Anzahl) to both languages.
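The normalization and error tolerance described in the list above can be illustrated with a minimal Python sketch. It is not ABBYY's implementation: the function names `normalize_keyword`, `edit_distance`, and `keyword_matches` are hypothetical, and counting "recognition errors" as Levenshtein edits is an assumption made only for illustration.

```python
import unicodedata


def normalize_keyword(text: str) -> str:
    """Drop whitespace and diacritical marks, and ignore letter case."""
    # NFKD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    kept = [
        ch for ch in decomposed
        if not unicodedata.combining(ch) and not ch.isspace()
    ]
    return "".join(kept).casefold()


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute or keep
            ))
        previous = current
    return previous[-1]


def keyword_matches(keyword: str, recognized: str, max_errors: int = 2) -> bool:
    """Assumption: one recognition error equals one edit after normalization."""
    distance = edit_distance(normalize_keyword(keyword),
                             normalize_keyword(recognized))
    return distance <= max_errors


print(normalize_keyword("Tax point") == normalize_keyword("taxpoint"))  # True
print(keyword_matches("Invoice Date", "lnvoice Dale"))  # two misread letters: True
```

In this sketch the tolerance is a parameter because the text above says it is 2 or 3 depending on the element.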
Field group | Located Elements | Description | Specified in the properties of |
---|---|---|---|
Vendor / Business Unit | Bank Account | Keywords for the Bank Account field | Country (the Keywords tab) |
Vendor / Business Unit | Bank Code | Keywords for the Bank Code field | Country (the Keywords tab) |
Vendor / Business Unit | IBAN | Keywords for the IBAN field | Country (the Keywords tab) |
Vendor / Business Unit | National VATID | Keywords for the National VATID field | Country (the Keywords tab) |
Vendor / Business Unit | Total | Keywords for the Total field | Country (the Keywords tab) |
Vendor / Business Unit | Total tax | Keywords for the Total tax field | Country (the Keywords tab) |
Vendor / Business Unit | VATID | Keywords for the VATID field | Country (the Keywords tab) |
Invoice Header Data | PurchaserNameLabels | Possible names of the PurchaserName field in the language. | Language |
Invoice Header Data | PurchaserNameFalseFieldPrefixes | Possible names of fields that may be incorrectly identified as the PurchaserName field. We recommend specifying these only if you see specific cases. | Language |
Invoice Header Data | CreditNoteKeywords | Words in the language that may be used to identify the document as a credit note. | Language |
Invoice Header Data | InvoiceIdentifiers | Words which indicate that the document is an invoice. | Language |
Invoice Header Data | OrderNumberLabels | Possible names of the OrderNumber field in the language. | Language |
Invoice Header Data | InvoiceNumberExcludePreffixes | Words that are placed before numbers and may be omitted, such as "No." | Language |
Invoice Header Data | InvoiceNumberExcludeSuffixes | Words that are placed after numbers and may be omitted. We recommend specifying these only if you see specific examples. | Language |
Invoice Header Data | InvoiceNumberWithDateLabels | Text that may precede an invoice's number if the number is written on the same line as the date and the two are separated by a slash or other character. Examples: "Invoice Number / date 23061336 / 07.07.2013", "Rechnungsnr./ -datum 23061336 / 07.07.2013" | Language |
Invoice Header Data | InvoiceNumberHighConfidenceLabels, InvoiceNumberLowConfidenceLabels | The InvoiceNumberHighConfidenceLabels list contains fragments of text that unambiguously identify the field, such as "Invoice number" and "Document number". More ambiguous fragments of text such as "No." and "Bill" are listed in InvoiceNumberLowConfidenceLabels. | Language |
Invoice Header Data | DueDateLabels | Possible names of the DueDate field in the language. | Language |
Invoice Header Data | DeliveryDateLabels | Possible names of the DeliveryDate field in the language. | Language |
Invoice Header Data | InvoiceDateLabelsNearCity | Any fragment of text that may be used to separate the name of a city and the date if they are on the same line, such as a comma. | Language |
Invoice Header Data | InvoiceDateLabelsNearInvoiceNumber | Any fragment of text that may be used to separate the number of the invoice and the date if they are on the same line. | Language |
Invoice Header Data | InvoiceDateHighConfidenceLabels, InvoiceDateLowConfidenceLabels | The InvoiceDateHighConfidenceLabels list contains fragments of text that unambiguously identify the field, such as "Invoice date" and "Document date". More ambiguous fragments of text such as "Tax Date" and "Tax Point" are listed in InvoiceDateLowConfidenceLabels. | Language |
Keywords for the Amounts group of fields
ABBYY FlexiCapture is capable of determining when a word is part of a word combination. For example, if you add the keywords total and total netto, and the image contains total netto, it will be identified as total netto and not total.
- The program allows for up to 3 recognition errors in keywords.
- This tolerance can cause similar keywords to be confused. For example, the program may mix up the words brutto and netto. To avoid such errors, whenever you add a value (such as total netto) to AmountTotalNettoLabels, also add the corresponding value (such as total brutto) to AmountTotalLabels.
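The following Python sketch illustrates both points: preferring the longer word combination, and why total brutto can be mistaken for total netto when up to 3 errors are allowed. The helper names (`find_keyword`, `closest_list`) and the rule "the keyword with the fewest errors wins" are assumptions for illustration only; the program's actual matching logic is not described here.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def find_keyword(text, keywords):
    """Among the keywords present in the text, prefer the longest one."""
    lowered = text.casefold()
    hits = [kw for kw in keywords if kw.casefold() in lowered]
    return max(hits, key=len, default=None)


def closest_list(caption, keyword_lists, max_errors=3):
    """Return the list whose keyword is closest to the caption (assumed rule)."""
    best = None
    for name, keywords in keyword_lists.items():
        for kw in keywords:
            distance = edit_distance(caption.casefold(), kw.casefold())
            if distance <= max_errors and (best is None or distance < best[0]):
                best = (distance, name)
    return best[1] if best else None


only_netto = {"AmountTotalNettoLabels": ["total netto"]}
both = {
    "AmountTotalNettoLabels": ["total netto"],
    "AmountTotalLabels": ["total brutto"],
}

print(find_keyword("Total netto 1 250,00", ["total", "total netto"]))  # total netto
print(edit_distance("total brutto", "total netto"))  # 3, within the tolerance
print(closest_list("total brutto", only_netto))  # AmountTotalNettoLabels (a mix-up)
print(closest_list("total brutto", both))        # AmountTotalLabels
```

With only the netto keyword defined, the brutto caption falls within three edits of it and is misattributed; once the brutto keyword is also defined, the exact match takes precedence in this sketch.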
Field group | Located Elements | Description | Specified in the properties of |
---|---|---|---|
Amounts | AmountTotalHighConfidenceLabels, AmountTotalLowConfidenceLabels | The captions of the Total field. It is advisable to place captions that occur only with the total amount on the invoice in the HighConfidence group. Captions that may occur with other amounts should be placed in the LowConfidence group. | Language |
Amounts | AmountTotalNettoLabels | The keyword that may be to the left of or above the Total Netto field | Language |
Amounts | AmountTotalTaxLabels | The keyword that may be to the left of or above the Total Tax field | Language, Country (the Keywords tab) |
Amounts | ReversedChargeKeywords | Words that can be used to indicate "Reversed Charge" in the language | Language |
Amounts | Tax Rates | Keywords for tax rates | Country (the Tax Rates tab) |
Amounts | Currency | Keywords or characters that indicate currencies. | Country (the Currency tab) |
Note: It is usually a bad idea to add the same word to Total Netto and Total.
Keywords for the Line Items group of fields
Keywords are used to find the titles of tables that contain invoice fields and to find specific columns in those tables. Words that frequently occur in the titles of the columns that correspond to Located Elements work best for this purpose.
If the same word can be found in titles of different columns, we recommend adding it to the Located Elements of these columns. In this case the program will be able to tell the columns apart by the contents of cells in these columns.
If the program still fails to tell the columns apart, or if the word is used frequently in one column and infrequently in other columns, leave the word in the Located Element of the column where it is used the most frequently and remove it from the Located Elements of the other columns.
- The program allows for up to 3 recognition errors in keywords.
- Keywords cannot take up more than one line.
- The contents of a column must be located directly beneath the title of the column.
- Keywords on images must be surrounded by spaces, commas or periods. So Quantity/Anzahl is one keyword, and Quantity / Anzahl is two.
If a keyword may be written in more than one way, add all of the different ways it can be written, for example Quantity/Anzahl, Quantity and Anzahl.
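The delimiter rule can be sketched in Python as follows. This is only an illustration of the rule as stated above: `split_caption` and `covered` are hypothetical helpers, and discarding pieces that contain no letters or digits (such as a stand-alone slash) is an assumption.

```python
import re


def split_caption(text):
    """Split on spaces, commas and periods; drop pieces with no letters or digits."""
    pieces = re.split(r"[\s,.]+", text)
    return [piece for piece in pieces if any(ch.isalnum() for ch in piece)]


def covered(header, keywords):
    """True if every piece of the column title is one of the listed keywords."""
    known = {kw.casefold() for kw in keywords}
    return all(piece.casefold() in known for piece in split_caption(header))


# All the ways the title may be written, as recommended above.
VARIANTS = {"Quantity/Anzahl", "Quantity", "Anzahl"}

print(split_caption("Quantity/Anzahl"))    # ['Quantity/Anzahl']
print(split_caption("Quantity / Anzahl"))  # ['Quantity', 'Anzahl']
print(covered("Quantity/Anzahl", VARIANTS))               # True
print(covered("Quantity/Anzahl", {"Quantity", "Anzahl"}))  # False
```

The last line shows why the combined form needs its own entry: without spaces around the slash, the title is a single keyword that neither Quantity nor Anzahl matches exactly.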
Field group | Located Elements | Description | Specified in the properties of |
---|---|---|---|
Line Items | LineItemsArticleNumberLabels | Keyword for the Article Number column | Language |
Line Items | LineItemsArticleNumberBULabels | Keyword for the Article Number BU column | Language |
Line Items | LineItemsCurrencyLabels | Keyword for the Currency column | Language |
Line Items | LineItemsDeliveryDateLabels | Keyword for the Delivery Date column | Language |
Line Items | LineItemsDescriptionLabels | Keyword for the Description column | Language |
Line Items | LineItemsDiscountAmountLabels | Keyword for the Discount Amount column | Language |
Line Items | LineItemsDiscountPercentageLabels | Keyword for the Discount Percentage column | Language |
Line Items | LineItemsMaterialNumberBULabels | Keyword for the Material Number BU column | Language |
Line Items | LineItemsMaterialNumberLabels | Keyword for the Material Number column | Language |
Line Items | LineItemsOrderDateLabels | Keyword for the Order Date column | Language |
Line Items | LineItemsOrderNumberLabels | Keyword for the Order Number column | Language |
Line Items | LineItemsPositionLabels | Keyword for the Position column | Language |
Line Items | LineItemsQuantityLabels | Keyword for the Quantity column | Language |
Line Items | LineItemsQuantityOrderedLabels | Keyword for the Quantity Ordered column | Language |
Line Items | LineItemsQuantityUndeliveredLabels | Keyword for the Quantity Undelivered column | Language |
Line Items | LineItemsSubtotalVariants | Reserved | Language |
Line Items | LineItemsTotalPriceBruttoLabels | Keyword for the Total Price Brutto column | Language |
Line Items | LineItemsTotalPriceNettoLabels | Keyword for the Total Price Netto column | Language |
Line Items | LineItemsUnitsOfMeasureLabels | Keyword for the Units of Measure column | Language |
Line Items | LineItemsUnitsOfMeasureVariants | Possible field values in the Unit Of Measure column | Language |
Line Items | LineItemsUnitPriceDenominatorLabels | Captions in the column header which is used as a multiplier for the Unit Price column when computing the equality for an invoice item: Unit Price * Denominator * Quantity = Total Netto. For instance, if UnitPrice is price per unit and Quantity indicates the number of packs, the Boolean column UnitPriceDenominator will correspond to a column that indicates the number of units in one pack. | Language |
Line Items | LineItemsUnitPriceLabels | Keyword for the UnitPrice column | Language |
Line Items | LineItemsVATAmountLabels | Keyword for the VAT Amount column | Language |
Line Items | LineItemsVATPercentageLabels | Keyword for the VAT Percentage column | Language |
Line Items | LineItemsVATCodeLabels | Keyword for the VAT Code column | Language |
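The equality given for LineItemsUnitPriceDenominatorLabels (Unit Price * Denominator * Quantity = Total Netto) can be checked with a short Python sketch. The function name, the rounding tolerance, and the sample amounts are all hypothetical and serve only to make the arithmetic concrete.

```python
from decimal import Decimal


def line_item_is_consistent(unit_price, denominator, quantity, total_netto,
                            tolerance=Decimal("0.01")):
    """Check Unit Price * Denominator * Quantity = Total Netto.

    The tolerance for rounding differences is an assumption of this sketch.
    """
    expected = unit_price * denominator * quantity
    return abs(expected - total_netto) <= tolerance


# Hypothetical line item: 0.50 per unit, 12 units per pack, 10 packs.
print(line_item_is_consistent(
    unit_price=Decimal("0.50"),
    denominator=Decimal("12"),
    quantity=Decimal("10"),
    total_netto=Decimal("60.00"),
))  # True, because 0.50 * 12 * 10 = 60.00
```

Without the denominator, the same item would yield 0.50 * 10 = 5.00, which does not match the total of 60.00.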
Note: The quality of keyword detection depends on the quality of full-text recognition.
You can change the pre-recognition mode on the FlexiLayout tab of the Document Definition Properties dialog box. The higher the quality of pre-recognition, the more time it will take.
18.06.2023 17:47:23