
Sample 4. Step 5.5: Invoice Date field, grDate, InvoiceDate, and InvoiceDateAsString elements

Once you have examined the images, you will notice that:

  • On some of the documents, the Invoice Date field is located to the right of the field name and on others it is located beneath the name. Therefore, we will limit the search area to rectangles to the right and beneath the name.
  • We will use a Date element to search for the date. Additionally, we will specify the following condition: if the field name is not detected, do not search for the field data.
  • Dates are often recognized unreliably, for example because of scanning defects or invalid date formats. Therefore, we will add a Character String element as a fallback in case the Date element fails to find a date.

To specify the settings common to all the elements, we will create an element of type Group.

To create a grDate element of type Group:

  1. In the InvoiceHeader element, create an element of type Group and name it grDate.
  2. Click the Advanced tab and specify additional search constraints: limit the search area to an array of two rectangles, one to the right of the field name and one below the field name, each with some offset. In the FlexiLayout language, this constraint can be written as follows:
    RectArray DataRegion;
    Let r1 = Rect (kwInvoiceDate.Rect.Right, kwInvoiceDate.Rect.Top - 20dt, kwInvoiceDate.Rect.Right + 650dt, kwInvoiceDate.Rect.Bottom + 50dt);
    Let r2 = Rect (kwInvoiceDate.Rect.Left - 150dt, kwInvoiceDate.Rect.Bottom, kwInvoiceDate.Rect.Right + 100dt, kwInvoiceDate.Rect.Bottom + 100dt);

    DataRegion = RectArray (r1);
    DataRegion.Add (r2);
    RestrictSearchArea (DataRegion);
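
The geometry of the two rectangles may be easier to follow with concrete numbers. The Python sketch below is only an illustration, not FlexiLayout code: the Rect class, the date_search_rects function, and the sample keyword coordinates are hypothetical, and the argument order (left, top, right, bottom) is assumed from the snippet above.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Axis-aligned rectangle in dots; y grows downward, as on a page image."""
    left: int
    top: int
    right: int
    bottom: int


def date_search_rects(kw: Rect) -> list[Rect]:
    """Return the two search rectangles built around the field-name rectangle."""
    # r1: to the right of the field name, slightly taller than the name itself.
    r1 = Rect(kw.right, kw.top - 20, kw.right + 650, kw.bottom + 50)
    # r2: directly beneath the field name, widened on both sides.
    r2 = Rect(kw.left - 150, kw.bottom, kw.right + 100, kw.bottom + 100)
    return [r1, r2]


# Hypothetical keyword position, chosen only for illustration.
kw_invoice_date = Rect(left=1000, top=400, right=1200, bottom=440)
for rect in date_search_rects(kw_invoice_date):
    print(rect)
```

With this sample keyword, the first rectangle extends 650 dots to the right of the name and the second covers a 100-dot strip beneath it, which matches the two layouts observed on the images.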

To create an InvoiceDate element:

  1. In the InvoiceHeader.grDate element, create an element of type Date and name it InvoiceDate.
  2. Click the Date tab.
  3. Specify all the possible date formats for the InvoiceDate element.

  4. On some of the images, the search area of the Invoice Date field will include the already detected kwInvoiceNumber and InvoiceNumber elements. To prevent the program from considering the values of these elements as hypotheses for the Invoice Date field, exclude these elements from the search area:
    • Click the Add... button next to the Exclude regions of elements field.
    • Select kwInvoiceNumber from the list of elements.
    • Click OK. The string SearchElements.InvoiceHeader.kwInvoiceNumber will appear in the Exclude regions of elements field.

     Repeat the above actions for the element SearchElements.InvoiceHeader.InvoiceNumber.

  5. Click the Advanced tab.
  6. The Invoice Date field is not a required element. However, if a document contains a date in the Invoice Date field, the corresponding field name (described earlier by the kwInvoiceDate element) is always present as well. Therefore, you can specify an additional search condition in Advanced pre-search relations: search for the image object only if the kwInvoiceDate element has been detected. In the FlexiLayout language, this condition can be written as follows:
    If InvoiceHeader.kwInvoiceDate.IsNull
    Then DontFind();
  7. Temporarily exclude the InvoiceFooter element and match the FlexiLayout.
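
The combined effect of the excluded regions and the pre-search condition can be pictured with a small model. The Python sketch below is a simplified illustration and does not reproduce the program's internal matching: the Rect class and the intersects and date_candidates functions are hypothetical, and it assumes that a candidate is rejected when it overlaps an excluded element.

```python
from typing import NamedTuple, Optional


class Rect(NamedTuple):
    left: int
    top: int
    right: int
    bottom: int


def intersects(a: Rect, b: Rect) -> bool:
    """True when the two rectangles share any area."""
    return (a.left < b.right and b.left < a.right
            and a.top < b.bottom and b.top < a.bottom)


def date_candidates(keyword: Optional[Rect],
                    candidates: list[Rect],
                    excluded: list[Rect]) -> list[Rect]:
    """Filter date hypotheses the way the two settings are described above."""
    # Mirrors "If kwInvoiceDate.IsNull Then DontFind": no field name, no search.
    if keyword is None:
        return []
    # Assumption: a candidate overlapping an excluded element is dropped.
    return [c for c in candidates
            if not any(intersects(c, e) for e in excluded)]


# Hypothetical coordinates, for illustration only.
kw = Rect(0, 0, 10, 10)
invoice_number = Rect(100, 0, 200, 20)
inside_number = Rect(150, 5, 190, 15)
elsewhere = Rect(300, 0, 400, 20)
print(date_candidates(kw, [inside_number, elsewhere], [invoice_number]))
```

In this model, the candidate lying inside the invoice number region is discarded, and nothing is returned at all when the field name is missing.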

For poor-quality images, where the recognition results do not fit any of the standard parameters of the Date element, we will add an alternative element that uses looser conditions to search for the Invoice Date field.

To create an InvoiceDateAsString element:

  1. In the InvoiceHeader.grDate element, create an element of type Character String and name it InvoiceDateAsString.
  2. Click the Character String tab.
  3. Specify the alphabet:
    ,-./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz
  4. Set the percentage of non-alphabet characters to 30%.
  5. In the Character count field, specify the fuzzy interval {-1, 8, 14, INF} for the length of the character string, assuming that possible values are 8 to 14 characters long. Any hypothesis outside this range will be penalized.
  6. Set Max space length to 20. This limits the maximum length of a space in the text string to 20 dots.
  7. For the other element properties, keep the default settings.
  8. Click the Advanced tab.
  9. Since we are going to search for the InvoiceDateAsString element only if the InvoiceDate element is not detected on the image, specify the following condition in the Advanced pre-search relations field: search for the image object only if the InvoiceDate element is not detected. In the FlexiLayout language, this condition can be written as follows:
    If Not InvoiceDate.IsNull Then DontFind;
  10. Specify an additional condition similar to the one set for the InvoiceDate element: search for the image object only if the kwInvoiceDate element has been detected, and search for the image object closest to the kwInvoiceDate element. In the FlexiLayout language, this condition can be written as follows:
    If InvoiceHeader.kwInvoiceDate.IsNull Then DontFind;
    Nearest: InvoiceHeader.kwInvoiceDate;
  11. Specify the location of the InvoiceDate block as the rectangular region of the detected InvoiceDate or InvoiceDateAsString element increased by 5 dots vertically and horizontally. To do this, select the Expression option and type the following expression:
    Rect outputRect;
    if not InvoiceHeader.grDate.InvoiceDate.IsNull then
      outputRect = InvoiceHeader.grDate.InvoiceDate.Rect;
    else
    {
      outputRect = InvoiceHeader.grDate.InvoiceDateAsString.Rect;
      IsNull = InvoiceHeader.grDate.InvoiceDateAsString.IsNull;
    }
    OutputRegion = outputRect;
    OutputRegion.Inflate (5dt, 5dt);
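
The fallback logic of the block expression can be restated outside the FlexiLayout language. The Python sketch below is an illustration under stated assumptions: the Rect class and the inflate and block_region functions are hypothetical, a missing element is modeled as None, and inflating by 5 dots is assumed to move each side of the rectangle outward by 5 dots.

```python
from typing import NamedTuple, Optional


class Rect(NamedTuple):
    left: int
    top: int
    right: int
    bottom: int


def inflate(rect: Rect, dx: int, dy: int) -> Rect:
    """Assumption: each side moves outward by dx (horizontal) or dy (vertical)."""
    return Rect(rect.left - dx, rect.top - dy, rect.right + dx, rect.bottom + dy)


def block_region(date: Optional[Rect],
                 date_as_string: Optional[Rect]) -> Optional[Rect]:
    """Pick the region of the InvoiceDate block from the two search elements."""
    # Prefer the Date element; fall back to the Character String element.
    chosen = date if date is not None else date_as_string
    # Neither element was found: the block stays empty (the IsNull case).
    if chosen is None:
        return None
    return inflate(chosen, 5, 5)


# Hypothetical rectangles, for illustration only.
print(block_region(Rect(10, 10, 50, 20), None))
print(block_region(None, Rect(100, 100, 180, 120)))
print(block_region(None, None))
```

The Date element takes priority whenever it is found, so the Character String result is used only on images where date recognition failed.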

12.04.2024 18:16:02
