Sample 3. Step 4: Analyzing the images to decide in which order elements must be detected

At this step, we need to decide:

  • Is there any pattern in the arrangement of the fields on the images?
  • Which elements can be used as reference elements for fields’ detection?
  • In what sequence is it best to search for the elements? (At each step, an element can refer only to the elements described above it.)
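
The ordering constraint in the last question can be illustrated with a small sketch. It is not FlexiLayout code: the function `find_forward_references` and the list-of-pairs model are hypothetical, and the element names are borrowed from the elements introduced later in this step. Each entry gives an element and the elements its search constraints refer to; the check reports any reference to an element that has not been described yet.

```python
from typing import List, Tuple

# Hypothetical model: (element name, names of the elements it refers to).
# The position in the list is the order in which the elements are described.
SearchOrder = List[Tuple[str, List[str]]]


def find_forward_references(order: SearchOrder) -> List[Tuple[str, str]]:
    """Return (element, reference) pairs where the reference is not described earlier."""
    seen = set()
    problems = []
    for name, refs in order:
        for ref in refs:
            if ref not in seen:
                problems.append((name, ref))
        seen.add(name)
    return problems


# The keyword is described first, so the field can be searched relative to it.
good = [
    ("kwInvoiceNumber", []),
    ("InvoiceNumber", ["kwInvoiceNumber"]),
]

# The field is described before the keyword it depends on.
bad = [
    ("InvoiceNumber", ["kwInvoiceNumber"]),
    ("kwInvoiceNumber", []),
]

print(find_forward_references(good))  # []
print(find_forward_references(bad))   # [('InvoiceNumber', 'kwInvoiceNumber')]
```

This is why keywords and other reference elements are normally described before the fields that are located relative to them.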

Let us analyze the available images.

  1. You will have noticed that the central part of the document contains a table which occurs on all the images.
    Note. In FlexiLayout Studio, a table is an image object made up of fragments that consist of rows and columns visually separated by separator lines or by white gaps. (See Table for more details.)
  2. Above the table there is a group of fields (which may be named the InvoiceHeader): Invoice Number, Invoice Date, and Delivery Address. Note that the field Invoice Number occurs on all the images and can be used as an identifier field, while the fields Invoice Date and Delivery Address are optional.
  3. Below the table there is a group of fields (which may be named the Footer): TotalQuantity, TotalAmount, and Country. These fields do not occur on all the images either.

We will start by describing the search for elements in the upper part of the document. To detect the upper fields, we will create a logical group that unites all the elements used to look for the fields Invoice Number, Invoice Date, and Delivery Address.

  1. Create an element of type Group and name it InvoiceHeader.
    The fields Invoice Number, Invoice Date, and Delivery Address are always located in the upper left corner. Moreover, their order is always the same: Delivery Address followed by Invoice Number followed by Invoice Date (provided they occur on the image). We will look for them in the same order.
  2. To search for the keywords that name the fields Invoice Number, Invoice Date, and Delivery Address, we will use elements of type Static Text. The InvoiceHeader element must contain the following elements:
    • kwDeliveryAddress element of type Static Text, which will correspond to the name of the field Delivery Address (for detailed instructions, see Step 5);
    • kwInvoiceNumber element of type Static Text, which will correspond to the name of the field Invoice Number (for detailed instructions, see Step 6);
    • kwDate element of type Static Text, which will correspond to the name of the field Invoice Date (for detailed instructions, see Step 7).
  3. As for the fields Invoice Number and Invoice Date, we are going to search for them in the same row as their corresponding names, to the right of the names.
    In the InvoiceHeader element, create the following elements:
    • InvoiceNumber element of type Character String, which will correspond to the field Invoice Number (for detailed instructions, see Step 8);
    • a Group element grDate to look for the field Invoice Date (for detailed instructions, see Step 9).

In the InvoiceHeader.grDate element, create:

  • InvoiceDate element of the Date type, which will correspond to the Invoice Date field in the case of good quality images (for detailed instructions, see Step 9);
  • InvoiceDateAsString element of type Character String, which will correspond to the Invoice Date field if the program fails to find the InvoiceDate element (for detailed instructions, see Step 9).
    Note. For more information on finding dates on poor quality images, see the Tips and Tricks section.
  4. The field Delivery Address has multiple lines, so we need an element of the Paragraph type to detect it. However, before creating this element, we should limit the search area as much as possible. For the right boundary of the search area, we will use an element of the White Gap type. We will then group all the elements which describe the location of the field Delivery Address into a Group element:
    • In the element InvoiceHeader, create a Group element and name it grAddress (for detailed instructions, see Step 10).

In the grAddress element, create the following elements:

  • an auxiliary element wgAddressRight of the White Gap type, which borders the field Delivery Address on the right (for detailed instructions, see Step 11);
  • an element DeliveryAddress of the Paragraph type, which will correspond to the field Delivery Address (for detailed instructions, see Step 12).
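
The element hierarchy planned in this step can be summarized with a short sketch. This is plain Python used as an outline, not FlexiLayout code: the `Element` class, `search_order`, and `read_invoice_date` are hypothetical helpers, the elements are listed in the order in which this step introduces them, and the date format `%d.%m.%Y` is an assumption made only for the example. The second helper mimics the idea behind grDate: try to read the value as a date first and fall back to keeping the raw string.

```python
from dataclasses import dataclass, field
from datetime import date, datetime
from typing import List, Union


@dataclass
class Element:
    """Hypothetical stand-in for a FlexiLayout element: a name, a type, and sub-elements."""
    name: str
    kind: str
    children: List["Element"] = field(default_factory=list)


# Elements in the order in which this step introduces them.
invoice_header = Element("InvoiceHeader", "Group", [
    Element("kwDeliveryAddress", "Static Text"),
    Element("kwInvoiceNumber", "Static Text"),
    Element("kwDate", "Static Text"),
    Element("InvoiceNumber", "Character String"),
    Element("grDate", "Group", [
        Element("InvoiceDate", "Date"),
        Element("InvoiceDateAsString", "Character String"),
    ]),
    Element("grAddress", "Group", [
        Element("wgAddressRight", "White Gap"),
        Element("DeliveryAddress", "Paragraph"),
    ]),
])


def search_order(element: Element, prefix: str = "") -> List[str]:
    """List full element names depth-first, top to bottom."""
    path = prefix + element.name
    names = [path]
    for child in element.children:
        names.extend(search_order(child, path + "."))
    return names


def read_invoice_date(text: str, fmt: str = "%d.%m.%Y") -> Union[date, str]:
    """Return a date if the text parses (assumed format); otherwise keep the raw string."""
    cleaned = text.strip()
    try:
        return datetime.strptime(cleaned, fmt).date()
    except ValueError:
        return cleaned


for full_name in search_order(invoice_header):
    print(full_name)

print(read_invoice_date("12.03.2020"))  # parsed as a date
print(read_invoice_date("l2.O3.2020"))  # misrecognized characters: kept as a string
```

The outline makes it easy to check that every element which serves as a reference, such as a keyword or the white gap, appears above the field that depends on it.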

01.12.2020 7:03:59

