- Introducing ABBYY FlexiLayout Studio
- What's New In ABBYY FlexiLayout Studio 12
- Installing, running, and removing ABBYY FlexiLayout" Studio
- Program interface
- Projects
- Batches
- FlexiLayouts
- Multi-page FlexiLayout
- Pre-recognition
-
Elements
- Creating, copying, and deleting elements
- An overview of element properties
- Required and optional elements
- Element properties
- Search area
- Additional search constraints
- Units of measurement
- Fuzzy interval
- Specifying databases and text files in the FlexiLayout language
- Training elements
- Dependency browser
- Blocks
- Working with tables
-
Hypotheses and trees of hypotheses
- Hypotheses and optimizing the search
-
How hypotheses are generated and assessed
- Hypotheses for Static Text elements
- Hypotheses for Separator elements
- Hypotheses for White Gap elements
- Hypotheses for Barcode elements
- Hypotheses for Character String elements
- Hypotheses for Text Fragment elements
- Hypotheses for Date elements
- Hypotheses for Object Collection elements
- Hypotheses for Group elements
- Hypotheses for Phone elements
- Hypotheses for Currency elements
- Hypotheses for Table elements
- Hypotheses for Repeating Group elements
- Hypotheses for Labeled Field elements
- Hypotheses for Region elements
- Tree of hypotheses
- Debugging the FlexiLayout
- Classification
- Export
-
FlexiLayout language
- Basic concepts
-
Predefined types
- Void
- Logic
- String
- Int
- Distance
- XCoordinate
- YCoordinate
- Real
- Quality
- Area
- ImageObjectType
- DateFormats
- DayFormatVariants
- MonthFormatVariants
- YearFormatVariants
- XInterval
- YInterval
- DistInterval
- Rect
- FuzzyRect
- RectArray
- Region
- ImageObjectSet
- TextTypes
- BarcodeTypes
- BarcodeOrientations
- RecognitionMode
- Direction
- HorSearchAreaBound
- VertSearchAreaBound
- Hypothesis
- HypothesisInstances
- TableBlock
- TableBlockColumn
- TableBlockColumnArray
- TableHypothesis
- TableHypColumn
- TableColumnType
- CurrencyPositionTypes
- PageInterval
- PageArea
- PageEdge
- Page
- SearchAreaPageSetType
- IntArray
- StringArray
- IntFuzzyInterval
- DistFuzzyInterval
- AreaFuzzyInterval
- TextRotations
- Conversion of types
- Predefined constants
- Predefined variables
- Global functions
- Functions for working with named parameters
- Advanced pre-search functions
- Advanced post-search functions
- Specifying element properties
-
Hypotheses and their properties
- Object Collection hypothesis
- Character String hypothesis
- Static Text hypothesis
- Paragraph hypothesis
- Barcode hypothesis
- Date hypothesis
- Currency hypothesis
- Phone hypothesis
- Table hypothesis
- Table Column hypothesis
- Repeating Group hypothesis
- First Found hypothesis
- Labeled Field hypothesis
- Region hypothesis
- Hypotheses for all types of element
- Printing to debug
-
Tips and tricks
- Detecting dates in the case of low quality pre-recognition
- Setting multiple static text values. Search for static text with similar values
- Using Exclude to exclude elements
- Using Group elements to optimize FlexiLayout structure and search
- Searching for single-line Static Text elements
- Restricting the search area by means of RestrictSearchArea
- Searching for single-line fields of known or unknown format on documents of varying OCR quality
- Using Nearest and FuzzyQuality to search for elements
- Optimizing Group element search
- The "Optional" property of a Group element
- Searching for strings of digits
- Simplifying the FlexiLayout by using an auxiliary element and a null
- Describing text fields containing framed letters
-
Appendix
- Shortcut keys
- Properties of image objects detected during pre-recognition
- Supported input formats
- Barcodes supported in ABBYY FlexiLayout" Studio
- OCR languages supported in ABBYY FlexiLayout" Studio
- User dictionaries
- Alphabet used in regular expressions
- Extended regular expressions
-
Dialog boxes
- Options dialog box
- Fuzzy interval visual editor
-
Element Properties dialog box
- General tab
- Static Text tab
- Separator tab
- White Gap tab
- Barcode tab
- Character String tab
- Paragraph tab
- Date tab
- Object Collection tab
- Phone tab
- Currency tab
- Repeating Group tab
- Columns tab
- Order tab
- Header tab
- Footer tab
- Rows tab
- Label tab
- Field Position tab
- Field tab
- Search Constraints tab
- Relations tab
- Advanced tab
- Advanced for All Instances tab
- Errors tab
- End-User License Agreement
- Patents
- How to buy ABBYY FlexiCapture
- Third-party technologies
-
Tutorial
-
Sample 1
- Step 1: Creating a new project
- Step 2: Adding images to the batch
- Step 3: Setting the FlexiLayout properties
- Step 4: Pre-recognition
- Step 5: Viewing images and pre-recognition results
- Step 6: Analyzing pre-recognition results and selecting reference elements
- Step 7: Creating a form identifier
- Step 8: Testing the identifier element
- Step 9: Adjusting the properties of the identifier element
- Step 10: Describing the YOUR PLANET NAME field
- Step 11: Describing the YOUR PLANET NAME field with a PlanetNameHeader element
- Step 12: Describing the YOUR PLANET NAME field with a PlanetName element
- Step 13: Testing the YOUR PLANET NAME field
- Step 14: Describing the YOUR PLANET NAME field with a PlanetName block
- Step 15: Describing the NAME field
- Step 16: Describing the YOUR SPACESHIP NUMBER field
- Step 17: Describing the DATE YOU ARRIVED AT THE EARTH field
- Step 18: Describing the YOUR IDENTITY NUMBER ON THE PARTY field
- Step 19: Describing the ANY TEXT field
- Step 20: Describing the YOUR PHOTO IN FANCY DRESS field
- Step 21: Exporting the FlexiLayout
- Step 22: Opening the FlexiLayout in ABBYY FlexiCapture
-
Sample 2
- Step 1: Creating a new project
- Step 2: Adding images to the batch
- Step 3: Setting the FlexiLayout properties
- Step 4: Pre-recognition
- Step 5: Viewing images and pre-recognition results
- Step 6: Creating a document identifier
- Step 7: Testing the identifier element
- Step 8: Specifying the order in which the field Recipe # and the name for the recipe must be detected
- Step 9: Describing the Recipe # field
- Step 10: Creating a Recipe element
- Step 11: Creating a RecipeNumber element
- Step 12: Creating a RecipeNumber block
- Step 13: Describing the field which contains the name of the recipe
- Step 14: Describing the Ingredients field
- Step 15: Describing the field which contains the cooking instructions and the field which contains the cooking time
- Step 16: Creating a CookingTimeHeader element
- Step 17: Creating a CookingTime element
- Step 18: Creating a CookingTime block
- Step 19: Creating an InvertedHeader element
- Step 20: Describing the Cooking field
- Step 21: Creating a Serves element
- Step 22: Creating a Portions element
- Step 23: Describing the Cooking field
- Step 24: Creating a CookingDescription block
- Step 25: The FlexiLayout is ready
-
Sample 3
- Step 1: Preparatory settings
- Step 2: Visual analysis of the images and pre-recognition results
- Step 3: Blocks
- Step 4: Analyzing the images to decide in which order elements must be detected
- Step 5: Detecting the name of the Delivery Address field with a kwDeliveryAddress element
- Step 6: Detecting the name of the Invoice Number field with a kwInvoiceNumber element
- Step 7: Detecting the name of the Invoice Date field with a kwInvoiceDate element
- Step 8: Describing the Invoice Number field with an InvoiceNumber element
- Step 9: Describing the Invoice Date field: the grDate, InvoiceDate, and InvoiceDateAsString elements
- Step 10: Creating an grAddress element of type Group
- Step 11: Detecting the right boundary of the Delivery Address field with a wgAddressRight element
- Step 12: Describing the Delivery Address field with a DeliveryAddress element
- Step 13: Further analysis of the images
- Step 14: Detecting the auxiliary horizontal separator with an hsTableHeaderTop element
- Step 15: Analyzing the search constraints for column names with a TableHeader element of type Group
- Step 16: Detecting the name of the Quantity column with a kwQuantity element
- Step 17: Detecting the name of the Unit Price column with a kwUnitPrice element
- Step 18: Detecting the name of the Total column with a kwTotal element
- Step 19: Creating a InvertedHeader element
- Step 20: Describing the Footer group with a Footer element of type Group
- Step 21: Describing the footer of the table with a kwFooter element
- Step 22: Describing the name of the Total field with the kwTotal element
- Step 23: Detecting the name of the Country field with a kwOrigin element
- Step 24: Describing the Country field with a Country element
- Step 25: Detecting the TotalQuantity and TotalAmount fields with TotalQuantity and TotalAmount elements
- Step 26: Detecting the Table element with an InvoiceTable element
- Step 27: Exporting the FlexiLayout into ABBYY FlexiCapture
-
Sample 4
- Step 1: Preparatory settings
- Step 2: Viewing the images and pre-recognition results
- Step 3: Blocks
- Step 4: Analyzing the images to determine the order in which the elements should be detected
-
Step 5: Document header and InvoiceHeader group
- Step 5.1: Name of Invoice Number field, kwInvoiceNumber element
- Step 5.2: Name of Delivery Address field, kwDeliveryAddress element
- Step 5.3: Name of Invoice Date field, kwInvoiceDate element
- Step 5.4: Invoice Number field, InvoiceNumber element
- Step 5.5: Invoice Date field, grDate, InvoiceDate, and InvoiceDateAsString elements
- Step 5.6: Delivery Address field, grAddress, wgAddressAbove, and DeliveryAddress elements
- Step 6: Document Footer, InvoiceFooter group
-
Step 7: Table column names, TableHeader group
- Step 7.1: Name of Designation column, kwDesignation element
- Step 7.2: Name of ExtraQuantity column, ExtraQtyTag element
- Step 7.3: Name of Quantity column, kwQuantity element
- Step 7.4: Name of UnitPrice column, kwUnitPrice element
- Step 7.5: Name of Total column, kwTotal element
- Step 7.6: Name of Reference column, kwReference element
- Step 7.7: Name of Sales column, kwSales element
- Step 7.8: Name of Unit column, kwUnit element
- Step 8: Table element, InvoiceTable element
- Step 9: TotalAmount field, SumGroup group element
- Step 10: Company field, CompanyGroup group element, Company element
- Step 11: Exporting the FlexiLayout into FlexiCapture
-
Sample 1
- Technical support
- Glossary
English (English) - Change language
Sample 4. Step 4: Analyzing the images to determine the order in which the elements should be detected
At this stage, you have to establish the following:
- Is there any pattern or method in the arrangement of the fields on the images?
- Which elements can be relied on to look for the data fields?
- In what order should we look for the elements? (This is important, because at each subsequent step we can only rely on the elements of the previous step.)
As we are dealing with a multi-page document, we must first establish which objects can be used to identify the first and last pages of the document. These objects can be described by means of special compound Header and Footer elements.
- The Header element must match only the first page of the document.
If a project also contains documents of other types, this elements will also be used as an identifier (i.e. a unique characteristic identifying this document type).
- The Footer element must match the last page of the document only. We recommend creating required subelements in this group to prevent the Footer element from being matched with any other document page.
Once you have analyzed the images, you will notice that:
- On the first page, there is a group of fields that consists of InvoiceNumber, InvoiceDate, DeliveryAddress. The name of the InvoiceNumber field always goes at the beginning of each document, whereas the InvoiceDate and DeliveryAddress are not always present.
- The InvoiceNumber and InvoiceDate fields can be located either to the right of their corresponding names or below them.
- For DeliveryAddress, we should also look either to the right or below the corresponding name, having limited the search area. Additionally, we will need an element to restrict the search area from below.
- Since on some of the images these fields have no values, you can speed up the matching process by specifying the following condition: do not look for the value of a field if the field's name have not been detected.
- We can use this group of fields as an identifier for our document. We will describe these fields as part of a compound Header element named InvoiceHeader.
- The last page of the document has the words TOTAL AMOUNT MUST, Carried over, Total CHF, TOTAL below the table. However, these words may also occur elsewhere on the document (e.g. in the name or in the body of the table). To find these words, we will need to use additional reference elements (e.g. table column names). These reference elements will help us restrict the search area.
- The elements describing the last page of the document will be part of a compound Footer element named InvoiceFooter.
- For the Footer element to match only the last page of the document, it must contain a required element. As the words identifying the last page (see 3 above) occur on each last page of each document, we will make the element that describes them a required element.
- The table (name it InvoiceTable) starts on the first page and ends on the last one. Additionally, the table is always preceded by the column names on the first page. To identify the end of the table (on the last page), we will use an auxiliary element (e.g. the required element from the InvoiceFooter group).
Note.The multitude of all pages of a document is called a multi-page canvas. A multi-page canvas is formed by joining all the pages of a document, top-down, without any gaps, the left boundary of all the pages lying on the same axis, which goes through the point (0, 0). The order in which the pages are joined together is determined by the order of the pages in the batch, therefore, we can only specify the start of the table (its header on the first page) and end of the table (its footer on the last page). The program will search for the table in the entire document, i.e. on the entire multi-page canvas.
- We will look for the name of the company in the Company field always on the first page and always in the first upper third of the page.
- The name of the Total Amount field is always located on the last page, below the table. The value of the field is located either to the right of the name or below it.
1/14/2021 2:17:19 PM