Training while processing documents
ABBYY FlexiCapture for Invoices lets you improve recognition quality while processing documents. If the program fails to detect the correct location of a field on a document image, an Operator can specify the correct location and the program will use it when recognizing other documents.
Training is only available if ABBYY FlexiCapture can reliably identify the vendor by finding the corresponding record in a vendor database. If you have no vendor databases but still want to use field training, you can accumulate company information by adding records to your data sets while capturing invoices. For more information, see Looking up vendors and business units in the database.
This article explains how to train ABBYY FlexiCapture for Invoices using the locally installed version of the Verification Station, and covers some training-related issues that Operators need to know about.
- Collect a batch of invoices (e.g. the invoices processed within the past month) and start feeding them to the program. See How to capture invoices.
- Once the documents are fed to the program, they are recognized automatically and the data are checked against the validation rules. Automatic recognition takes place only if the Recognize added images automatically option is enabled on the Document Processing tab of the Options dialog (to open this dialog, click Tools → Options...).
- If the status of a recognized invoice is other than Valid or if you have other reasons to believe that the program failed to detect some of the fields, open the document in the document editor.
- Review the document form. The Vendor group of fields must be filled out correctly.
Training is done independently for each Document Variant. Invoices from the same vendor are considered to belong to the same Document Variant. If the vendor is detected incorrectly, the wrong Document Variant will be selected for an invoice during training. If the program fails to detect the vendor, select the right vendor using the Vendor Lookup feature. If you can't find the vendor in the database, type in the name manually as it appears on the image and save it to the database by clicking the Save button.
Depending on your project's settings, you may also have to specify the unique ID of an invoice's vendor in order to enable the program to train on that invoice. To do this, type the unique ID in the VATID field (this field may have a different name in some localizations of projects). The VATID is a unique identification number assigned to companies for value-added taxation purposes.
If documents originating from the same vendor have widely varying layouts, you should use the clustering feature. For details, see Training with clustering.
- Training will only be successful if the regions of all the fields are marked up correctly, so make sure that the regions match the actual locations of their respective fields on the image. See Training line items for more information on how to mark up line items on an invoice.
To achieve this, in the image window of the document editor, adjust the regions or draw regions for those fields which the program failed to detect.
After that, the program will analyze the document. If the markup of the field regions has been modified and training is not prohibited for this vendor, the document will be added to the batch.
How to change the region of a field
- Position the mouse pointer in a desired field on the data form, find the corresponding region on the image (it will be highlighted in blue), and click it (or draw a rectangle with the mouse);
- Position the mouse pointer on a desired region on the image (it will be highlighted in blue), click it (or draw the region with the mouse), and then select the corresponding field from the drop-down list that opens;
- Adjust the position of a region on the image by moving its boundaries with the mouse;
- Delete an incorrectly located region from the image: position the mouse pointer on its rectangle and when a red cross appears in the top right corner, click the red cross. The markup of the region will be deleted. Now create a new region for this field in the right location;
- On the data form, start typing a value into a field. A drop-down list will be displayed listing the words captured from the image that resemble the word that you are typing. Select the right word from the list, and the position of the word on the image will become the region of the field.
- All the fields of the invoice will be used for training purposes, not just those whose markup you have added or modified.
- Repeat steps 4-6 for the next document.
- When the third and each subsequent invoice from the same vendor is added to the batch, the program starts the training process. The program will either train a special FlexiLayout (a FlexiLayout Variant) or suggest that you gather more examples (in this case, move to the next document and go back to step 4).
If the FlexiLayout for the variant has been trained successfully, it will be used for the next invoice from this vendor that is matched to this Document Variant. After recognition, field regions will be placed on the invoice image based on the training results.
If a new image is added to the batch, the program determines the quality of FlexiLayout application for the variant. If the added image deteriorates the quality of the application of field regions, it will not be used. Otherwise, it will be used for testing.
- Add a few more invoices from the vendor whose Document Variant has been trained and recognize them. Then open the newly added invoices one by one in the document editor to check if the regions are marked up correctly. If all the regions are located correctly, no additional training is required.
If you are not satisfied with the results, continue training the program on invoices from the given vendor (repeat steps 4-6). At this stage, the training process is started each time a document is added. If the training is successful, a new FlexiLayout Variant will be created.
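The rules that govern when a document joins the batch and when training starts can be summarized in a small model. The Python sketch below is purely illustrative: `TrainingBatch`, `add_invoice`, and the status strings are hypothetical names invented for this example and are not part of any ABBYY FlexiCapture API. It only restates the behavior described above, assuming that each vendor corresponds to one Document Variant and that training starts with the third invoice.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Per the text above, training starts with the third invoice from a vendor.
MIN_EXAMPLES = 3


@dataclass
class TrainingBatch:
    """Toy model of per-vendor training batches (hypothetical, not an ABBYY API)."""

    # vendor_id -> list of invoice ids collected as training examples
    examples: dict = field(default_factory=lambda: defaultdict(list))

    def add_invoice(self, vendor_id, invoice_id, markup_changed,
                    training_allowed=True):
        """Return a status string describing what would happen to the invoice."""
        if vendor_id is None:
            # Training requires a reliably identified vendor.
            return "vendor not identified: no training"
        if not (markup_changed and training_allowed):
            # Only documents with modified markup are added, and only if
            # training is not prohibited for this vendor.
            return "not added to batch"
        self.examples[vendor_id].append(invoice_id)
        if len(self.examples[vendor_id]) < MIN_EXAMPLES:
            return "added: waiting for more examples"
        return "added: training started"


batch = TrainingBatch()
for invoice_id in ("inv-1", "inv-2", "inv-3"):
    print(invoice_id, "->", batch.add_invoice("vendor-A", invoice_id, True))
```

In this model, each vendor has its own list of examples, which mirrors the statement that training is done independently for each Document Variant. The quality check applied to newly added images is not modeled here.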
02.03.2021 8:10:42