Configuring auto-learning for field extraction
Auto-learning enables the system to learn from the operators' decisions during document processing in order to improve the detection of document fields.
When the system fails to find a field on a document, an operator may intervene and indicate the correct location of the field. Once the recognized and corrected documents are successfully exported, the system uses the corrections made by the operator as learning input.
Configuring auto-learning
To configure auto-learning, complete the following steps:
- Create a Document Definition.
- In the section properties of the Document Definition, select Allow field location training.
- Create the necessary fields in the section. Select Can have region in the properties of each field.
- Save and publish the Document Definition.
- In the batch type properties dialog box, click the Workflow tab and enable the Training stage.
To configure auto-learning for documents of the same type whose appearance varies greatly from one document to another, create a variant for each particular field layout and then train a classifier to distinguish the variants. For more information about variants, see Variable field locations on documents that belong to the same type.
Additional steps required to configure variants
To enable the system to use variants in auto-learning, complete the following steps:
- Add section variants using one of the following three methods:
  - Create variants manually. To do this, click the Data Sets tab in the section properties and then click the View... button. Then click the Add... button to add variants.
  - Load variants from a database. To do this, click the Data Sets tab in the section properties and then click the Set Up... button. From the drop-down list, select Database as the data source.
  - Create variants using a script. To do this, click the Data Sets tab in the section properties and then click the Set Up... button. From the drop-down list, select Script as the data source.
- Save and publish the Document Definition.
- Train a classifier on the newly created variants:
  - Switch to Open Classifier Training Batches mode and load document images into a new batch.
  - Assign a reference class to each document, using variants as separate classes:
    - Click Set Class... → Add... → Add...
    - Select Specify variant.
    - Select a variant from the list.
  - Train the classifier by clicking Project → Classification Training → Train.
When working with the training results, you may need to check which variant was assigned to a document and edit it if necessary. To display the IDs of the variants on the form, create a service field. For details, see Enabling operators to change variants.
Note: Field extraction training can also be done by the administrator if a project has to be trained before the operators start working on it.
Once the Document Definition is set up by the administrator, the system will automatically learn from the operators' corrections made on the Verification Stations.
The auto-learning procedure
Documents whose field locations have been verified and corrected by the operators are placed into a training batch.
The documents are matched against the current version of the trained FlexiLayout. If all the fields are found correctly, there is no need to retrain the FlexiLayout.
Note: The FlexiLayout may find the fields correctly even though the operator had to change them. This happens when the documents were processed using an old or untrained version of the FlexiLayout and, while they were awaiting verification, the system trained the FlexiLayout on other documents. As a result, the current version now processes these documents correctly.
In this case, the documents are kept in the training batch with For testing status. They will be used for regression tests to prevent future versions of the FlexiLayout from degrading.
If a trained FlexiLayout is applied and some of the field regions do not match, the documents will be used in training a new version of the FlexiLayout. They will be assigned For training status.
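The status assignment described above can be pictured with a short Python sketch. It is only an illustration of the decision logic, not ABBYY FlexiCapture code: the rectangle representation, the `regions_match` and `triage` helpers, and the pixel tolerance are all assumptions introduced here, since the documentation does not say how closely regions must agree to count as matching.

```python
from typing import Dict, Tuple

# Hypothetical region representation: (left, top, right, bottom) in pixels.
Rect = Tuple[int, int, int, int]


def regions_match(found: Rect, confirmed: Rect, tolerance: int = 5) -> bool:
    """Assumed matching rule: every edge lies within `tolerance` pixels."""
    return all(abs(f - c) <= tolerance for f, c in zip(found, confirmed))


def triage(detected: Dict[str, Rect], confirmed: Dict[str, Rect],
           tolerance: int = 5) -> str:
    """Return the training-batch status for one verified document.

    `detected` holds the regions found by the current trained layout,
    `confirmed` holds the regions verified by the operator.
    """
    for name, region in confirmed.items():
        found = detected.get(name)
        if found is None or not regions_match(found, region, tolerance):
            # At least one field region does not match: use for retraining.
            return "For training"
    # All fields were found correctly: keep for regression tests.
    return "For testing"
```

For example, a document whose only confirmed field, `Total`, is detected within a pixel or two of the operator's region would receive For testing status, while a document where that field is missing or far off would receive For training status.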
The result of training is a new version of the FlexiLayout. To compare the new version with the previous version, both are applied to the documents in the training batch that have For training or For testing status. The system checks how well the detected field regions match the layout that has been confirmed by the user. The FlexiLayout that yields the better match is used in further document processing, and the inferior version is deleted.
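The version comparison can be sketched in the same spirit. The following Python example is a conceptual model under stated assumptions, not the product's implementation: a layout is modeled as a hypothetical callable that maps a document ID to detected regions, the score is assumed to be the fraction of confirmed field regions matched within a pixel tolerance, and ties are assumed to favor the existing version. None of these details are specified in the documentation.

```python
from typing import Callable, Dict, Tuple

Rect = Tuple[int, int, int, int]   # hypothetical (left, top, right, bottom)
Regions = Dict[str, Rect]          # field name -> region
Layout = Callable[[str], Regions]  # hypothetical: document ID -> detected regions


def match_score(layout: Layout, confirmed_by_doc: Dict[str, Regions],
                tolerance: int = 5) -> float:
    """Assumed metric: share of confirmed field regions the layout reproduces."""
    matched = total = 0
    for doc_id, confirmed in confirmed_by_doc.items():
        detected = layout(doc_id)
        for name, region in confirmed.items():
            total += 1
            found = detected.get(name)
            if found is not None and all(
                abs(f - c) <= tolerance for f, c in zip(found, region)
            ):
                matched += 1
    return matched / total if total else 0.0


def pick_layout(old: Layout, new: Layout,
                confirmed_by_doc: Dict[str, Regions]) -> Layout:
    """Keep whichever version matches the confirmed regions better.

    Assumption: on a tie the existing version is kept.
    """
    if match_score(new, confirmed_by_doc) > match_score(old, confirmed_by_doc):
        return new
    return old
```

Scoring both versions on the full set of For training and For testing documents is what lets the regression documents protect against a new version that improves some layouts while degrading others.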
3/2/2021 8:10:42 AM