Document Definition properties
The basic Document Definition properties, such as its name, language, and writing style, are configured in the Document Definition Wizard. All other properties are initially set to their default values.
You can view and change the properties of a Document Definition in the properties dialog box that opens when you select the menu item Document Definition → Document Definition Properties... in the Document Definition editor window.
The dialog box has the following tabs:
- The General tab
On this tab you can rename the Document Definition and enter or edit its description. The Enabled option includes/excludes the Document Definition from document processing.
- The Recognition tab
The program uses a fast recognition pass, called full-text recognition, for classification, FlexiLayout matching, and highlighting of text on images.
This tab is used to specify settings of full-text recognition. Please note that field recognition settings are specified in field properties.
- Prefer settings from batch type. Select this option if you want to synchronize the full-text recognition settings with those of the batch type.
Note: Disabling the synchronization may lead to slower Document Definition matching.
- Languages. Set the correct language so that recognition proceeds without errors. This property defines both the language itself and related settings such as date format and currency.
- Select a Recognition mode from the list. In the Fast mode, color and halftone images are binarized (converted to black and white) prior to recognition. Fast recognition takes less time and provides mostly satisfactory results. In the Balanced mode, the program also considers image colors; recognition is slower but of better quality. Use the Thorough mode only when prerecognition produces multiple errors. By default, the Balanced mode is used.
- Advanced prerecognition settings…
- In the Correct page orientation, if page group, you may select one or several options (rotated 180°, 90° clockwise, or 90° counter-clockwise) so that a page is rotated accordingly when its orientation is detected automatically.
- If necessary, in the Text type section, specify how the blank form is printed (Typographic, Matrix printer, Typewriter) and add a sample pattern.
- Barcodes. Parameters of barcode processing:
- Disable barcode extraction. Select this option if barcodes should not be found on images. This will speed up document recognition considerably.
- Extract 2D barcodes: Data Matrix, Aztec, QR Code. Select this option if your images contain barcodes of specified types. If the option is not selected, Data Matrix, Aztec, and QR Code barcodes will not be found on images.
- Extract post barcodes. Select this option if your images contain postal barcodes. If this option is not selected, postal barcodes will not be found on images.
Important! Extracting barcodes slows down recognition.
- CJK pre-recognition
- Separated furigana mode. Select this option to improve recognition of phonetic tips (furigana) in the Japanese language.
- Named entity recognition: Extract named entities – extraction of information using NLP methods.
Note: Requires an NLP module and a specific license type.
- Vertical text extraction – Vertical text extraction parameters:
- Extract for all languages – Detects vertically-oriented text written in any of the supported languages.
- Do not extract – Prevents the detection of vertically-oriented text.
- Extract for CJK languages – Detects vertical text written in Chinese, Japanese or Korean.
- Click the Advanced... button to configure correction of linear and nonlinear image distortions, specify the feed direction of the scanner's automatic feeder, etc.
Note: Select the Correct linear distortion option to specify the parameters for stretching or compressing images by height and width. Images are scaled using existing anchors (black square, cross, or corner) as well as horizontal and vertical separators.
- Amount of Money – A combination of a numerical amount and a currency code or symbol. In order to avoid any recognition errors for visually similar characters like 1, I, and i, or s and $, a regular expression is used which allows letters only in certain combinations that represent currency codes, either preceding or immediately following the numerical amount. The major currency codes are listed in Currencies.
You can modify the list of possible currency codes and symbols if required. For example, if you know what currency codes and symbols may occur in your documents, removing any redundant currencies from the list will improve the quality of recognition. You can also add custom currency codes and symbols to the list. To modify the list, click the […] button on the right. In the Currency Symbols dialog box, you can add or remove currency codes or symbols. Alternatively, open the field properties dialog box, click the Data tab, and make the necessary changes. For more information, see Data types of the text entry field.
Note: A Document Definition can only have one list of possible currency codes and symbols. This list is applied to all Amount of Money fields.
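The idea behind the Amount of Money regular expression can be illustrated with a minimal Python sketch. This is not the expression ABBYY FlexiCapture uses internally; the currency list, the number format, and the function names build_amount_pattern and parse_amount are all assumptions made for illustration. It shows only the principle: letters and symbols are accepted solely when they form a listed currency code or symbol placed before or after the number.

```python
import re

# Hypothetical list of allowed currency codes and symbols. In FlexiCapture the
# real list is edited in the Currency Symbols dialog box.
CURRENCIES = ["USD", "EUR", "GBP", "$", "€", "£"]


def build_amount_pattern(currencies):
    """Build a regex that accepts a number with a listed currency before or after it."""
    # Longer codes first so that alternation prefers the most specific match.
    cur = "|".join(re.escape(c) for c in sorted(currencies, key=len, reverse=True))
    # Assumed number format: optional thousands separators and two decimals.
    number = r"\d{1,3}(?:,\d{3})*(?:\.\d{2})?|\d+(?:\.\d{2})?"
    return re.compile(
        rf"^(?:(?P<pre>{cur})\s?(?P<num1>{number})|(?P<num2>{number})\s?(?P<post>{cur}))$"
    )


def parse_amount(text, pattern):
    """Return (number, currency) or None if the text is not a valid amount."""
    m = pattern.match(text.strip())
    if m is None:
        return None
    currency = m.group("pre") or m.group("post")
    number = m.group("num1") or m.group("num2")
    return number.replace(",", ""), currency


pattern = build_amount_pattern(CURRENCIES)
print(parse_amount("USD 1,250.00", pattern))
print(parse_amount("I250 USD", pattern))  # a misread "1" is rejected, not accepted as text
```

Shortening CURRENCIES narrows what the pattern accepts, which mirrors the advice above: removing currencies that never occur in your documents leaves fewer ways for a misrecognized character to pass as valid.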
- The Assembly tab
This tab is intended for configuring assembly rules for multipage documents.
In the simplest scenario, the Document Definition comprises a single section that occurs once. If a Document Definition consists of several sections, this tab will show the list of their names. You can specify the number of occurrences of each section by modifying the numbers in the Min number and Max number columns.
- Use key fields equality assembling rule - enable this option if you want to check document assembly based on key fields. Then select a key field for each section in the Key Field column. When you input documents, only documents whose key field values match in every section will be considered correctly assembled. If the values do not match, an assembly error message will be displayed.
- Use standard assembly rules - enable this option if you want to perform a check of document assembly using the following standard rules:
- Disable sections order check - enable this option if you want to disable the checks for the order of sections in the document (e.g. if the order of sections does not affect document assembly). The program will still check that all the sections are present in the document, but their order will be ignored.
- Enable annex pages - enable this option if you want to process documents with annexes. If processing of documents with annexes is enabled, you can also select the option Detect annexes using preset document structure, without analyzing (fast) to enable faster detection of annexes on the basis of the preset document structure.
Note: The Detect annexes using preset document structure, without analyzing (fast) option is effective only for documents created by means of separation during the import stage or by applying a special flag in API. Such documents are excluded from the assembly.
- Use custom assembly rules - enable this option if you want to perform a check of document assembly using a document assembly script. A custom assembly script can be executed both separately and together with the standard assembling rules. To start editing the script, click the Edit Assembly Script... button. The Script Editor window will open.
For details see Creating Document Definitions for multipage documents, Assembling pages into documents and Creating Document Definitions for documents with annexes.
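The assembly checks described on this tab can be summarized in a short Python sketch. It only models the logic (section occurrence limits, optional section order check, optional key field equality). It does not use the FlexiCapture scripting object model, and SectionRule, check_assembly, and the page representation are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class SectionRule:
    name: str
    min_count: int  # corresponds to the Min number column
    max_count: int  # corresponds to the Max number column


def check_assembly(pages, rules, use_key_fields=False, check_order=True):
    """Check one assembled document.

    pages: list of (section_name, key_field_value) tuples in document order.
    Returns a list of error messages; an empty list means no assembly errors.
    """
    errors = []
    names = [r.name for r in rules]
    counts = {n: 0 for n in names}
    for section, _ in pages:
        if section in counts:
            counts[section] += 1
        else:
            errors.append(f"unknown section: {section}")
    # Every section must occur between its minimum and maximum number of times.
    for r in rules:
        if not r.min_count <= counts[r.name] <= r.max_count:
            errors.append(
                f"section '{r.name}' occurs {counts[r.name]} time(s), "
                f"expected {r.min_count} to {r.max_count}"
            )
    # Skipped when the order check is disabled; presence is still checked above.
    if check_order:
        positions = [names.index(s) for s, _ in pages if s in counts]
        if positions != sorted(positions):
            errors.append("sections are out of order")
    # Key fields equality: all sections must carry the same key value.
    if use_key_fields:
        keys = {k for s, k in pages if s in counts}
        if len(keys) > 1:
            errors.append("key field values do not match")
    return errors


rules = [SectionRule("Header", 1, 1), SectionRule("Items", 1, 3)]
print(check_assembly([("Header", "A1"), ("Items", "B2")], rules, use_key_fields=True))
```

In this model, passing check_order=False plays the role of the Disable sections order check option: a document with the Items section before the Header section passes, but a document that lacks a required section still fails.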
- The Rules tab
This tab is used to manage Document Definition rules. You can create, edit, or delete rules.
For details see Rule validation.
- The Export Destinations tab
This tab shows the current export settings of the given Document Definition. To change the export settings, click the Edit... button.
- The Data Form tab
On this tab you can modify the font outline and size for displaying recognized data.
- The Data Text Settings group contains font settings for displaying recognized values.
- The Label Text Settings group contains settings for displaying the explanatory text (field names).
For details see Configuring data presentation in the document window.
- The Data Sets tab
On this tab you can create and edit custom data sets.
For details, see Using vendor and business unit databases.
- The Event Handlers tab
On this tab you can specify event handlers for documents of the current type.
For details, see Event Handlers.
- The .Net References tab
On this tab you can add external assemblies to be used in scripts and global modules. Both standard assemblies and compiled user assemblies can be used. To add an assembly, click Add.... In the dialog box that opens, select the type: Standard assembly name or Attached file. Depending on the selected type, either enter the standard assembly name or browse to an assembly file.
For details see External assemblies.
02.03.2021 8:10:42