IFieldExtractor
Purpose
Identifies fields in the text of a document.
Note: Can only be changed in an extraction script.
Methods
Name | Description |
---|---|
ExtractRegularExpression( regularExpression : string, resultCollectionName : string ) | Specifies a regular expression for identifying text spans. |
ExtractNerObjects() | Tells the field identification mechanism to identify NER entities in the text of a document. Once the objects are identified, the following predefined collections become available: NerPerson, NerOrg, NerGeo, NerAddress, NerMoney, and NerDate. Note: The NerMoney and NerDate objects are used only in extraction scripts and are not available in ABBYY FlexiLayout Studio. |
ExtractWordsFromUserDictionary( userDictionaryName : string, languageName : string ) | Tells the field identification mechanism to identify words from a user dictionary in the text of a document. Words may occur in the text in any inflected form. A user dictionary can be selected on the Properties tab of the script rule. The dictionary will be accessed by its name. |
ParseAddress() | Parses the text in a field or section into address components. |
ParseAddressInPosition( resultCollectionNamePrefix : string, startPos : int, endPos : int ) | Parses the text fragment between specified start and end positions in a field or section into address components. |
ParseAddressInSpan( resultCollectionNamePrefix : string, span : IInterval ) | Parses the text fragment within a specified interval in a field or section into address components. |
PutSpanToField( startPos : int, endPos : int, fieldName : IField ) | Saves the text fragment corresponding to the span specified for a text substring to a document field. |
PutTextToField( startPos : int, endPos : int, fieldName : IField ) | Saves the text fragment corresponding to the start and end positions specified for a text substring to a document field. |
RunQuery( xmlQuery : string, queryName : string ) : IExtractedObjects | Runs an XML query on the text of a document and identified text spans. Returns a collection of results as an array of text spans containing the identified resulting strings. The queryName parameter specifies a name for the query, which can then be used to get the resulting collection from the field identification mechanism. |
RunQueryAndSaveToField( xmlQuery : string, queryName : string, fieldName : string ) | Runs an XML query on the text of a document and identified text spans and saves the results to a document field. |
SaveSpanToField( span : IInterval, fieldName : string ) | Saves the text fragment corresponding to the span specified for a text substring to a document field. Important! This method is obsolete in FlexiCapture Release 3 Update 4 and later. If this method is used for new projects in FlexiCapture Release 3 Update 4 and later, such projects will be sent to Exceptions. |
SaveTextToField( startPos : int, endPos : int, fieldName : string ) | Saves the text fragment corresponding to the start and end positions specified for a text substring to a document field. Important! This method is obsolete in FlexiCapture Release 3 Update 4 and later. If this method is used for new projects in FlexiCapture Release 3 Update 4 and later, such projects will be sent to Exceptions. |
ExtractedObjects( collectionName : string, [optional] objectTypeName : VARIANT ) : IExtractedObjects | Allows accessing a collection of identified objects by the name of the collection. For collections of NER objects identified as address components, do one of the following: |
QueryResults( queryName : string ) : IExtractedObjects | Allows accessing the result of an XML query by the name of the query. |
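
The methods above suggest a common calling pattern: identify spans under a collection name, retrieve that collection by name, then save a span's text to a field. The Python sketch below imitates that pattern outside FlexiCapture. It is not the ABBYY API: the `MockFieldExtractor` class, the `Span` type, the snake_case method names, and the `fields` dictionary are all hypothetical stand-ins, and the end position is assumed to be exclusive, which the real interface may not match.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Span:
    """Hypothetical stand-in for IInterval: a half-open [start, end) range."""
    start: int
    end: int


@dataclass
class MockFieldExtractor:
    """Toy model of the calling pattern only; not ABBYY's implementation."""
    source_text: str
    collections: dict = field(default_factory=dict)
    fields: dict = field(default_factory=dict)

    def extract_regular_expression(self, pattern: str, collection_name: str) -> None:
        # Imitates ExtractRegularExpression: store every match span under a name.
        spans = [Span(m.start(), m.end()) for m in re.finditer(pattern, self.source_text)]
        self.collections[collection_name] = spans

    def extracted_objects(self, collection_name: str) -> list:
        # Imitates ExtractedObjects: look a collection up by its name.
        return self.collections.get(collection_name, [])

    def put_text_to_field(self, start: int, end: int, field_name: str) -> None:
        # Imitates PutTextToField: copy the substring into a named field.
        # Assumption: end is exclusive; the real API may count differently.
        self.fields[field_name] = self.source_text[start:end]


extractor = MockFieldExtractor("Invoice INV-2041 dated 2024-04-12")
extractor.extract_regular_expression(r"INV-\d+", "InvoiceNumbers")
first = extractor.extracted_objects("InvoiceNumbers")[0]
extractor.put_text_to_field(first.start, first.end, "InvoiceNumber")
print(extractor.fields["InvoiceNumber"])
```

Keeping the collection name as the only link between the extraction step and the retrieval step reflects how the table describes ExtractedObjects and QueryResults: results are fetched by the name given when they were produced.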
Properties
Name | Type | Permissions | Value |
---|---|---|---|
SourceText() | string | Read | The text of the document or field to which the field identification mechanism is applied. |
SourceNode() | IField | Read | The field or the section to which the field identification mechanism is applied. |
SourceDocument() | IDocument | Read | The document that contains SourceNode. |
4/12/2024 6:16:06 PM