Working with Recognized Data
After recognition, an ABBYY FlexiCapture SDK document (the Document object) contains a set of pages with recognized blocks on them and a set of fields. The blocks contain the raw data produced by recognition, while the fields contain the recognized data after the rules defined in the Document Definition have been applied. For this reason, the content of a block may differ from the value of the corresponding field.
How to find a field
The collection of document sections (IDocument::Sections) usually contains one or several Field objects. However, these are not all of the fields of the document: the fields in this collection are the sections of the document, and each section has child fields. Each child field can be either a group of fields (with its own child fields) or a simple field (without child fields). The document itself can be regarded as the top-level field that contains all the data fields of the document. To access the child fields of a complex field, use the IField::Children property.
A field may have several instances. Field instances are fields that occur several times in the form and describe several variants of the same object, e.g. several telephone numbers or addresses of one person, several table rows of one consignment note, etc. To receive a collection of instance fields, use the IField::Instances property. This collection will contain a set of fields of the same type as the container field.
When working with the recognized data of a document, keep in mind that the document may have a complex tree-like structure: it may consist of sections, the sections may contain fields and groups of fields, and any of these may occur multiple times. Below is a sample of how to go through the entire document tree, visiting every field and every instance of the repeating items. Notice that the collection of instances is checked first to determine whether the item is repeating; after that, the collection of children is checked to determine whether the item is a container of fields:
C# code
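The traversal order described above can be sketched as follows. This is a minimal C++ model rather than SDK code: the Field struct, its instances and children members, collectValues, makeSampleDocument and the sample data are hypothetical stand-ins that only mirror the roles of IField::Instances and IField::Children. With the SDK itself, the same two checks would presumably be made against the collections those properties return.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for a document field. This is NOT the SDK's IField interface;
// the members only mirror the roles of the Instances and Children properties.
struct Field {
    std::string name;
    std::string value;             // recognized text; empty for containers
    std::vector<Field> instances;  // non-empty when the item repeats
    std::vector<Field> children;   // non-empty when the item contains fields
};

// Depth-first walk that appends "name=value" for every simple field.
// Instances are checked first, then children, as described in the text.
void collectValues(const Field& field, std::vector<std::string>& out) {
    if (!field.instances.empty()) {
        for (const Field& instance : field.instances) collectValues(instance, out);
    } else if (!field.children.empty()) {
        for (const Field& child : field.children) collectValues(child, out);
    } else {
        out.push_back(field.name + "=" + field.value);
    }
}

// Hypothetical document: one section holding a simple field and a
// repeating field with two instances.
Field makeSampleDocument() {
    Field phones{"Phone", "", {Field{"Phone", "555-0100", {}, {}},
                               Field{"Phone", "555-0101", {}, {}}}, {}};
    Field section{"Section", "", {}, {Field{"Name", "Ann", {}, {}}, phones}};
    return Field{"Document", "", {}, {section}};
}

// Usage: the walk visits the simple field, then each instance in order.
void runExample() {
    std::vector<std::string> out;
    collectValues(makeSampleDocument(), out);
    assert(out.size() == 3);
    assert(out[0] == "Name=Ann");
    assert(out[2] == "Phone=555-0101");
}
```

Checking instances before children matters because a repeating item is itself a container of its instances. If children were checked first, a repeating group could be treated as a single group and its repetitions would be missed.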
How to retrieve data from a field
Access to the recognized data of a field is provided by the IField::Value property. This property returns a reference to the FieldValue object. For composite fields, the property may return NULL. The FieldValue object contains recognized data in various formats. The formats in which the data can be presented are defined by the type of these data (IFieldValue::Type property). For example, you can receive recognized date and time in DATE format, as a string, or as a Text object.
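The pattern in this paragraph (first check whether a value exists, then choose a representation according to the value type) can be illustrated with a small C++17 model. Everything in it is an assumption made for illustration: the ValueType enum, the FieldValue and Field structs and the describe function are hypothetical and are not the SDK's IFieldValue or IField interfaces. An empty std::optional stands in for the NULL that a composite field may return.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical value kinds, loosely modelled on the FVT_* names in this topic.
enum class ValueType { Text, Number, Boolean };

// Stand-in for a field value. This is NOT the SDK's IFieldValue interface.
struct FieldValue {
    ValueType type;
    std::string text;      // string form of the data
    long long number = 0;  // meaningful only when type == ValueType::Number
    bool flag = false;     // meaningful only when type == ValueType::Boolean
};

// Stand-in for a field; an empty optional models the NULL value that a
// composite field may return.
struct Field {
    std::string name;
    std::optional<FieldValue> value;
};

// Returns a printable description, checking for a missing value first and
// then picking the representation that matches the value type.
std::string describe(const Field& field) {
    if (!field.value) return field.name + ": <no value>";
    const FieldValue& v = *field.value;
    switch (v.type) {
        case ValueType::Number:
            return field.name + ": number " + std::to_string(v.number);
        case ValueType::Boolean:
            return field.name + ": " + (v.flag ? "checked" : "unchecked");
        case ValueType::Text:
            return field.name + ": text \"" + v.text + "\"";
    }
    return field.name + ": <unknown type>";
}

// Usage: a text field, a number field, and a composite field without a value.
void runExample() {
    Field name{"Name", FieldValue{ValueType::Text, "Ann"}};
    Field total{"Total", FieldValue{ValueType::Number, "42", 42}};
    Field group{"Address", std::nullopt};
    assert(describe(name) == "Name: text \"Ann\"");
    assert(describe(total) == "Total: number 42");
    assert(describe(group) == "Address: <no value>");
}
```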
The table below lists the ways of working with the recognized data of a field depending on the field type. It is assumed that the field does not have instance fields (except for the table field).
Field type | Children | Field value | How to work with the field |
FT_TextField | NULL | Field value of type FVT_Text | You can receive the value of the field: C++ (COM) code |
FT_DateTimeField | NULL | Field value of type FVT_DateTime | You can receive the value of the field: C++ (COM) code |
FT_NumberField | NULL | Field value of type FVT_Number | You can receive the value of the field: C++ (COM) code |
FT_PictureField | NULL | Field value of type FVT_Picture | You can receive the value of the field as a FieldPictureValue object using the IFieldValue::AsPicture property. C++ (COM) code |
FT_Checkmark | NULL | Field value of type FVT_Boolean | You can receive the value of the field: C++ (COM) code |
FT_CheckmarkGroup | Fields of the type FT_Checkmark | Field value of type FVT_Choice | You can receive the value of the field: You can also receive the values of checkmark fields in the group via the child objects of the checkmark group field. Use the IField::Children property to receive a collection of child fields of type FT_Checkmark. Work with each field as described above for this type of field. C++ (COM) code |
FT_Group | Fields | NULL | Receive the collection of child fields using the IField::Children property. Work with each child field depending on its type. |
FT_PageGroup | Fields | NULL | Receive the collection of child fields using the IField::Children property. Work with each child field depending on its type. |
FT_Table | NULL or Fields | NULL | A table field in ABBYY FlexiCapture SDK contains a collection of instances which correspond to the table rows. Each row field is of type FT_Table and contains a collection of child fields which correspond to the cells of this row. For each row, the set of field types will be the same. Therefore, you can receive the value of a cell as follows: C++ (COM) code |
FT_Document | Fields | NULL | Receive the collection of child fields using the IField::Children property. Work with each child field depending on its type. |
FT_CurrencyField | NULL | Field value of type FVT_Currency | You can receive the value of the field: C++ (COM) code |
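The FT_Table entry in the table describes rows as instances of the table field and cells as children of each row. A minimal C++17 sketch of that addressing scheme is shown below. It does not use the SDK: the Field struct, cellValue and makeSampleTable are hypothetical names, and the instances and children members only mirror the roles of IField::Instances and IField::Children.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Stand-in for a field. This is NOT the SDK's IField interface.
struct Field {
    std::string value;             // recognized text of a cell
    std::vector<Field> instances;  // for a table field: one entry per row
    std::vector<Field> children;   // for a row: one entry per cell
};

// Returns the text of the cell at (row, column), or nothing when either
// index is out of range.
std::optional<std::string> cellValue(const Field& table,
                                     std::size_t row, std::size_t column) {
    if (row >= table.instances.size()) return std::nullopt;
    const Field& rowField = table.instances[row];
    if (column >= rowField.children.size()) return std::nullopt;
    return rowField.children[column].value;
}

// Hypothetical table with two rows and two columns.
Field makeSampleTable() {
    Field row0{"", {}, {Field{"Pens", {}, {}}, Field{"10", {}, {}}}};
    Field row1{"", {}, {Field{"Ink", {}, {}}, Field{"2", {}, {}}}};
    return Field{"", {row0, row1}, {}};
}

// Usage: read one existing cell and probe one row that does not exist.
void runExample() {
    const Field table = makeSampleTable();
    assert(cellValue(table, 1, 0).value() == "Ink");
    assert(!cellValue(table, 2, 0).has_value());
}
```

Because every row has the same set of field types, the column index has the same meaning in every row, which is what makes addressing a cell by (row, column) workable in this model.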
15.08.2023 13:19:30