Pages array elements
Each element in the pages array describes a single document page. It contains arrays of its own that represent the texts, tables, images, barcodes, checkmarks, and separators detected on that page. The properties of each element store the coordinates of the corresponding region on the image. For text elements, the properties also store the confidence level, which indicates how likely it is that the element was recognized correctly.
If the document is exported to JSON but not to any of the image formats, the coordinates will be specified relative to the original image. If the document is exported to JSON and at the same time to one of the image formats, the coordinates will be specified relative to the preprocessed image that is exported.
All text elements are placed into either the texts array or the tables array.
Property | Data type | Description |
---|---|---|
width | integer | The width of the page in pixels. |
height | integer | The height of the page in pixels. |
rotated | string enum | Rotation of the page relative to the original image. |
texts | object array | Array of text blocks. |
tables | object array | Array of blocks containing tables. |
pictures | picture object array | Array of image blocks. |
barcodes | barcode object array | Array of barcode blocks. |
separators | object array | Array of separator blocks. |
checkmarks | object array | Array of checkmark blocks. |
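
As an illustration of how the page properties in the table above might be consumed, the following minimal Python sketch counts the blocks of each kind on every page. The top-level key `pages`, the helper name `summarize_pages`, and the sample data are assumptions made for this example; only the property names come from the table.

```python
import json

# Array-valued properties of a page element, as listed in the table above.
BLOCK_ARRAYS = ("texts", "tables", "pictures", "barcodes", "separators", "checkmarks")


def summarize_pages(document):
    """Return one summary dict per page: page size plus a count per block array."""
    summaries = []
    # Assumption: the pages array is stored under a top-level "pages" key.
    for index, page in enumerate(document.get("pages", [])):
        summary = {
            "page": index,
            "width": page.get("width"),
            "height": page.get("height"),
        }
        for name in BLOCK_ARRAYS:
            # Assumption of this sketch: a missing array is treated as empty.
            summary[name] = len(page.get(name, []))
        summaries.append(summary)
    return summaries


# Hypothetical, heavily trimmed sample; a real export contains more properties.
sample = json.loads("""
{"pages": [{"width": 2480, "height": 3508,
            "texts": [{}, {}], "tables": [{}], "barcodes": []}]}
""")

for row in summarize_pages(sample):
    print(row)
```

Treating a missing array as empty keeps the loop simple; whether the export omits empty arrays or always writes them is not stated here, so check that against real output.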
All objects that describe recognized text or images have a property called confidence, which indicates the likelihood of the text being recognized correctly.
Confidence is first calculated for individual characters. The confidence of a higher-level element is then derived from the confidence levels of the elements it contains.
A dedicated data type, confidence, is defined for this property. It is derived from the number data type.
The allowed values are from 0 to 100. A value of -1 indicates an element that does not contain any text data.
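
The value range described above could be handled as in the following Python sketch. The function name `classify_confidence`, the labels it returns, and the review threshold of 70 are hypothetical choices for this example, not part of the format; only the 0 to 100 range and the special value -1 come from the description.

```python
def classify_confidence(value, threshold=70):
    """Map a confidence value to a coarse label.

    -1 marks an element without text data; 0..100 is the regular range.
    The threshold is an arbitrary, illustrative cut-off.
    """
    if value == -1:
        return "no-text"
    if not 0 <= value <= 100:
        raise ValueError(f"confidence out of range: {value}")
    # Values below the threshold are flagged for manual review.
    return "review" if value < threshold else "ok"


# Hypothetical confidence values taken from three elements.
for value in (95, 42, -1):
    print(value, classify_confidence(value))
```

Checking for -1 before the range test matters: otherwise elements without text would be rejected as invalid rather than skipped.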
22.12.2023 12:36:42