Output file types
In ABBYY Vantage, processing and OCR results are available in JSON and XML formats.
For each processed document, Vantage generates a separate JSON or XML file, or a set of such files, depending on the skill applied to the document and, if a Process skill is used, on the Output activity settings.
The following table describes the types of output files available in Vantage. All of these file types are part of the ResultFileType enumerator in the Vantage API.
File type | Description |
---|---|
Json | Extracted data, such as field values, field structure, and rule check errors. |
Pdf | PDF document with or without a text layer. |
Text | Extracted text data in TXT format. |
Docx | Extracted text and image data in DOCX format. |
Xlsx | Extracted text and image data in XLSX format. |
OcrJson | Full-text recognition data in JSON format. |
FieldsJson | Simplified JSON file. Contains only field values and rule check errors. |
FieldPicture | Extracted image fields in JPG format. |
Xml | Extracted full-text recognition data in XML format. |
Tiff | Extracted image data in TIFF format. |
Jpeg | Extracted image data in JPG format. |
Csv | Extracted data values of repeating or non-repeating fields. |
Html | Extracted full-text recognition data in HTML format. |
Pptx | Extracted text and image data in PPTX format. |
Alto | Extracted full-text recognition data in XML format that corresponds to the ALTO standard, schema version 4.2. |
When working with the Vantage API, you can get information about output files of the processed transaction using the GET request. The file type is in the type property of each object in the resultFile array.
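As a rough illustration of reading the type property, the following Python sketch groups the result files of an already parsed transaction response by file type. It makes no network call. The payload is hand-written and hypothetical: the resultFile and type names come from the description above, while documents, id, fileId, fileName and the helper result_files_by_type are assumptions made for this example and should be checked against the actual API response.

```python
from collections import defaultdict

# "resultFile" and "type" follow the description above. "documents", "id",
# "fileId" and "fileName" are assumptions about the response shape.
RESULT_FILES_KEY = "resultFile"


def result_files_by_type(transaction):
    """Group the result file entries of a parsed transaction by their type."""
    grouped = defaultdict(list)
    for document in transaction.get("documents", []):
        for result_file in document.get(RESULT_FILES_KEY, []):
            # Entries without a type are collected under "Unknown".
            grouped[result_file.get("type", "Unknown")].append(result_file)
    return dict(grouped)


# Hypothetical payload standing in for the parsed body of the GET response.
sample_transaction = {
    "documents": [
        {
            "id": "doc-1",
            "resultFile": [
                {"fileId": "f1", "fileName": "invoice.json", "type": "Json"},
                {"fileId": "f2", "fileName": "invoice_fields.json", "type": "FieldsJson"},
            ],
        },
        {
            "id": "doc-2",
            "resultFile": [
                {"fileId": "f3", "fileName": "receipt.json", "type": "Json"},
            ],
        },
    ]
}

files = result_files_by_type(sample_transaction)
for file_type, entries in files.items():
    print(file_type, [entry["fileName"] for entry in entries])
```

Grouping by type makes it straightforward to pick, for example, only the FieldsJson files for downstream processing and ignore the other output types.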
For more information about file types, see Output activity and Image (FieldPicture).
12/22/2023 12:36:42 PM