Output file types

In ABBYY Vantage, processing and OCR results are available in JSON and XML formats.

For each processed document, Vantage generates a separate JSON or XML file, or a set of such files. Which files are produced depends on the skill applied to the document and, if a Process skill is used, on the Output activity settings.

The following table describes the types of output files available in Vantage. All of these file types are values of the ResultFileType enumeration in the Vantage API.

File type     Description
Json          Extracted data, such as field values, field structure, and rule check errors.
Pdf           PDF document with or without a text layer.
Text          Extracted text data in TXT format.
Docx          Extracted text and image data in DOCX format.
Xlsx          Extracted text and image data in XLSX format.
OcrJson       Full-text recognition data in JSON format.
FieldsJson    Simplified JSON file. Contains only field values and rule check errors.
FieldPicture  Extracted image fields in JPG format.
Xml           Extracted full-text recognition data.
Tiff          Extracted image data in TIFF format.
Jpeg          Extracted image data in JPG format.
Csv           Extracted data values of repeating or non-repeating fields.
Html          Extracted full-text recognition data in HTML format.
Pptx          Extracted text and image data in PPTX format.
Alto          Extracted full-text recognition data in XML format that corresponds to the ALTO standard, schema version 4.2.
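
Client code that consumes these values may find it convenient to model them as an enumeration. The following Python sketch is only an illustration: the member values are the type names listed in the table, while the class layout and the parse_result_file_type helper are hypothetical and not part of the Vantage API.

```python
from enum import Enum
from typing import Optional


class ResultFileType(Enum):
    """Client-side mirror of the file types listed in the table (illustrative only)."""
    JSON = "Json"
    PDF = "Pdf"
    TEXT = "Text"
    DOCX = "Docx"
    XLSX = "Xlsx"
    OCR_JSON = "OcrJson"
    FIELDS_JSON = "FieldsJson"
    FIELD_PICTURE = "FieldPicture"
    XML = "Xml"
    TIFF = "Tiff"
    JPEG = "Jpeg"
    CSV = "Csv"
    HTML = "Html"
    PPTX = "Pptx"
    ALTO = "Alto"


def parse_result_file_type(value: str) -> Optional[ResultFileType]:
    """Hypothetical helper: return the matching member, or None for an unknown value."""
    try:
        return ResultFileType(value)
    except ValueError:
        return None
```

Returning None for unrecognized values, rather than raising, lets a client keep working if a type appears that this list does not cover.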

When working with the Vantage API, you can get information about the output files of a processed transaction using the GET request. The file type is given in the type property of each object in the resultFile array.
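
As a rough illustration of reading the type property, the Python sketch below groups result files by type from an already parsed transaction response. It makes no HTTP call. The sample payload, the fileId property, the documents nesting, and both function names are assumptions made for this example; only the resultFile array name and the type property come from the text above, and the array name should be checked against a real response.

```python
from collections import defaultdict
from typing import Any, Dict, Iterator, List

# Array name as given in the text above; verify it against an actual API response.
RESULT_FILES_KEY = "resultFile"


def iter_result_files(node: Any) -> Iterator[Dict[str, Any]]:
    """Yield every object found in any array stored under RESULT_FILES_KEY.

    The search is recursive, so it does not assume where the array is nested.
    """
    if isinstance(node, dict):
        for key, value in node.items():
            if key == RESULT_FILES_KEY and isinstance(value, list):
                for item in value:
                    if isinstance(item, dict):
                        yield item
            else:
                yield from iter_result_files(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_result_files(item)


def group_by_type(transaction: Dict[str, Any]) -> Dict[str, List[Dict[str, Any]]]:
    """Group result file objects by the value of their "type" property."""
    grouped: Dict[str, List[Dict[str, Any]]] = defaultdict(list)
    for result_file in iter_result_files(transaction):
        grouped[result_file.get("type", "<missing>")].append(result_file)
    return dict(grouped)


# Hypothetical payload shaped only for this example.
sample = {
    "documents": [
        {"resultFile": [{"fileId": "a1", "type": "FieldsJson"},
                        {"fileId": "a2", "type": "Pdf"}]},
        {"resultFile": [{"fileId": "b1", "type": "FieldsJson"}]},
    ]
}

grouped = group_by_type(sample)
for file_type, files in sorted(grouped.items()):
    print(file_type, len(files))
```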

For more information about file types, see Output activity and Image (FieldPicture).

22.12.2023 12:36:42
