ABBYY FineReader Engine Glossary

This article includes basic terms, abbreviations and definitions that are used in ABBYY FineReader Engine help. Check out the list below when getting started with ABBYY FineReader Engine or learning more about a particular term.


ABBYY FineReader Engine

API

Binarization

Block

CJK

CLI

Confidence level

Confusion matrix

Container

Counter

DLL

Document analysis

Dpi

EULA

EXIF

Extraction

Hypothesis

ICAO

ICR

Library module

License module

Ligature

Loaders

MRC

OBR

OCR

OMR

PDFium

Preprocessing

Profile

Pt

Rasterization

Recognition

Region

Scenario

Synthesis

Twip

Virtual Machine

XFA

Workstation

ZUGFeRD

ABBYY FineReader Engine

A software development kit that allows software developers to create applications that extract textual information from paper documents or images. Also referred to as FineReader Engine, FRE, or FREngine.

API

Application Programming Interface. Learn more: Wikipedia.

Binarization or Adaptive Binarization

The process of converting a grayscale or color image into a black-and-white image. Adaptive binarization chooses the black/white threshold separately for each image fragment, which compensates for uneven brightness across the document during recognition.
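The idea of a per-fragment threshold can be illustrated with a minimal sketch. This is a generic local-mean method, not the algorithm ABBYY FineReader Engine uses; the function name `adaptive_binarize` and its `window` and `offset` parameters are assumptions made for this example.

```python
def adaptive_binarize(gray, window=3, offset=0):
    """Binarize a grayscale image given as a list of rows with values 0-255.

    Each pixel is compared with the mean brightness of its own
    neighborhood, so the threshold adapts to local lighting.
    """
    height = len(gray)
    width = len(gray[0]) if height else 0
    half = window // 2
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Collect the neighborhood, clipped at the image borders.
            neighbors = [
                gray[ny][nx]
                for ny in range(max(0, y - half), min(height, y + half + 1))
                for nx in range(max(0, x - half), min(width, x + half + 1))
            ]
            threshold = sum(neighbors) / len(neighbors) - offset
            # Darker than the local threshold -> black (0), otherwise white (255).
            row.append(0 if gray[y][x] < threshold else 255)
        result.append(row)
    return result


# A single dark pixel on a light background.
image = [
    [200, 200, 200],
    [200, 50, 200],
    [200, 200, 200],
]
binary = adaptive_binarize(image)
```

Because each threshold is computed from nearby pixels only, a dark character on a dim part of the page and a dark character on a bright part can both be separated from their backgrounds, which a single global threshold may fail to do.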

Block

An internal entity of the ABBYY FineReader Engine API that defines an area of the source image and its type (text, barcode, checkmark, etc.). Blocks can be marked manually or created automatically by document analysis. Blocks determine how and in what order the image areas are recognized and, after recognition, contain the recognition results. Learn more: Block, Working with Layout and Blocks.

CJK or CJK languages

Chinese (Simplified and Traditional), Japanese, and Korean languages. Learn more: Recognizing CJK Languages.

CLI

Command-line interface. To see how it is implemented in ABBYY FineReader Engine, you may use the appropriate code sample included in the distribution package. This code sample supports most of the ABBYY FineReader Engine API functions through numerous option keys.

Confidence level

The estimated probability that a particular recognition variant of a character is the correct one for its position in a text. Learn more: Using Voting API.

Confusion matrix or error matrix

A table representing the results of an algorithm used for object classification. Learn more: Wikipedia, ConfusionMatrix.
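As a general illustration (independent of the ConfusionMatrix object in the ABBYY FineReader Engine API), the sketch below counts how often each expected character was recognized as each character. The function name `build_confusion_matrix` and the sample strings are assumptions made for this example.

```python
def build_confusion_matrix(expected, recognized):
    """Count (expected, recognized) pairs for two equally long sequences.

    Returns a dict keyed by (true label, predicted label).
    """
    if len(expected) != len(recognized):
        raise ValueError("sequences must have the same length")
    matrix = {}
    for pair in zip(expected, recognized):
        matrix[pair] = matrix.get(pair, 0) + 1
    return matrix


# The letter "O" was once read as the digit "0",
# and the letter "l" was once read as the digit "1".
matrix = build_confusion_matrix("OO0l1l", "O00l11")
errors = {pair: n for pair, n in matrix.items() if pair[0] != pair[1]}
```

Entries whose two labels differ are the classification errors; entries on the diagonal (same label twice) are correct results.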

Container

A software unit that contains an entire runtime environment and is used to package an application with all its dependencies, ensuring stable operation in various computing environments. Learn more: Docker website.

Counter or License Counter

A license entity that limits the number of pages or characters that can be recognized and exported during a certain period.

DLL

Dynamic link library. The ABBYY FineReader Engine distribution package for Windows includes a set of .dll files to be integrated into the customer’s products. Learn more: Wikipedia.

Document analysis

The step of OCR responsible for detecting the elements of document structure and creating a layout that exposes a collection of blocks. Learn more: Document Analysis, Working with Layout and Blocks.

Dpi

Dots per inch. Learn more: Wikipedia.

EULA

End-User License Agreement. This document is included in the ABBYY FineReader Engine distribution package or can be accessed via online help. Learn more: Wikipedia.

EXIF

Exchangeable image file format. A standard describing specific data that comes along with an image or audio file captured by a digital camera (for example, GPS location, date/time, camera settings, etc.). Learn more: Wikipedia.

Extraction

The process of retrieving data from pictures or texts. This process applies to scenarios that recognize text, barcodes, fields, or machine-readable zones (MRZ). Extraction is also used to retrieve additional objects from images. Learn more: ObjectExtractionParams.

Hypothesis

A recognition variant of a single character or word in a text. Each hypothesis has a confidence level that is useful in situations when it is necessary to select the most appropriate variant among several. Learn more: Using Voting API.
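Selecting the most appropriate variant can be sketched as follows. The `Hypothesis` type, its field names, and the confidence values are hypothetical and only illustrate the concept; they are not the ABBYY FineReader Engine API.

```python
from typing import NamedTuple


class Hypothesis(NamedTuple):
    """One recognition variant (hypothetical type for illustration)."""
    text: str
    confidence: int  # assumed scale: a higher value means a more likely variant


def best_hypothesis(hypotheses):
    """Return the variant with the highest confidence level."""
    return max(hypotheses, key=lambda h: h.confidence)


# Three variants for the same character position.
variants = [
    Hypothesis("O", 62),
    Hypothesis("0", 87),
    Hypothesis("Q", 12),
]
chosen = best_hypothesis(variants)
```

An application might also compare the best confidence with a threshold of its own choosing and send low-confidence results to manual verification.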

ICAO

International Civil Aviation Organization. An organization that determines the standards and specifications for machine-readable travel documents. Learn more: ICAO website.

ICR

Intelligent Character Recognition. The technology used to recognize hand-printed characters that are written separately from one another. Such characters are typically detected in the fields, boxes, and frames of documents. Learn more: OCR and Other Recognition Technologies, Recognizing Handprinted Texts.

Library module

A set of License modules that together make up an ABBYY FineReader Engine function available to the user and determine the files required to perform this function.

License module

A licensing entity used for providing access to a certain functionality of ABBYY FineReader Engine. Each License module corresponds to a certain Library module to be installed. Learn more: Modules.

Ligature

A character formed by combining two or more characters. Learn more: Wikipedia.

Loaders or Engine loaders

The interfaces and objects for initializing the main Engine object of the ABBYY FineReader Engine API. There are several ways to load the Engine object on various operating systems:

Standalone Application
  • Windows: by using the standard InitializeEngine function, or by means of COM using InprocLoader
  • Linux: by using the standard InitializeEngine function
  • Mac: by using the standard InitializeEngine function

Server Solution
  • Windows: by means of COM using OutprocLoader
  • Linux: as an out-of-process server
  • Mac: not supported

To choose what is preferable for your scenario (single-threaded or multi-threaded application, in-process or out-of-process loading, working with GUI, etc.), see Differences between ABBYY FineReader Engine for Windows and Mac.

MRC

Mixed Raster Content. This technology can be applied to PDF (PDF/A) files and represents a document as three different layers: the first one is a foreground plane with pictures, the second one is a mask plane with the text and its coloring, and the third one is a background plane with background pictures or textures. Each layer is compressed separately using the best type of compression for that data type. Learn more: PDF Conversion.

OBR

Optical Barcode Recognition. The process of automatic detection, recognition, and identification of barcodes in an image. Learn more: OCR and Other Recognition Technologies, Barcode Types, Recognizing Barcodes.

OCR

Optical Character Recognition. The multi-step process of electronic conversion of images with handwritten, typewritten, or printed text (usually captured by a scanner) into machine-editable text. It includes preprocessing, document analysis, recognition, and synthesis. Learn more: OCR and Other Recognition Technologies.

OMR

Optical Mark Recognition or Checkmark Recognition. The process of automatic detection and recognition of checkmarks on the image or document region defined by the developer. Learn more: OCR and Other Recognition Technologies, Recognizing Checkmarks.

PDFium

A cross-platform library used for opening PDF files, converting them to images, or extracting attachments, fonts, and metadata from them. Learn more: Googlesource.

Preprocessing or Image Preprocessing

The process of improving the quality of document images for further recognition or archiving. Learn more: Image Preprocessing.

Profile

A set of ABBYY FineReader Engine parameters preset to reasonable defaults. Each profile is intended for a particular scenario of using ABBYY FineReader Engine. Learn more: Working with Profiles.

Pt

Point, or typographical point. A unit of measurement equal to 1/72 of an inch. Learn more: Wikipedia.

Rasterization

The process of converting an image into a raster image, that is, an image consisting of pixels, dots, or lines. Learn more: Wikipedia.

Recognition

The process of data extraction (texts, barcodes, checkmarks, etc.) from every block on an image to be converted into machine-editable information. Learn more: OCR and Other Recognition Technologies.

Region

An internal entity of ABBYY FineReader Engine formed around one or several document elements. A single region may contain one or several rectangles. Learn more: Region.
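The idea of a region built from several rectangles can be shown with a small sketch. The `Rect` type and the `region_contains` function are hypothetical and are not part of the ABBYY FineReader Engine API; the sketch also assumes that the right and bottom edges are exclusive.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Axis-aligned rectangle in pixel coordinates (hypothetical type)."""
    left: int
    top: int
    right: int   # assumed exclusive
    bottom: int  # assumed exclusive


def region_contains(rects, x, y):
    """Return True if the point (x, y) lies inside any rectangle of the region."""
    return any(r.left <= x < r.right and r.top <= y < r.bottom for r in rects)


# An L-shaped region made of two rectangles, for example a heading
# that spans the page width with a narrow column below it.
region = [
    Rect(0, 0, 100, 20),
    Rect(0, 20, 40, 60),
]
```

Combining rectangles in this way lets one region follow a non-rectangular document element that a single rectangle could not describe without also covering neighboring content.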

Scenario

A set of steps and recommendations most suitable for a certain task of document processing using ABBYY FineReader Engine. Learn more: Basic Usage Scenarios Implementation.

Synthesis

The step of OCR responsible for detecting the logical structure of a document (table of contents, text order, font styles, headings, etc.). This step can be skipped when exporting the recognition result to a TXT file or an image-only PDF.

Twip

A typographical unit of measurement equal to 1/20 of a typographical point, or 1/1440 of an inch. Learn more: Wikipedia.
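The relationship between twips, points (see Pt), inches, and pixels at a given resolution (see Dpi) can be worked through in a few lines. The function names below are assumptions made for this example.

```python
TWIPS_PER_POINT = 20
POINTS_PER_INCH = 72
TWIPS_PER_INCH = TWIPS_PER_POINT * POINTS_PER_INCH  # 20 * 72 = 1440


def twips_to_points(twips):
    """Convert twips to typographical points."""
    return twips / TWIPS_PER_POINT


def points_to_inches(points):
    """Convert typographical points to inches."""
    return points / POINTS_PER_INCH


def twips_to_pixels(twips, dpi):
    """Convert twips to pixels for an image with the given resolution in dpi."""
    return twips * dpi / TWIPS_PER_INCH


font_size_pt = twips_to_points(240)        # a 240-twip font size in points
half_inch_px = twips_to_pixels(720, 300)   # 720 twips on a 300 dpi image
```

For example, 240 twips is 12 pt, and 720 twips (half an inch) covers 150 pixels on a 300 dpi image.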

Virtual Machine

An emulation of a physical computer and its characteristics in a virtual environment. It can be accessed remotely or additionally installed on a computer. Learn more: Wikipedia.

XFA

XML Forms Architecture. An XML specification that describes the processing rules of interactive web forms with user-specified data. Learn more: Wikipedia.

Workstation

A machine on which the ABBYY FineReader Engine library is installed and used. To develop an application based on the library, you need a Developer License. To run a distributed application, a Runtime License is enough. Learn more: Licensing.

ZUGFeRD

Zentraler User Guide des Forum elektronische Rechnung Deutschland (German) or Central User Guide for Electronic Invoicing (English). A German standard for electronic invoicing that defines a format combining a PDF document (which must satisfy the PDF/A-3 standard) with structured XML data. Learn more: ZUGFeRD-compliant electronic invoices.

03.07.2024 8:50:10
