Dialog Box: TXT

This dialog box allows you to specify TXT format settings.

Set the parameters for saving recognized text in TXT format:

Option name Option description
Text settings group

Keep line breaks

Select this option if you want the original arrangement into lines to be retained in the TXT format; otherwise, the text will be formatted as a single line in the TXT file.

Insert page break character (#12) to separate pages

Select this option if you want the original page arrangement to be retained in TXT format. Pages are separated by the page break (form feed) character, character code 12.

Use blank line as paragraph separator

Select this option if you want the paragraphs to be separated by blank lines in the TXT file.

Keep original headers and footers

If this option is selected, original headers and footers will be preserved in the TXT file.

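The page break and paragraph separator options affect how a saved TXT file can be processed afterwards. The following is a minimal Python sketch, not part of the product: it assumes a file was saved with both "Insert page break character (#12) to separate pages" and "Use blank line as paragraph separator" selected, and the helper names split_pages and split_paragraphs are hypothetical. The exact placement of the page break character in real output may differ, so treat this as an illustration only.

```python
import re

FORM_FEED = "\x0c"  # character code 12, the page break character

def split_pages(text):
    """Split exported text into pages on the form feed character."""
    return text.split(FORM_FEED)

def split_paragraphs(page):
    """Split one page into paragraphs separated by one or more blank lines."""
    normalized = page.replace("\r\n", "\n")
    parts = re.split(r"\n(?:[ \t]*\n)+", normalized)
    return [p.strip("\n") for p in parts if p.strip()]

# Hypothetical sample text standing in for the contents of a saved TXT file.
sample = "First paragraph,\nsecond line.\n\nSecond paragraph.\x0cPage two text.\n"

for number, page in enumerate(split_pages(sample), start=1):
    print(number, split_paragraphs(page))
```

If a file ends with a page break character, split_pages returns an empty last page, which a caller may want to discard.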
Character encoding group

Encoding type

(drop-down list)

Specifies the encoding type of the output file in TXT format:

  • Simple
    Simple encoding, one byte per symbol.
  • Unicode UTF-16
    Native Unicode format where every symbol in the range U+0000-FFFF is represented by a two-byte sequence. In UTF-16, symbols above U+FFFF take four bytes.
  • Unicode UTF-8
    Unicode UTF-8 format. UTF-8 is a variable-length encoding that represents each Unicode symbol as a sequence of bytes: ASCII text (<=U+007F) remains unchanged as a single byte, U+0080-07FF (including Latin, Greek, Cyrillic, Hebrew, and Arabic) is converted to a 2-byte sequence, and U+0800-FFFF (Chinese, Japanese, Korean, and others) becomes a 3-byte sequence.
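The byte counts described for the two Unicode encodings can be checked with a short Python sketch. It uses only Python's built-in codecs and is independent of the product; the helper name encoded_sizes is hypothetical, and "utf-16-le" is used so that no byte order mark is counted.

```python
def encoded_sizes(ch):
    """Return (UTF-8 byte count, UTF-16 byte count without BOM) for one symbol."""
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-le"))

# One sample symbol per range: ASCII, Cyrillic, CJK, and a symbol above U+FFFF.
samples = ["A", "\u042f", "\u4e2d", "\U0001f600"]

for ch in samples:
    utf8_len, utf16_len = encoded_sizes(ch)
    print(f"U+{ord(ch):04X}: UTF-8 {utf8_len} byte(s), UTF-16 {utf16_len} byte(s)")
```

For text that is mostly ASCII, UTF-8 produces the smaller file, while text dominated by symbols in the U+0800-FFFF range is more compact in UTF-16.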

Code page

(drop-down list)

By default, the code page is detected automatically (the Automatic value). If necessary, you can select a specific code page from the list instead.
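With the Simple encoding, each byte in the file only has meaning relative to the code page used when saving, so a program that reads the file needs to decode it with the same code page. The Python sketch below illustrates this with the same six bytes decoded under two code pages. The helper name decode_with_code_page is hypothetical, and the mapping between entries in the Code page list and Python codec names such as "cp1251" is an assumption.

```python
def decode_with_code_page(raw, code_page):
    """Decode single-byte text using the given Python codec name."""
    return raw.decode(code_page)

# Six bytes that spell a Russian word when read as Windows-1251 (Cyrillic).
raw = bytes([0xCF, 0xF0, 0xE8, 0xE2, 0xE5, 0xF2])

for code_page in ("cp1251", "cp1252"):
    text = decode_with_code_page(raw, code_page)
    # !a prints an ASCII-safe representation, so the output does not depend
    # on the console encoding.
    print(f"{code_page}: {text!a}")
```

The first result contains Cyrillic letters and the second contains accented Latin letters, although the bytes are identical.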

See also

Output Format Settings Dialog Box

20.09.2022 9:27:51
