Working with complex-script languages
With Verification Station, you can recognize documents in Arabic, Hebrew, Yiddish, Thai, Chinese, Japanese, and Korean. Some additional factors must be taken into account when working with documents in Chinese, Japanese, or Korean (CJK) and with documents that combine CJK and European languages.
- Recommended fonts
- If non-European characters are not displayed in the Text pane
- Changing the direction of recognized text
Recommended fonts
Recognition of text in Arabic, Hebrew, Yiddish, Thai, Chinese, Japanese, and Korean may require additional fonts to be installed. The table below lists the recommended fonts for texts in these languages.
OCR languages | Recommended fonts |
Arabic | Arial™ Unicode™ MS |
Hebrew | Arial™ Unicode™ MS |
Yiddish | Arial™ Unicode™ MS |
Georgian | Arial™ Unicode™ MS, Sylfaen |
Thai | Arial™ Unicode™ MS, Aharoni, David, Levenim mt, Miriam, Narkisim, Rod |
Chinese (Simplified), Chinese (Traditional), Japanese, Korean, Korean (Hangul) | Arial™ Unicode™ MS; SimSun fonts such as SimSun (Founder Extended), SimSun-18030, NSimSun; Simhei; YouYuan; PMingLiU; MingLiU; Ming(for-ISO10646); STSong |
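The table above can also be expressed as a small lookup for checking a workstation before recognition starts. The following is a minimal sketch and not part of Verification Station: the names RECOMMENDED_FONTS and missing_fonts are hypothetical, and the set of installed font names is assumed to come from whatever font listing your operating system provides.

```python
# Recommended fonts per OCR language, copied from the table above.
# The dictionary and function names are illustrative, not a product API.
_CJK_FONTS = [
    "Arial Unicode MS", "SimSun (Founder Extended)", "SimSun-18030", "NSimSun",
    "Simhei", "YouYuan", "PMingLiU", "MingLiU", "Ming(for-ISO10646)", "STSong",
]
_THAI_FONTS = [
    "Arial Unicode MS", "Aharoni", "David", "Levenim mt",
    "Miriam", "Narkisim", "Rod",
]

RECOMMENDED_FONTS = {
    "Arabic": ["Arial Unicode MS"],
    "Hebrew": ["Arial Unicode MS"],
    "Yiddish": ["Arial Unicode MS"],
    "Georgian": ["Arial Unicode MS", "Sylfaen"],
    "Thai": _THAI_FONTS,
    "Chinese (Simplified)": _CJK_FONTS,
    "Chinese (Traditional)": _CJK_FONTS,
    "Japanese": _CJK_FONTS,
    "Korean": _CJK_FONTS,
    "Korean (Hangul)": _CJK_FONTS,
}


def missing_fonts(language, installed):
    """Return the recommended fonts for `language` that are not in `installed`.

    `installed` is any iterable of font family names; the comparison
    ignores case. Raises KeyError for a language not in the table.
    """
    installed_folded = {name.casefold() for name in installed}
    return [
        font
        for font in RECOMMENDED_FONTS[language]
        if font.casefold() not in installed_folded
    ]
```

For example, calling missing_fonts("Georgian", {"Sylfaen"}) reports that only Arial Unicode MS is absent. An empty result means every recommended font for that language is available.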
The sections below contain advice on improving recognition accuracy.
If non-European characters are not displayed in the Text pane
If text in a CJK language is displayed incorrectly in the Text pane, you may have selected Plain text mode, and the font used in that mode may not contain the required characters.
To change the font used in Plain text mode:
- Click Tools > Options... to open the Options dialog box.
- Click the Areas and Text tab.
- Select Arial Unicode MS from the Font used to display plain text drop-down list.
- Click OK.
If this does not help and the text in the Text pane is still displayed incorrectly, see Incorrect font is used or some characters are replaced with "?" or "□".
Changing the direction of recognized text
Verification Station detects text direction automatically, but you can also specify text direction manually.
- Activate the Text pane.
- Select one or more paragraphs.
- Click the button on the toolbar in the Text pane.
You can use the Direction of CJK text drop-down list in the Image pane to specify the direction of text prior to OCR. See also: Editing area properties.
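When deciding whether a paragraph needs its direction set manually, it can help to check which direction its characters imply. The sketch below is an illustration only and does not describe how Verification Station detects direction. It assumes a simple majority vote over Unicode bidirectional classes, and the function name guess_direction is hypothetical. It distinguishes only right-to-left from left-to-right text and says nothing about vertical CJK text.

```python
import unicodedata


def guess_direction(text):
    """Guess 'rtl' or 'ltr' from the bidirectional classes of the characters.

    Characters with class 'R' (e.g. Hebrew) or 'AL' (Arabic letters) count
    as right-to-left; characters with class 'L' count as left-to-right.
    Digits, spaces, and punctuation are ignored. Ties and empty input
    default to 'ltr'.
    """
    rtl = 0
    ltr = 0
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("R", "AL"):
            rtl += 1
        elif bidi_class == "L":
            ltr += 1
    return "rtl" if rtl > ltr else "ltr"
```

A paragraph for which this heuristic and the displayed direction disagree is a reasonable candidate for the manual steps above.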
3/26/2024 1:49:49 PM