Predefined Languages in ABBYY FineReader Engine
Here is the list of internal names of the predefined languages that are supported in ABBYY FineReader Engine. Whether a given predefined recognition language is available depends on whether the corresponding modules are present among the ABBYY FineReader Engine files. See the Installation section to find out which recognition languages correspond to which ABBYY FineReader Engine modules.
ABBYY FineReader Engine provides core recognition languages for OCR and ICR with full built-in dictionary support. Some recognition languages are available only for OCR, or do not have full built-in dictionary support. See details in the table below.
ABBYY FineReader Engine additionally provides a set of specific recognition languages. These languages contain special language units (addresses, date and time, human names, etc.). Such languages can be used for field recognition. See the list of special predefined languages for more information.
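The internal names listed in this section are the identifiers that client code passes to the Engine. The following is a minimal sketch of validating a selection of names before use. It is not part of the ABBYY API: the helper `build_language_string` and the set `KNOWN_INTERNAL_NAMES` are hypothetical, the set holds only a small subset of the names in this section, and the comma-separated output format is an assumption about how the names are later handed to the Engine.

```python
# Hypothetical helper, not part of ABBYY FineReader Engine.
# A small subset of the internal names listed in this section.
KNOWN_INTERNAL_NAMES = {
    "English", "German", "French", "Russian",
    "ChinesePRC", "Japanese", "Digits",
}


def build_language_string(names):
    """Validate internal names and join them with commas.

    Assumption: the consumer expects exact internal names joined by commas.
    """
    # Exact, case-sensitive matching is a choice made for this sketch.
    unknown = [name for name in names if name not in KNOWN_INTERNAL_NAMES]
    if unknown:
        raise ValueError("Unknown internal language names: " + ", ".join(unknown))
    # Drop duplicates while keeping the caller's order.
    unique = list(dict.fromkeys(names))
    return ",".join(unique)
```

For example, `build_language_string(["English", "German"])` returns `"English,German"`, while a display name such as `"Chinese Simplified"` raises `ValueError` because only the internal name `ChinesePRC` is accepted.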
Internal name | Recognition language | Can be used for OCR | Full dictionary support available | Can be used for ICR | Can be used for text-based classification* | Can be used for BCR |
Abkhaz | Abkhaz | + | ||||
Adyghe | Adyghe | + | ||||
Afrikaans | Afrikaans | + | + | |||
Agul | Agul | + | ||||
Albanian | Albanian | + | + | |||
Altaic | Altaic | + | ||||
Arabic | Arabic (Saudi Arabia) | + | + | ** | + | |
ArmenianEastern | Armenian (Eastern) | + | + | + | ||
ArmenianGrabar | Armenian (Grabar) | + | + | + | ||
ArmenianWestern | Armenian (Western) | + | + | + | ||
Awar | Avar | + | ||||
Aymara | Aymara | + | + | |||
AzeriCyrillic | Azerbaijani (Cyrillic) | + | ||||
AzeriLatin | Azerbaijani (Latin) | + | + | + | + | |
Bangla | Bangla | + | ||||
Bashkir | Bashkir | + | + | + | ||
Basic | Basic programming language | + | ||||
Basque | Basque | + | + | |||
Belarusian | Belarusian | + | ||||
Bemba | Bemba | + | + | |||
Blackfoot | Blackfoot | + | + | |||
Breton | Breton | + | + | |||
Bugotu | Bugotu | + | + | |||
Bulgarian | Bulgarian | + | + | + | + | |
Burmese | Burmese | + | ||||
Buryat | Buryat | + | + | |||
C++ | C/C++ programming language | + | ||||
Catalan | Catalan | + | + | + | ||
Chamorro | Chamorro | + | + | |||
Chechen | Chechen | + | ||||
Chemistry | Simple chemical formulas | + | ||||
ChinesePRC | Chinese Simplified | + | + | |||
ChineseTaiwan | Chinese Traditional | + | + | |||
Chukcha | Chukcha | + | ||||
Chuvash | Chuvash | + | ||||
CMC7 | For MICR (CMC-7) text type | + | ||||
COBOL | COBOL programming language | + | ||||
Corsican | Corsican | + | + | |||
CrimeanTatar | Crimean Tatar | + | + | |||
Croatian | Croatian | + | + | + | + | |
Crow | Crow | + | + | |||
Czech | Czech | + | + | + | + | + |
Danish | Danish | + | + | + | + | + |
Dargwa | Dargwa | + | ||||
Digits | Numbers | + | + | |||
Dungan | Dungan | + | ||||
Dutch | Dutch (Netherlands) | + | + | + | + | + |
DutchBelgian | Dutch (Belgium) | + | + | + | + | |
E13B | For MICR (E-13B) text type | + | ||||
English | English | + | + | + | + | + |
EskimoCyrillic | Eskimo (Cyrillic) | + | ||||
EskimoLatin | Eskimo (Latin) | + | ||||
Esperanto | Esperanto | + | ||||
Estonian | Estonian | + | + | + | + | + |
Even | Even | + | + | |||
Evenki | Evenki | + | + | |||
Faeroese | Faeroese | + | ||||
Farsi | Farsi | + | + | + | ||
Fijian | Fijian | + | + | |||
Finnish | Finnish | + | + | + | + | + |
Fortran | Fortran programming language | + | ||||
French | French | + | + | + | + | + |
Frisian | Frisian | + | + | |||
Friulian | Friulian | + | + | |||
GaelicScottish | Scottish Gaelic | + | + | |||
Gagauz | Gagauz | + | ||||
Galician | Galician | + | + | |||
Ganda | Ganda | + | + | |||
Georgian | Georgian*** | + | ||||
German | German | + | + | + | + | + |
GermanLuxembourg | German (Luxembourg) | + | + | |||
GermanNewSpelling | German (new spelling) | + | + | + | + | |
Greek | Greek | + | + | + | + | + |
Guarani | Guarani | + | + | |||
Hani | Hani | + | + | |||
Hausa | Hausa | + | ||||
Hawaiian | Hawaiian | + | + | |||
Hebrew | Hebrew | + | + | + | ||
Hungarian | Hungarian | + | + | + | + | + |
Icelandic | Icelandic | + | ||||
Ido | Ido | + | + | |||
Indonesian | Indonesian | + | + | + | + | + |
Ingush | Ingush | + | ||||
Interlingua | Interlingua | + | + | |||
Irish | Irish | + | + | |||
Italian | Italian | + | + | + | + | + |
Japanese | Japanese | + | + | + | + | |
JapaneseModern | Japanese (Modern) | + | + | + | + | |
Java | Java programming language | + | ||||
Kabardian | Kabardian | + | ||||
Kalmyk | Kalmyk | + | ||||
KarachayBalkar | Karachay-Balkar | + | + | |||
Karakalpak | Karakalpak | + | ||||
Kasub | Kasub | + | + | |||
Kawa | Kawa | + | + | |||
Kazakh | Kazakh | + | + | |||
Khakas | Khakas | + | ||||
Khanty | Khanty | + | ||||
Kikuyu | Kikuyu | + | ||||
Kirgiz | Kirghiz | + | + | |||
Kongo | Kongo | + | + | |||
Korean | Korean | + | + | + | + | |
KoreanHangul | Korean (Hangul) | + | + | + | ||
Koryak | Koryak | + | ||||
Kpelle | Kpelle | + | + | |||
Kumyk | Kumyk | + | + | |||
Kurdish | Kurdish | + | + | |||
Lak | Lak | + | ||||
Lappish | Sami (Lappish) | + | + | |||
Latin | Latin | + | + | + | + | |
Latvian | Latvian | + | + | + | + | |
LatvianGothic | Latvian language written in Gothic script | + | ||||
Lezgin | Lezgin | + | ||||
Lithuanian | Lithuanian | + | + | + | + | |
Luba | Luba | + | + | |||
Macedonian | Macedonian | + | ||||
Malagasy | Malagasy | + | + | |||
Malay | Malay | + | ||||
Malinke | Malinke | + | + | |||
Maltese | Maltese | + | ||||
Mansi | Mansi | + | ||||
Maori | Maori | + | + | |||
Mathematical | Mathematical | + | ||||
Mari | Mari | + | ||||
Maya | Maya | + | + | |||
Miao | Miao | + | + | |||
Minankabaw | Minangkabau | + | + | |||
Mohawk | Mohawk | + | + | |||
Mongol | Mongol | + | + | |||
Mordvin | Mordvin | + | + | |||
Nahuatl | Nahuatl | + | + | |||
Nenets | Nenets | + | + | |||
Nivkh | Nivkh | + | + | |||
Nogay | Nogay | + | + | |||
Norwegian | NorwegianNynorsk and NorwegianBokmal | + | + | + | + | + |
NorwegianBokmal | Norwegian (Bokmal) | + | + | + | + | + |
NorwegianNynorsk | Norwegian (Nynorsk) | + | + | + | + | + |
Nyanja | Nyanja | + | + | |||
Occidental | Occidental | + | ||||
OcrA | For OCR-A text type | + | ||||
OcrB | For OCR-B text type | + | ||||
Ojibway | Ojibway | + | + | |||
OldEnglish | Old English | + | + | + | + | |
OldFrench | Old French | + | + | + | + | |
OldGerman | Old German | + | + | + | + | |
OldItalian | Old Italian | + | + | + | + | |
OldSlavonic | Old Slavonic | + | ||||
OldSpanish | Old Spanish | + | + | + | + | |
Ossetic | Ossetian | + | ||||
Papiamento | Papiamento | + | + | |||
Pascal | Pascal programming language | + | ||||
PidginEnglish | Tok Pisin | + | + | |||
Polish | Polish | + | + | + | + | + |
PortugueseBrazilian | Portuguese (Brazil) | + | + | + | + | + |
PortugueseStandard | Portuguese (Portugal) | + | + | + | + | + |
Provencal | Provencal | + | ||||
Quechua | Quechua | + | + | |||
RhaetoRomanic | Rhaeto-Romanic | + | + | |||
Romanian | Romanian | + | + | + | + | |
RomanianMoldavia | Romanian (Moldavia) | + | + | |||
Romany | Romany | + | + | |||
Ruanda | Ruanda | + | + | |||
Rundi | Rundi | + | + | |||
RussianOldSpelling | Russian (old spelling) | + | + | + | ||
Russian | Russian | + | + | + | + | + |
RussianWithAccent | Russian (with accents marking stress position) | + | + | + | ||
Samoan | Samoan | + | + | |||
Selkup | Selkup | + | + | |||
SerbianCyrillic | Serbian (Cyrillic) | + | + | |||
SerbianLatin | Serbian (Latin) | + | + | |||
Shona | Shona | + | ||||
Sioux | Sioux (Dakota) | + | + | |||
Slovak | Slovak | + | + | + | + | |
Slovenian | Slovenian | + | + | + | + | |
Somali | Somali | + | + | |||
Sorbian | Sorbian | + | ||||
Sotho | Sotho | + | + | |||
Spanish | Spanish | + | + | + | + | + |
Sunda | Sunda | + | ||||
Swahili | Swahili | + | + | |||
Swazi | Swazi | + | + | |||
Swedish | Swedish | + | + | + | + | + |
Tabassaran | Tabassaran | + | ||||
Tagalog | Tagalog | + | + | |||
Tahitian | Tahitian | + | + | |||
Tajik | Tajik | + | + | |||
Tatar | Tatar | + | + | + | ||
Thai | Thai | + | + | + | ||
Tinpo | Jingpo | + | + | |||
Tongan | Tongan | + | + | |||
Tswana | Tswana | + | + | |||
Tun | Tun | + | + | |||
Turkish | Turkish | + | + | + | + | + |
Turkmen | Turkmen | + | ||||
TurkmenLatin | Turkmen (Latin) | + | + | |||
Tuvin | Tuvan | + | + | |||
Udmurt | Udmurt | + | ||||
UighurCyrillic | Uighur (Cyrillic) | + | ||||
UighurLatin | Uighur (Latin) | + | + | |||
Ukrainian | Ukrainian | + | + | + | + | + |
UzbekCyrillic | Uzbek (Cyrillic) | + | ||||
UzbekLatin | Uzbek (Latin) | + | + | |||
Vietnamese | Vietnamese | + | + | + | ||
Visayan | Cebuano | + | + | |||
Welsh | Welsh | + | ||||
Wolof | Wolof | + | + | |||
Xhosa | Xhosa | + | + | |||
Yakut | Yakut | + | ||||
Yiddish | Yiddish | **** | ||||
Zapotec | Zapotec | + | + | |||
Zulu | Zulu | + |
* The classifier which uses only image characteristics can be used for documents in any language. The text-based classifiers (ClassifierTypeEnum::CT_Combined, ClassifierTypeEnum::CT_Text) are only available for recognized documents in languages which have full dictionary support.
** Arabic ICR is not supported. However, handprinted Arabic digits can be recognized. See Recognizing Handprinted Arabic Digits.
*** The Nuskhuri and Mtavruli characters are recognized separately from each other, but both types of characters are saved in the Unicode strings for Nuskhuri.
**** A few standard characters (veys בֿ, pasekh alef אַ, komets alef אָ, pasekh tsvey yudn ײַ, melupm vov וּ) are not supported in the predefined Yiddish language. To recognize these characters, create a new custom language and add these characters to it using the LetterSet property of the TextLanguage object (see Working with Languages). Then set the new language as the recognition language and use the scenario described in Recognizing with Training and Training to recognize ligatures.
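To judge whether a custom Yiddish language is needed for a particular document set, it can help to scan a reference text for the five characters named in the last footnote. The sketch below uses only the Python standard library and does not call the ABBYY API. The function name `find_unsupported_yiddish` is hypothetical, and the code point sequences are the decomposed Unicode forms of the listed characters, which is an assumption about how the reference text is encoded after normalization.

```python
import unicodedata

# Decomposed forms (base letter + combining mark) of the characters
# that the footnote lists as unsupported in the predefined Yiddish language.
UNSUPPORTED_YIDDISH = {
    "\u05D1\u05BF": "veys",
    "\u05D0\u05B7": "pasekh alef",
    "\u05D0\u05B8": "komets alef",
    "\u05F2\u05B7": "pasekh tsvey yudn",
    "\u05D5\u05BC": "melupm vov",
}


def find_unsupported_yiddish(text):
    """Return the sorted names of the unsupported characters found in text."""
    # NFD splits precomposed presentation forms (for example U+FB2E)
    # into base letter + mark, so both encodings are matched.
    decomposed = unicodedata.normalize("NFD", text)
    return sorted(
        name for seq, name in UNSUPPORTED_YIDDISH.items() if seq in decomposed
    )
```

An empty result suggests the predefined Yiddish language may be enough for that text. This is a plain substring check, so a letter that carries an additional combining mark between the base letter and the listed mark would not be detected.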
9/17/2024 3:14:40 PM