Language Selection

The program can convert documents in over one hundred languages that use the Latin, Greek or Cyrillic alphabets, as well as Japanese, Chinese and Korean. For pages that have a text layer, the language selection usually matters little. It makes a real difference when OCR runs, that is, when the program has to handle image-only pages or text with non-standard encoding.

 

Here is a full list of the supported languages. Those with dictionary support are shown in bold:

Afrikaans, Albanian, Aymara, Basque, Bemba, Blackfoot, Breton, Bugotu, Bulgarian, Byelorussian, Catalan, Chamorro, Chechen, Corsican, Croatian, Crow, Czech, Danish, Dutch, English, Esperanto, Estonian, Faroese, Fijian, Finnish, French, Frisian, Friulian, Gaelic (Irish), Gaelic (Scottish), Galician, Ganda, German, Greek, Guarani, Hani, Hawaiian, Hungarian, Icelandic, Ido, Indonesian, Interlingua, Inuit, Italian, Kabardian, Kasub, Kawa, Kikuyu, Kongo, Kpelle, Kurdish, Latin, Latvian, Lithuanian, Luba, Luxembourgian, Malagasy, Malay, Malinke, Maltese, Maori, Mayan, Miao, Minankabaw, Mohawk, Moldavian, Nahuatl, Norwegian, Nyanja, Occidental, Ojibway, Papiamento, Pidgin English, Polish, Portuguese (Standard), Portuguese (Brazilian), Provencal, Quechua, Rhaetic, Romanian, Romany, Ruanda, Rundi, Russian, Sami, Sami Lule, Sami Northern, Sami Southern, Samoan, Sardinian, Serbian (Cyrillic), Serbian (Latin), Shona, Sioux, Slovak, Slovenian, Somali, Sorbian (Wend), Sotho, Spanish, Sundanese, Swahili, Swazi, Swedish, Tagalog, Tahitian, Tongan, Tswana, Tun, Turkish, Ukrainian, Visayan, Welsh, Wolof, Xhosa, Zapotec, Zulu, Japanese, Chinese (Simplified), Chinese (Traditional) and Korean.

For the best OCR accuracy, open the OCR Settings dialog box and select only the languages that your document contains. You can select more than one language, with the following limitations:

Tip

If your document exceeds these limitations, select the Automatic language detection check box instead.

Click Deselect All to clear the list and start your selection over.

 

The Automatic language detection option is designed for unattended processing. It analyses the content of each source document and assigns one language to each full page, choosing from those that appear in bold in the list above. When automatic detection is set, no other language choice is possible.

Tip

The Automatic language detection option is not designed to identify and separate different languages within a single page. The program can do this, but only with manual selection: select every language known to be in the document, and an internal detection facility will assign all text in the document to one of the selected languages.
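
The difference between the two modes can be pictured with a small Python sketch. It is only an illustration and does not show how the program detects languages internally. The tiny word lists, the scoring rule and the function names (`detect_page_language`, `assign_line_languages`) are all assumptions made for this example.

```python
# Toy illustration only: the word lists and scoring rule below are
# hypothetical and far simpler than any real OCR dictionary.
DICTIONARIES = {
    "English": {"the", "and", "is", "of", "page"},
    "French": {"le", "la", "et", "est", "de"},
    "German": {"der", "die", "und", "ist", "von"},
}


def score(words, language):
    """Count how many words appear in the language's word list."""
    vocab = DICTIONARIES[language]
    return sum(1 for word in words if word.lower() in vocab)


def detect_page_language(text, candidates=None):
    """Automatic mode: assign exactly one language to the whole text."""
    candidates = candidates or list(DICTIONARIES)
    words = text.split()
    return max(candidates, key=lambda lang: score(words, lang))


def assign_line_languages(page_text, selected):
    """Manual mode: assign each line to one of the selected languages."""
    return [
        (line, detect_page_language(line, selected))
        for line in page_text.splitlines()
        if line.strip()
    ]


page = "the cat is on the page and the mat\nle chat est ici"

# One language for the full page, even though it is bilingual.
print(detect_page_language(page))

# One language per line, restricted to the manually selected languages.
for line, language in assign_line_languages(page, ["English", "French"]):
    print(f"{language}: {line}")
```

In the sketch, the page-level call labels the whole bilingual page with a single language, while the line-level call separates the two languages. That mirrors why manual selection is the better choice for pages that mix languages.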