Language selection

When the program performs conversions, it can handle over one hundred languages that use the Latin, Greek and Cyrillic alphabets, as well as Japanese, Chinese and Korean. When converting pages with a text layer, the language selection is usually not so important. It makes a real difference when OCR runs, that is, when the program has to handle image-only pages or text with non-standard encoding.

Here is a full list of the supported languages. Those with dictionary support are shown in bold:

Afrikaans, Albanian, Aymara, Basque, Bemba, Blackfoot, Breton, Bugotu, Bulgarian, Byelorussian, Catalan, Chamorro, Chechen, Corsican, Croatian, Crow, Czech, Danish, Dutch, English, Esperanto, Estonian, Faroese, Fijian, Finnish, French, Frisian, Friulian, Gaelic (Irish), Gaelic (Scottish), Galician, Ganda, German, Greek, Guarani, Hani, Hawaiian, Hungarian, Icelandic, Ido, Indonesian, Interlingua, Inuit, Italian, Kabardian, Kasub, Kawa, Kikuyu, Kongo, Kpelle, Kurdish, Latin, Latvian, Lithuanian, Luba, Luxembourgian, Malagasy, Malay, Malinke, Maltese, Maori, Mayan, Miao, Minankabaw, Mohawk, Moldavian, Nahuatl, Norwegian, Nyanja, Occidental, Ojibway, Papiamento, Pidgin English, Polish, Portuguese (Standard), Portuguese (Brazilian), Provencal, Quechua, Rhaetic, Romanian, Romany, Ruanda, Rundi, Russian, Sami, Sami Lule, Sami Northern, Sami Southern, Samoan, Sardinian, Serbian (Cyrillic), Serbian (Latin), Shona, Sioux, Slovak, Slovenian, Somali, Sorbian (Wend), Sotho, Spanish, Sundanese, Swahili, Swazi, Swedish, Tagalog, Tahitian, Tongan, Tswana, Tun, Turkish, Ukrainian, Visayan, Welsh, Wolof, Xhosa, Zapotec, Zulu, Japanese, Chinese (Simplified), Chinese (Traditional) and Korean.

To achieve the best OCR accuracy, in the OCR Settings dialog box, select only the languages your document contains. Multiple language selection is allowed with the following limitations:

  • Select either a single Asian language, or one or more languages that use the Latin or Cyrillic alphabet.

  • Asian languages cannot be mixed with other languages.

  • If your document exceeds these limitations, select the Automatic language detection check box instead.

  • Click Deselect All to clear the list and start your selection over.
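
The selection rules above can be sketched as a small check. This is only an illustration of the rules as this page states them, not the program's own logic: the function name check_language_selection, the ASIAN set and the message texts are all hypothetical.

```python
# Hypothetical sketch of the selection rules described on this page.
# The names and messages below are illustrative assumptions only.
ASIAN = {"Japanese", "Chinese (Simplified)", "Chinese (Traditional)", "Korean"}


def check_language_selection(selected, automatic=False):
    """Return a list of rule violations; an empty list means the selection is allowed."""
    chosen = set(selected)

    if automatic:
        # When automatic detection is set, no other language choice is possible.
        if chosen:
            return ["no manual language choice is possible with automatic detection"]
        return []

    problems = []
    asian = chosen & ASIAN
    if len(asian) > 1:
        problems.append("select only a single Asian language")
    if asian and chosen - ASIAN:
        problems.append("Asian languages cannot be mixed with other languages")
    return problems


if __name__ == "__main__":
    print(check_language_selection(["English", "French", "Russian"]))
    print(check_language_selection(["Japanese", "English"]))
```

Under these assumptions, a selection such as English, French and Russian passes, while Japanese together with English is reported as a mix of Asian and other languages.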

The Automatic language detection option is designed for unattended processing. It analyses the content of each source document and assigns one language to each full page, selecting from those that appear in bold in the list above. When automatic detection is set, no other language choice is possible.

Note: The Automatic language detection option is not designed to identify and separate different languages within a single page. The program can do this if you select the languages manually: select all the languages known to be in the document, and all text in the document will be assigned to one of these languages, based on an internal detection facility.