RecoStar Page Recognition Profile Settings window

Use this window to set the properties for a RecoStar full text recognition profile.

Countries

Select one or more countries from the list to determine what country and language characters are supported by this recognition profile. The selected countries also correspond to the internal trigram mode and any existing dictionaries.

The default country option is determined by the Kofax Transformation Modules user interface language when the project is first created. For example, if the interface is set to German when the project is created, the default country option is Germany. Changing the application language after the project is created does not affect the default country.

Languages

Only use additional recognition options in a multi-lingual setting when the text in your documents corresponds to the Machine print type and the Alphanumeric content type.

Note The (optional) number in brackets helps you monitor the number of selected countries and/or languages. This is especially important if part of your selection is hidden and you need to scroll to see it.

General Settings

This group can help improve recognition results with the following options:

Image PreProcessing

Select one of the predefined image processing definitions. The predefined definitions include several combinations of items to be removed before OCR is performed. The value for this option is set to Remove Shading, Dots, Lines, Punch holes by default.

Word separation characters

Use this field to define what characters may separate words. The value for this option is set to /:()-# (forward slash, colon, open and close parentheses, hyphen, pound) by default.
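As an illustration only, the effect of the separator list can be sketched in Python. This is not the RecoStar implementation: the function name split_words is hypothetical, and treating whitespace as a separator in addition to the configured characters is an assumption.

```python
import re

# Default separator characters quoted in this help topic.
DEFAULT_SEPARATORS = "/:()-#"


def split_words(text, separators=DEFAULT_SEPARATORS):
    """Split text on whitespace (assumed) and on any configured separator character."""
    # re.escape keeps characters such as '-' and ')' literal inside the character class.
    pattern = "[" + re.escape(separators) + r"\s]+"
    # Drop the empty strings produced by leading or trailing separators.
    return [word for word in re.split(pattern, text) if word]


if __name__ == "__main__":
    print(split_words("Invoice#12345 (Ref:A-17/2020)"))
```

Removing a character from the list keeps it inside the word. For example, calling split_words with separators="/:()#" leaves "A-17" as one word.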

Correct split numbers

Select this option to automatically combine numbers or numeric words that are close together but recognized as separate words. For example, if the engine reads "12" and "00," as two words less than half a space apart, this option results in a single combined word of "12.00". This option is selected by default.
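The merging rule can be sketched roughly as follows. This is a simplified model, not the engine's actual logic: the Word class, the is_numeric_like test, and the space_width parameter are all hypothetical, and the sketch simply concatenates the two texts, because this topic does not describe how the engine normalizes separators when it joins words.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    left: float   # x coordinate of the left edge (hypothetical units)
    right: float  # x coordinate of the right edge


def is_numeric_like(text):
    """True if text contains at least one digit and only digits, '.' or ','."""
    return any(c.isdigit() for c in text) and all(c.isdigit() or c in ".," for c in text)


def merge_split_numbers(words, space_width):
    """Merge neighbouring numeric words whose gap is less than half a space."""
    merged = []
    for word in words:
        prev = merged[-1] if merged else None
        if (prev is not None
                and is_numeric_like(prev.text)
                and is_numeric_like(word.text)
                and word.left - prev.right < space_width / 2):
            # Assumption: join by plain concatenation and span both boxes.
            merged[-1] = Word(prev.text + word.text, prev.left, word.right)
        else:
            merged.append(word)
    return merged


if __name__ == "__main__":
    line = [Word("Total", 0, 50), Word("12.", 60, 80), Word("00", 83, 100)]
    print([w.text for w in merge_split_numbers(line, space_width=10)])
```

In this model, "12." and "00" are 3 units apart, which is less than half of the assumed space width of 10, so they are combined. "Total" is left alone because it is not numeric.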

Print Type

This group helps improve recognition results by letting you specify which print type is expected. Select one of the following:

  • Unknown. This is the default value for this option.

  • Fixed.

  • Hand print.

  • Machine print.

Trigram Mode

Select one of the following trigram modes:

  • Off. This is the default value for this option.

  • Check.

  • Repair.

Logical context

Select this option if you want to enable the trigram feature to resolve uncertain characters on the basis of their logical context. This option is selected by default.

The Dictionary group enables you to select a dictionary to help with recognition. Sample dictionaries are provided by Kofax Transformation Modules, and you can also create your own. Click the button to the right to browse to the directory where the dictionaries are stored. Click <None> to remove the dictionary from your recognition profile. The value for this option is set to <None> by default, so no dictionary is selected.

Important To be able to use extended sample dictionaries modified in a project created in an earlier Kofax Transformation Modules version, you need to copy these files from <Install Directory>\Program Files\Common Files\Kofax\RecoStar40b\Dict to the ..\RecoStar78\Dict directory.

Definitions for the buttons at the bottom of this window can be found in Common Project Builder Buttons.