Recognition Profiles Window - Enhanced OCR Full Text Engine

Use this window to select settings for the Kofax Enhanced OCR Full Text recognition profile.

Name

Use the list to select a recognition profile. The other settings on the window are refreshed with the settings defined for the selected profile.

Engine

Enhanced OCR Full Text is the default setting.

Languages

This box shows a single language, or a list of multiple languages separated by semicolons. The box scrolls to display all the selected languages. You cannot select languages here; the box is for information only.

Select button

Use this button to select languages from the Recognition Languages window.

Mark and Spell

Use these settings to specify the minimum level of confidence to accept for character recognition. Characters that do not meet the minimum level are marked with the mark flag.

General

Use this setting to select from three levels of confidence. The default level is Medium. The other choices are Low and High.

A setting of Low accepts a lower level of recognition confidence, which results in fewer mark flags.

A setting of Medium requires a moderate level of recognition confidence, which results in more mark flags than the Low setting.

A setting of High requires a greater degree of recognition confidence, which may result in more mark flags than the other levels.

Specific

Use this setting to specify a precise level of confidence ranging from 0 to 100.

The level value for Low is 75.

The level value for Medium is 85.

The level value for High is 95.
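
As a rough illustration of how the General levels relate to the Specific values, the following Python sketch maps each level to its value and flags every character whose confidence falls below that value. The function name, the (character, confidence) input pairs, the ^ flag character, and the placement of the flag in front of the character are assumptions made for this example; they are not part of Kofax Capture.

```python
# Level values as listed for the Specific setting.
CONFIDENCE_LEVELS = {"Low": 75, "Medium": 85, "High": 95}


def mark_low_confidence(chars, level="Medium", mark_flag="^"):
    """Return text with mark_flag in front of each low-confidence character.

    chars: list of (character, confidence) pairs, confidence from 0 to 100.
    level: "Low", "Medium", "High", or an integer (the Specific setting).
    Hypothetical helper; flag placement is an assumption.
    """
    threshold = CONFIDENCE_LEVELS[level] if isinstance(level, str) else level
    if not 0 <= threshold <= 100:
        raise ValueError("confidence level must be between 0 and 100")
    out = []
    for ch, confidence in chars:
        if confidence < threshold:  # character does not meet the minimum
            out.append(mark_flag)
        out.append(ch)
    return "".join(out)


# Made-up confidences for the word "Kofax".
sample = [("K", 99), ("o", 90), ("f", 80), ("a", 97), ("x", 70)]
print(mark_low_confidence(sample, "Low"))     # Kofa^x
print(mark_low_confidence(sample, "Medium"))  # Ko^fa^x
print(mark_low_confidence(sample, "High"))    # K^o^fa^x
```

With the same input, raising the level from Low to High increases the number of flagged characters, which matches the behavior described for the General setting.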

Spell check and Spell check flag

Select the Spell check option to compare unrecognized words with the entries in a dictionary of known values. If a word does not match, a spell check flag is inserted in front of it. By default, the dictionary that corresponds to the currently selected language is used. You can also specify a custom dictionary by selecting a text file on the OCR tab of the Document Class Properties window.

Use the Spell check flag setting to specify the character that marks words not found in a dictionary. You can specify only a single character. When spell checking is enabled and a spell check flag is specified, the following applies:

  • For a single language: All words are flagged when the selected language does not have a dictionary.

  • For multiple languages: Only words not found in the dictionary are flagged. If the selected language does not have a dictionary, then the word is not flagged.
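
The two cases above can be read as a small decision rule. The Python sketch below is one possible interpretation, not the engine's actual logic: the function name and the dictionary layout are hypothetical, and for multiple languages it assumes that a word is flagged only when it is missing from every dictionary that exists for the selected languages.

```python
def should_spell_flag(word, selected_languages, dictionaries):
    """Decide whether a word receives the spell check flag.

    selected_languages: list of language names.
    dictionaries: maps a language name to a set of known lowercase words;
                  a language without a dictionary is simply absent.
    Hypothetical helper that illustrates the documented rule.
    """
    available = [dictionaries[lang] for lang in selected_languages
                 if lang in dictionaries]
    if len(selected_languages) == 1:
        # Single language: without a dictionary, every word is flagged.
        if not available:
            return True
        return word.lower() not in available[0]
    # Multiple languages: a language without a dictionary never causes a flag.
    if not available:
        return False
    # Assumption: flag only if the word is missing from all available dictionaries.
    return all(word.lower() not in known for known in available)


# "NoDictA" and "NoDictB" stand for languages that have no dictionary.
dicts = {"English": {"invoice", "total"}}
print(should_spell_flag("invoice", ["English"], dicts))             # False
print(should_spell_flag("invoise", ["English"], dicts))             # True
print(should_spell_flag("invoice", ["NoDictA"], dicts))             # True
print(should_spell_flag("invoice", ["NoDictA", "NoDictB"], dicts))  # False
```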

If a word is flagged twice, once with the spell check flag and once with the confidence mark flag, the spell check flag comes first, followed immediately by the confidence mark flag. However, if the two flags are set to the same character (for example, ^), both flags are represented by a single character. This is the default behavior.
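
The ordering and merging of the two flags can be illustrated with a short Python sketch. The function name and parameters are hypothetical, and the ~ character is only an example of a spell check flag that differs from the mark flag.

```python
def flag_prefix(spell_flagged, confidence_flagged, spell_flag, mark_flag):
    """Build the flag characters that precede a word.

    The spell check flag comes first, then the confidence mark flag.
    If both apply and are the same character, only one character is used.
    Hypothetical helper for illustration only.
    """
    prefix = ""
    if spell_flagged:
        prefix += spell_flag
    if confidence_flagged and not (spell_flagged and mark_flag == spell_flag):
        prefix += mark_flag
    return prefix


print(flag_prefix(True, True, "~", "^") + "invoise")   # ~^invoise
print(flag_prefix(True, True, "^", "^") + "invoise")   # ^invoise
print(flag_prefix(False, True, "~", "^") + "invoice")  # ^invoice
```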

Advanced button

Click this button to specify advanced recognition settings. The Enhanced OCR Recognition Settings window appears.

General tab

Use this tab to adjust settings related to image resolution, the Zone type, and the Text type.

Resolution enhancement

Select one of the following options to enhance the original image resolution and improve the recognition quality.

  • No: No resolution enhancement is applied.

  • Yes: Enhancement is applied with a ratio of 2.0.

  • Legacy: The ratio of enhancement depends on the resolution of the non-BW (black-and-white) image. If the resolution is 160 DPI or less, a ratio of 2.0 is applied. If the resolution is between 160 and 210 DPI, a ratio of 1.5 is applied.

  • Standard: The ratio of enhancement is detected dynamically. It depends on the resolution and the detected average character size of the non-BW image. If the resolution exceeds 210 DPI, enhancement is not applied.

  • Auto (default): The ratio of enhancement depends on the current operation setting, which can be configured to optimize accuracy or performance.
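
The Legacy rule can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical, the ratios are treated as the scale factors 2.0 and 1.5, 210 DPI is treated as an inclusive upper bound, and a ratio of 1.0 (no enhancement) is assumed above 210 DPI because the Legacy description does not state what happens in that range.

```python
def legacy_enhancement_ratio(dpi):
    """Return the assumed Legacy enhancement ratio for a non-BW image.

    Hypothetical helper; thresholds follow the Legacy description.
    """
    if dpi <= 0:
        raise ValueError("resolution must be a positive DPI value")
    if dpi <= 160:
        return 2.0
    if dpi <= 210:
        return 1.5
    return 1.0  # assumption: no enhancement above 210 DPI


for dpi in (100, 160, 200, 300):
    print(dpi, legacy_enhancement_ratio(dpi))
```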

Zone type

Select one of the following options to define the zone type to be recognized.

  • Text (default): The recognition engine automatically recognizes the page/zone as text.

  • Table: The recognition engine automatically recognizes the page/zone as a table. This setting can impact the output of OCR Full Text. For example, for CSV export, the output can be filled with empty strings for the separated values.

  • Picture: If selected, the recognition engine finds graphical zones automatically.

  • Single word: The recognition engine limits the result to one word only.

Text type

You can define the text type to be recognized.

Operation tab

You can define the balance between performance and accuracy for the recognition engine.

Output button

Click this button to display the Enhanced OCR Output Format window, which is used to specify output options for a variety of output file types.

Image Cleanup

Profiles

Select an image cleanup profile from the list.

Edit button

To modify an existing image cleanup profile or create a new one, click the Edit button. The Image Cleanup Profiles window appears, and you can specify the type of image cleanup you want to use, along with other advanced settings.

Delete button

Click this button to delete the currently selected profile. It is not possible to delete profiles that are built in to Kofax Capture.

Script button

If enabled, use this button to assign a recognition script to the selected profile. The Recognition Script window appears, and you can associate a recognition script with the recognition profile.

Test button

This button is not available for this recognition profile.