Recognition Profiles Window - Enhanced OCR Full Text Engine
Use this window to select settings for the Kofax Enhanced OCR Full Text recognition profile.
Name
Use the list to select a recognition profile. The other settings on the window are refreshed with the settings defined for the selected profile.
Engine
Enhanced OCR Full Text is the default setting.
Languages
This box contains a single language, or a list of multiple languages separated by semicolons. The box can scroll to display all the selected languages. You cannot select languages here; it is for informational purposes only.
Select button
Use this button to select languages from the Recognition Languages window.
Mark and Spell
Use these settings to specify the minimum level of confidence to accept for character recognition. Characters that do not meet the minimum level are marked with the mark flag.
- General
- Use this setting to select one of three levels of confidence: Low, Medium, or High. The default level is Medium.
A setting of Low indicates a lower level of recognition confidence, which results in fewer mark flags.
A setting of Medium indicates a moderate level of recognition confidence, which results in more mark flags than the Low setting.
A setting of High indicates that you require a greater degree of recognition confidence, which may result in more mark flags than the other levels.
- Specific
- Use this setting to specify a precise level of confidence ranging from 0 to 100. The General levels correspond to the following values:
The level value for Low is 75.
The level value for Medium is 85.
The level value for High is 95.
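The relationship between the General and Specific settings can be illustrated with a minimal Python sketch. Only the three level values (75, 85, 95) come from this page. The function name, the input format, the default ^ flag character, and the placement of the mark flag directly in front of each low-confidence character are assumptions made for illustration, not a description of the engine's internals.

```python
# Level values as documented above; everything else is illustrative.
LEVELS = {"Low": 75, "Medium": 85, "High": 95}

def mark_text(chars, level="Medium", mark_flag="^"):
    """Hypothetical helper: chars is a list of (character, confidence) pairs,
    with confidence on a 0-100 scale. level is a General level name or a
    Specific numeric value."""
    threshold = LEVELS[level] if isinstance(level, str) else int(level)
    out = []
    for ch, conf in chars:
        if conf < threshold:
            # Assumption: the flag is written directly before the character.
            out.append(mark_flag)
        out.append(ch)
    return "".join(out)

sample = [("c", 99), ("a", 80), ("t", 90)]
print(mark_text(sample, "Low"))
print(mark_text(sample, "Medium"))
print(mark_text(sample, "High"))
```

With the same input, raising the level from Low to High produces more mark flags, which matches the behavior described for the General setting.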
Spell check and Spell check flag
Select the Spell check option to have unrecognized words compared to entries in a dictionary of known values. If they do not match, a spell check flag is inserted in front of the non-matching word. A default dictionary is used corresponding to the currently selected language. You can also specify a custom dictionary by selecting a text file in the Document Class Properties window OCR tab.
Use the Spell check flag setting to specify the character used to indicate words that are not found in a dictionary. You can specify only a single character. Note that the following occurs when you have enabled and specified a spell check flag:
- For a single language: All words are flagged when the selected language does not have a dictionary.
- For multiple languages: Only words not found in the dictionary are flagged. If the selected language does not have a dictionary, then the word is not flagged.
If a word is flagged twice, once with the spell check flag and once with the confidence mark flag, the spell check flag comes first, followed immediately by the confidence mark flag. However, if the two flags are set to the same character (for example, ^), both flags are represented by a single character. This is the default behavior.
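The flagging rules above can be sketched in Python under stated assumptions. Both function names and the dictionary representation (a mapping from language name to a set of known words) are hypothetical. For multiple languages, the sketch assumes a word is flagged only when at least one selected language has a dictionary and none of the available dictionaries contains the word; this page does not spell out that detail.

```python
def needs_spell_flag(word, languages, dictionaries):
    """Hypothetical helper. dictionaries maps a language name to a set of
    known words; a language missing from the mapping has no dictionary."""
    available = [dictionaries[lang] for lang in languages if lang in dictionaries]
    if len(languages) == 1:
        if not available:
            return True  # single language without a dictionary: all words flagged
        return word not in available[0]
    # Multiple languages: languages without a dictionary never cause a flag.
    if not available:
        return False
    return all(word not in known for known in available)

def flag_prefix(spell_flagged, mark_flagged, spell_flag="^", mark_flag="^"):
    """Build the flag text placed in front of a word."""
    if spell_flagged and mark_flagged:
        if spell_flag == mark_flag:
            return spell_flag          # identical flags collapse to one character
        return spell_flag + mark_flag  # spell check flag first, then mark flag
    if spell_flagged:
        return spell_flag
    if mark_flagged:
        return mark_flag
    return ""

english = {"English": {"cat"}}
print(needs_spell_flag("cta", ["English"], english))
print(flag_prefix(True, True, spell_flag="~", mark_flag="^") + "cta")
```

Choosing different characters for the two flags makes it possible to tell from the output which check a word failed.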
Advanced button
Click this button to specify advanced recognition settings. The Enhanced OCR Recognition Settings window appears.
General tab
Use this tab to adjust settings related to image resolution, the Zone type, and the Text type.
- Resolution enhancement
- Select one of the following options to enhance the original image resolution and improve the recognition quality.
Resolution enhancement option | Description
No | No resolution enhancement is applied.
Yes | Enhancement is applied with a ratio of 2.0.
Legacy | The ratio of enhancement depends on the resolution of the non-black-and-white (non-BW) image. If the resolution is 160 DPI or less, a ratio of 2.0 is applied. If the resolution is between 160 and 210 DPI, a ratio of 1.5 is applied.
Standard | The ratio of enhancement is detected dynamically: it depends on the resolution and the detected average character size of the non-BW image. If the resolution exceeds 210 DPI, the enhancement is not applied.
Auto (default) | The ratio of enhancement depends on the current Operation setting, which can be configured to optimize accuracy or performance.
- Zone type
- Select one of the following options to define the zone type to be recognized.
Zone type option | Description
Text (default) | The recognition engine automatically recognizes the page/zone as text.
Table | The recognition engine automatically recognizes the page/zone as a table. This setting can impact the output of OCR Full Text. For example, for CSV export, the output can be filled with empty strings for the separated values.
Picture | If selected, the recognition engine finds graphical zones automatically.
Single word | The recognition engine limits the result to one word only.
- Text type
- You can define the text type to be recognized.
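As an illustration of the Resolution enhancement options on this tab, the following Python sketch maps an option and an image resolution to a scaling ratio. It reads the documented ratios as 2.0 and 1.5. The function name, the use of 1.0 to mean "no enhancement", the treatment of exactly 210 DPI as part of the 1.5 range, and the Legacy result above 210 DPI are assumptions, since this page does not state them. Standard and Auto are determined dynamically by the engine and are not modeled.

```python
def enhancement_ratio(option, dpi):
    """Hypothetical helper: return the scaling ratio for a non-BW image,
    1.0 for no enhancement, or None when the engine decides dynamically."""
    if option == "No":
        return 1.0
    if option == "Yes":
        return 2.0
    if option == "Legacy":
        if dpi <= 160:
            return 2.0
        if dpi <= 210:
            return 1.5
        return 1.0  # assumption: not stated for Legacy above 210 DPI
    if option in ("Standard", "Auto"):
        return None  # depends on character size or the Operation setting
    raise ValueError(f"unknown resolution enhancement option: {option}")

for dpi in (150, 200, 300):
    print(dpi, enhancement_ratio("Legacy", dpi))
```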
Operation tab
You can define the balance between performance and accuracy for the recognition engine.
Output button
Click this button to display the Enhanced OCR Output Format window, which is used to specify output options for a variety of output file types.
Image Cleanup
- Profiles
- Select an image cleanup profile from the list.
- Edit button
- To modify an existing image cleanup profile or create a new one, click the Edit button. The Image Cleanup Profiles window appears, and you can specify the type of image cleanup you want to use, along with other advanced settings.
Delete button
Click this button to delete the currently selected profile. It is not possible to delete profiles that are built in to Kofax Capture.
Script button
If enabled, use this button to assign a recognition script to the selected profile. The Recognition Script window appears, and you can associate a recognition script with the recognition profile.
Test button
This button is not available for this recognition profile.