Advanced OCR Output Format Window - Rich Text Format

Use this window to set the output format options for the Rich Text Format (.rtf) file generated by the Kofax Advanced OCR Full Text recognition engine.

Output format

Changing the output format may enable or disable other options in this window. The settings of disabled options are retained, so if you later return to a format that uses them, your most recent settings are restored.

You can select an output format from this list:

  • Plain Text (*.txt)

  • Rich Text Format (*.rtf)

  • HTML (*.mht)

  • Microsoft Word (*.doc)

  • Comma-Separated Values (*.csv)

  • Microsoft Excel (*.xls)

  • Microsoft Word 2007 and later (*.docx)

  • Microsoft Excel 2007 and later (*.xlsx)

Page Layout

Select the page layout characteristics to be used when exporting to the output format. You can select from the following options:

  • Full-Page Layout: Document layout is retained in full.

  • Retain Paragraphs and Fonts: Recognized text is formatted into a single column. Paragraphs, font types, and font sizes are retained.

  • Retain Paragraphs Only: Recognized text is formatted into a single column. Frames are not used. Paragraphs are retained; however, font types and sizes are not retained.

Text Settings

Select the text attributes (Bold, Italic, Underline) to retain when the recognized data is saved to the output file. For example, to retain characters that are bold in the original document, select the Bold setting. Unselected text attributes are ignored.
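The effect of these settings can be pictured as a simple intersection: an attribute reaches the output only if it was recognized on the text and its setting is selected. The following is a minimal sketch in Python. The names `SUPPORTED_ATTRIBUTES` and `retained_attributes` are hypothetical and used for illustration only; they are not part of the Kofax product or any Kofax API.

```python
# Hypothetical illustration; these names are not part of the Kofax product.
SUPPORTED_ATTRIBUTES = frozenset({"bold", "italic", "underline"})


def retained_attributes(recognized: set[str], selected: set[str]) -> set[str]:
    """Return the attributes of a text run that are kept in the output.

    An attribute survives only if it was recognized on the text run and
    its setting is selected; everything else is ignored.
    """
    unknown = selected - SUPPORTED_ATTRIBUTES
    if unknown:
        raise ValueError(f"unsupported attributes: {sorted(unknown)}")
    return recognized & selected


# A run recognized as bold and italic, with only the Bold setting selected.
print(retained_attributes({"bold", "italic"}, {"bold"}))  # {'bold'}
```

In this sketch the italic attribute is dropped because its setting is not selected, which mirrors the rule that unselected text attributes are ignored.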

Suppress line breaks

Select this check box to suppress (discard) line breaks in the original document when the recognized data is saved. If the check box is cleared, the line breaks are retained.

Use page break as page separator

Select this check box to use page breaks in the original document as page separators when the recognized data is saved. If the check box is cleared, the page breaks are ignored.
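As a rough illustration of how the two check boxes above interact, the following Python sketch assembles recognized pages into one text. It rests on several assumptions that the product documentation does not confirm: a suppressed line break is replaced by a single space, a form feed character stands in for a page separator, and an ignored page break becomes an ordinary line break. The names `PAGE_SEPARATOR` and `assemble_text` are hypothetical.

```python
# Hypothetical illustration; these names are not part of the Kofax product.
PAGE_SEPARATOR = "\f"  # assumption: a form feed stands in for a page break


def assemble_text(pages, suppress_line_breaks, use_page_breaks):
    """Join recognized pages, each given as a list of text lines.

    Assumptions: a suppressed line break becomes a single space, and an
    ignored page break becomes an ordinary line break.
    """
    line_joiner = " " if suppress_line_breaks else "\n"
    page_joiner = PAGE_SEPARATOR if use_page_breaks else "\n"
    return page_joiner.join(line_joiner.join(lines) for lines in pages)


pages = [["first line", "second line"], ["next page"]]

# Both check boxes selected.
print(repr(assemble_text(pages, True, True)))
# 'first line second line\x0cnext page'

# Both check boxes cleared.
print(repr(assemble_text(pages, False, False)))
# 'first line\nsecond line\nnext page'
```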

Retain text color

Select this check box to retain the color of the text in the original document when the recognized data is saved. If the check box is cleared, the original color is ignored.

Picture Settings

Use these settings to manage pictures in the output file.

Remove pictures

Select this check box to remove any pictures that belong to a page from the output file.

Resolution

Specify the resolution of the pictures in the output file. You can select one of the following output resolutions, in dots per inch (dpi):

  • 72

  • 96

  • 120

  • 200

  • 240

  • 300

  • 360

  • 400

  • 600

Resolution can only be reduced, not increased. For example, if the original image resolution of the scanned page is 200 dpi and the Resolution list is set to 300, the image resolution in the output file remains 200 dpi and is not raised to 300 dpi.
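The rule above amounts to taking the smaller of the two values: the selected resolution acts as an upper limit. The following is a minimal sketch in Python. The names `ALLOWED_DPI` and `effective_dpi` are hypothetical and used for illustration only; they are not part of the Kofax product or any Kofax API.

```python
# Hypothetical illustration; these names are not part of the Kofax product.
ALLOWED_DPI = (72, 96, 120, 200, 240, 300, 360, 400, 600)


def effective_dpi(original_dpi: int, selected_dpi: int) -> int:
    """Return the picture resolution written to the output file.

    The selected value is an upper limit: a picture is reduced to it
    when necessary, but its resolution is never increased.
    """
    if selected_dpi not in ALLOWED_DPI:
        raise ValueError(f"unsupported resolution: {selected_dpi}")
    return min(original_dpi, selected_dpi)


print(effective_dpi(200, 300))  # 200: the original is below the selected value
print(effective_dpi(600, 300))  # 300: the picture is reduced to the selected value
```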