Word separation characters

The "Field delimiter" option, available for fuzzy databases and some page recognition profiles, specifies which characters separate the parts of a compound word. When one of these separation characters is encountered, the current word ends and a new word starts with the next character.

For example, if a document contains a compound word such as Diagon-Alley and you want the search to treat it as two separate words, specify the hyphen (-) as a word separation character. The words diagon and alley are then searched and evaluated separately.
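The splitting behavior described above can be sketched in a few lines of Python. This is only an illustration, not the product's implementation: the function name split_words and its separators parameter are hypothetical, and the sketch assumes that whitespace always separates words and that the separation characters themselves are dropped.

```python
import re

def split_words(text, separators="-,"):
    """Split text into words at whitespace and at any separation character.

    Hypothetical helper for illustration; the default "-," mirrors the
    fuzzy database defaults (hyphen and comma) named in this topic.
    """
    # re.escape makes characters such as '-' or '(' literal inside the class.
    pattern = "[" + re.escape(separators) + r"\s]+"
    # Discard empty strings produced by leading or trailing separators.
    return [word for word in re.split(pattern, text) if word]

# With the hyphen as a separation character, the compound word is split.
print(split_words("Diagon-Alley", separators="-"))  # ['Diagon', 'Alley']

# Without any separation characters, it stays a single word.
print(split_words("Diagon-Alley", separators=""))   # ['Diagon-Alley']
```

The sketch keeps the original letter case; whether the search itself is case-sensitive is not covered here.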

Fuzzy databases use a hyphen (-) and a comma (,) as their default word separation characters. Page recognition profiles, however, have different defaults. The following table lists the default word separation characters for each page recognition profile:

Table 1. Default Page Recognition Profile Word Separation Characters
Recognition Profile                      Default Word Separation Characters

FineReader 12.2 page recognition         /:()-#

RecoStar 7.8 page recognition            /:()-

Cursive page recognition *               N/A
  • Check Reader 11.1
  • Field Reader 8.1
  • Document Reader 8.1

Mixed Print page recognition **          /:()-#

* These recognition engines are not installed by default and require additional licensing.

** Applies if no profile is chosen as the input profile for machine print.
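To see how the different defaults in Table 1 affect word separation, the following Python sketch applies each profile's default characters to the same string. It is an illustration under stated assumptions, not the recognition engines' actual logic: the dictionary keys, the function name split_with_profile, and the sample string are hypothetical, and whitespace is assumed to separate words as well.

```python
import re

# Default separation characters copied from Table 1.
# The dictionary keys are illustrative labels, not product identifiers.
DEFAULT_SEPARATORS = {
    "FineReader": "/:()-#",
    "RecoStar": "/:()-",
    "Mixed Print": "/:()-#",
}

def split_with_profile(text, profile):
    """Split text using the default separation characters of a profile."""
    # Escape the characters so each one is matched literally in the class.
    pattern = "[" + re.escape(DEFAULT_SEPARATORS[profile]) + r"\s]+"
    return [word for word in re.split(pattern, text) if word]

# The number sign (#) is a default separator for FineReader but not for RecoStar.
print(split_with_profile("INV#2024/07", "FineReader"))  # ['INV', '2024', '07']
print(split_with_profile("INV#2024/07", "RecoStar"))    # ['INV#2024', '07']
```

The only difference between the two results comes from the number sign, which appears in the FineReader and Mixed Print defaults but not in the RecoStar defaults.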