Dictionary Options window

Use this window to modify a dictionary file using the following options:

Referenced import file (text or csv file)

Select one of the following reference file locations:

  • Filesystem.

    Browse to the desired location of a dictionary. The import process starts automatically when the window is closed, and a message box is displayed that counts the number of imported database lines.

    Note One million lines with two columns take about one minute to import.

  • Web.

    Type the URL of the dictionary file.

    Click Test to ensure that the connection to the specified URL is available.

    Provide a User Name and Password if authentication is required.

Automatic update from import file

When content is processed through Kofax TotalAgility - Transformation Server, the source file can be checked for changes, and any changes found are copied to your project. This option is cleared by default.

Select this option to update the imported dictionary automatically the next time the Transformation Server runs after the source file has changed.

This is especially useful for a dictionary that is updated regularly and is located on a network. Because the update happens only when the Transformation Server runs, the project contains only those dictionary changes that were available the last time the Transformation Server ran.

Note To use the project file on another computer, you need to make certain that the dictionary remains accessible via the same path. For example, you can copy the dictionary file to the same path on the other computer. Alternatively, you can copy the dictionary file to the other computer and then re-select it.

Select this option to synchronize your imported dictionary file with its source dictionary whenever content is opened by one of the following modules:

  • Transformation Server

  • Validation

  • Correction

This means that any changes in the source dictionary are reflected in your imported dictionary file when any of the above modules is run. These automatic updates are then available for processing during those modules.

It is recommended to enable this option only if your source dictionary is updated regularly. Otherwise, the unnecessary update checks may hinder performance. This option is cleared by default.
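
The cost of an update check can be pictured with a small sketch. This is not how the product detects changes; it is a minimal illustration that assumes the check compares the source file's modification time with the time recorded at the last import. The function name needs_reimport and its parameters are hypothetical.

```python
import os

def needs_reimport(source_path, last_imported_mtime):
    """Return True if the source file looks newer than the last import.

    Hypothetical helper: the real update check is internal to the product.
    """
    try:
        current_mtime = os.path.getmtime(source_path)
    except OSError:
        # Source unreachable (for example, a network share is offline):
        # keep working with the dictionary that was already imported.
        return False
    # No recorded import time means the dictionary was never imported.
    return last_imported_mtime is None or current_mtime > last_imported_mtime
```

Even a cheap check like this touches the file system, or the network, each time a module opens content, which is why the option is worth enabling only for dictionaries that actually change.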

Important This automatic update feature works slightly differently for the Thin Clients. To ensure that combo boxes populated by dictionaries are regularly synchronized, an additional Thin Clients web.config setting is required to replicate the above behavior. Without this setting, a combo box is filled once and used as long as that project is cached. This means that combo boxes are updated once for multiple batches using the same project. For more information on the new ForceDictionaryUpdate web.config option, see the Thin Client Server Installation Guide.

Dictionaries referenced by validation rules are successfully updated, regardless of the web.config option.

Word list

The word list contains the entries of the imported csv or text file. When a field delimiter is available and the option The import file contains auto replacement values is selected, an additional Replacement Text column is displayed. The contents of the file are then displayed as a list of dictionary words and replacement text items.

Import Options

This group has the following options:

Ignore Case

Select this option to convert all search and lookup strings to lower case, effectively ignoring case. This option is selected by default.

Additional delimiter

Use this option if fields in the database contain compound words such as Diagon-Alley, but the search should consider the individual words. You can specify additional delimiters, and the search string is split into single terms based on them. In this example, if a hyphen is used as a delimiter, Diagon and Alley are searched and evaluated separately.

Note The delimiters should correspond to the delimiters that are defined for recognition. Otherwise, recognition may not separate the words correctly, and the separately searched words are then either not found or found with lower confidence.

Important The value for this option is set to " -," (SPACE, HYPHEN, and COMMA) by default. The SPACE delimiter is mandatory because this is how words are typically separated. If you remove it, the dictionary may no longer work when used in format locators.

Characters to delete

Use this option if you want to filter unwanted characters from the input record. The value for this option is set to empty by default.
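
The combined effect of the three import options can be sketched as follows. This is an illustration only, not the product's implementation: the function name normalize_terms, its parameters, and the order of the steps (delete characters, then lower-case, then split) are assumptions.

```python
import re

def normalize_terms(text, ignore_case=True, delimiters=" -,", chars_to_delete=""):
    """Split a search string into single terms (hypothetical helper)."""
    # Characters to delete: remove unwanted characters from the input.
    if chars_to_delete:
        text = text.translate({ord(c): None for c in chars_to_delete})
    # Ignore Case: convert the string to lower case.
    if ignore_case:
        text = text.lower()
    # Without delimiters, the whole string is a single term.
    if not delimiters:
        return [text] if text else []
    # Additional delimiter: split on any run of delimiter characters.
    pattern = "[" + re.escape(delimiters) + "]+"
    return [term for term in re.split(pattern, text) if term]

print(normalize_terms("Diagon-Alley"))                  # ['diagon', 'alley']
print(normalize_terms("Diagon-Alley", delimiters=" "))  # ['diagon-alley']
```

With the default delimiters the compound word yields two terms that are searched separately. With only the SPACE delimiter it stays a single term.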

Auto Replacement Text

This group has the following options:

The import file contains auto replacement values

Select this option if the imported file contains text that can be used to replace extracted text items. For example, if January is found in the document, the dictionary can be used to replace it with the numeral 1. This makes it possible to take all possible variations of January and convert them to the same replacement value (1). This option is cleared by default.

Field delimiter

Type the character that separates the dictionary values from the auto replacement values. The value for this option is set to ; (SEMICOLON) by default.
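
As a sketch of how an import file with auto replacement values might be read, the following example splits each line at the field delimiter. The function name load_dictionary, its parameters, and the sample lines are hypothetical. The product's own parsing rules, for example for lines without a delimiter, may differ.

```python
def load_dictionary(lines, has_replacements=False, field_delimiter=";"):
    """Map each dictionary word to its replacement text, or None."""
    entries = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip empty lines
        if has_replacements and field_delimiter in line:
            # Split at the first delimiter only: word, then replacement.
            word, replacement = line.split(field_delimiter, 1)
            entries[word.strip()] = replacement.strip()
        else:
            # No replacement value: the whole line is the dictionary word.
            entries[line] = None
    return entries

sample = ["January;1", "Jan;1", "Jan.;1", "February;2"]
months = load_dictionary(sample, has_replacements=True)
print(months["Jan."])  # 1
```

All three spellings of January map to the same replacement value, which is the behavior described for the auto replacement option above.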

Definitions for the buttons at the bottom of this window can be found in Common Transformation Designer Buttons.