Dictionary Options window

Use this window to modify a dictionary file using the following settings:

Referenced import file (text or csv file)

Select one of the following reference file locations:

  • File system.

    Browse to the location of the dictionary file. The import starts automatically when the window is closed, and a message box reports the number of imported lines.

    One million lines with two columns takes about 1 minute to import.

  • Web.

    Type the URL of the dictionary file.

    Click Test to ensure that the connection to the specified URL is available.

    Provide a User Name and Password if authentication is required.

Automatic update from import file

When content is processed through Kofax TotalAgility - Transformation Server, the source file can be checked for changes, and any changes found can be copied to your project.

This setting is cleared by default.

Select this setting if you want to automatically update the imported dictionary the next time the Transformation Server runs if the source file changes.

This is especially useful when using a dictionary that is updated regularly and is located on a network. Note that the check happens only when the Transformation Server runs, so the project contains only those dictionary changes that existed the last time the Transformation Server ran.

To use the project file on another computer, you need to make certain that the dictionary remains accessible via the same path. For example, you can copy the dictionary file to the same path on the other computer. Alternatively, you can copy the dictionary file to the other computer and then re-select it.

Select this setting to synchronize your imported dictionary file with its source dictionary whenever content is updated during runtime.

These automatic updates are then available for processing during those modules.

Enable this setting only if your source dictionary is updated regularly. Otherwise, the unnecessary update checks may hinder performance.
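The update check described above can be pictured with a minimal Python sketch. It assumes that a change is detected by comparing the modification time of the source file with the time recorded at the last import; the product documentation does not state how changes are actually detected, and the function name source_changed is hypothetical.

```python
import os
from typing import Optional


def source_changed(path: str, last_import_mtime: Optional[float]) -> bool:
    """Return True if the source file looks newer than the last import.

    Assumption: a newer modification time means the file content changed.
    This is an illustration only, not the product's actual mechanism.
    """
    if last_import_mtime is None:
        # Nothing has been imported yet, so an import is always needed.
        return True
    return os.path.getmtime(path) > last_import_mtime
```

Each check reads file metadata from the source location, which is why enabling the setting for a dictionary that never changes only adds overhead.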

This automatic update feature works slightly differently for the Thin Clients. To ensure that combo boxes populated by dictionaries are regularly synchronized, an additional Thin Clients web.config setting is required to replicate the above behavior. Without this setting, a combo box is filled once and used as long as that project is cached. This means that combo boxes are updated only once for multiple batches that use the same project. For more information on the new ForceDictionaryUpdate web.config setting, see the Thin Client Server Installation Guide.

Dictionaries referenced by validation rules are successfully updated, regardless of the web.config setting.

Word

The word list contains the entries of the imported csv or text file. When a field delimiter is available and the "The import file contains auto replacement values" setting is selected, an additional Replacement Text column is displayed. The contents of the file are then displayed as a list of dictionary words and replacement text items.

Import Options

This group has the following settings:

Ignore Case

Select this setting to convert all search and lookup strings to lower case, effectively ignoring case. This setting is selected by default.
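The effect of Ignore Case can be illustrated with a short Python sketch. It assumes that both the dictionary entries and the search string are converted to lower case before they are compared; the function names normalize and contains are hypothetical and are not part of the product.

```python
from typing import Iterable


def normalize(term: str, ignore_case: bool = True) -> str:
    """Lower-case the term when case is ignored; otherwise return it as is."""
    return term.lower() if ignore_case else term


def contains(entries: Iterable[str], term: str, ignore_case: bool = True) -> bool:
    """Check whether term matches any dictionary entry."""
    normalized = {normalize(entry, ignore_case) for entry in entries}
    return normalize(term, ignore_case) in normalized
```

With the setting selected, contains(["January"], "JANUARY") is a match. With the setting cleared, the same lookup fails because the strings differ in case.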

Additional delimiter

Use this setting if fields in the database contain compound words like "Diagon-Alley," but the search should consider the single words. In this case, additional delimiters can be specified, and the search string is split into single terms based on these delimiters. In this example, if a hyphen is used as the delimiter, then "Diagon" and "Alley" are searched and evaluated separately.

The delimiters should correspond to the delimiters that are defined for recognition. Otherwise, recognition may not separate the words correctly, and the separately searched words are either not found or found with a lower confidence.

The default value for this setting is " -," (SPACE, HYPHEN, and COMMA). The SPACE delimiter is mandatory because this is how words are typically separated. If you remove it, the dictionary may no longer work when used in format locators.
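The splitting behavior described for this setting can be sketched in Python as follows. The sketch assumes that every character in the delimiter string separates terms and that empty terms are discarded; the function name split_terms is hypothetical.

```python
import re
from typing import List

# Default delimiters as documented: space, hyphen, and comma.
DEFAULT_DELIMITERS = " -,"


def split_terms(text: str, delimiters: str = DEFAULT_DELIMITERS) -> List[str]:
    """Split a search string into single terms at any delimiter character."""
    if not delimiters:
        # Without delimiters, the whole string is treated as one term.
        return [text] if text else []
    pattern = "[" + re.escape(delimiters) + "]+"
    return [term for term in re.split(pattern, text) if term]
```

For example, split_terms("Diagon-Alley") yields the two terms "Diagon" and "Alley", while split_terms("Diagon-Alley", " ") keeps the compound word as a single term because the hyphen is no longer a delimiter.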

Characters to delete

Use this setting if you want to filter unwanted characters from the input record. This setting is empty by default.
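As an illustration, the filtering can be sketched in Python. The sketch assumes that each character listed in the setting is removed from the record wherever it occurs; the function name delete_characters is hypothetical.

```python
def delete_characters(record: str, chars_to_delete: str = "") -> str:
    """Remove every occurrence of the listed characters from the record."""
    # Mapping a code point to None tells str.translate to drop that character.
    return record.translate({ord(char): None for char in chars_to_delete})
```

With the default empty value, the record is returned unchanged. With "()." as the value, a record such as "(555) 123.4567" becomes "555 1234567".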

Auto Replacement Text

This group has the following settings:

The import file contains auto replacement values

Select this setting if the imported file contains text that can be used to replace extracted text items. For example, if January is found in the document, the dictionary can be used to replace it with the numeral 1. This makes it possible to take all possible variations of January and convert them to the same replacement value (1). This setting is cleared by default.

Field delimiter

Type the character that separates the dictionary values from the auto replacement values. The value for this setting is set to ; (semicolon) by default.
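The following Python sketch shows how an import file with auto replacement values might be read. It assumes one entry per line, with the dictionary word before the first field delimiter and the replacement text after it; the function name parse_dictionary_lines is hypothetical, and the actual import may handle edge cases differently.

```python
from typing import Dict, Iterable, Optional


def parse_dictionary_lines(
    lines: Iterable[str],
    field_delimiter: str = ";",
    has_replacements: bool = True,
) -> Dict[str, Optional[str]]:
    """Map each dictionary word to its replacement text (None if absent)."""
    entries: Dict[str, Optional[str]] = {}
    for raw_line in lines:
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        if has_replacements and field_delimiter in line:
            # Split only at the first delimiter so the replacement may contain it.
            word, replacement = line.split(field_delimiter, 1)
            entries[word.strip()] = replacement.strip()
        else:
            entries[line] = None
    return entries
```

For example, the lines "January;1", "Jan;1", and "Jan.;1" all map to the replacement value "1", so every variation of January is converted to the same value.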

Definitions for the buttons at the bottom of this window can be found in Common Transformation Designer Buttons.