Format Locator

The format locator works with format definitions, which can use pattern matching (regular expressions and simple expressions) or advanced algorithms (Levenshtein and trigrams). Together with dictionaries and keywords, the format definitions are used to extract data from documents without the need to define zones. The locator runs on a full or partial page read of the document and extracts the data using searches that are specific to the data, not to the document layout. The locator then evaluates the alternatives it finds and the resulting output.
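As a rough illustration of the matching techniques named above, the following Python sketch compares an exact regular-expression match with two approximate measures, Levenshtein distance and trigram similarity. It is not the product's implementation: the function names, the padding scheme for trigrams, and the sample strings are assumptions made for this example only.

```python
import re


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def trigrams(text: str) -> set:
    """Set of 3-character substrings of the padded, lowercased text."""
    padded = "  " + text.lower() + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def trigram_similarity(a: str, b: str) -> float:
    """Jaccard overlap of the two trigram sets, between 0.0 and 1.0."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)


# Pattern matching: a regular expression finds the value wherever it
# sits on the page, independent of layout.
date_match = re.search(r"\b\d{2}/\d{2}/\d{4}\b", "Date: 12/03/2021")

# Approximate matching: tolerate a misread character ("0" for "O").
distance = levenshtein("INV0ICE", "INVOICE")
similarity = trigram_similarity("INV0ICE", "INVOICE")
```

The regular expression accepts only text that fits the pattern exactly, whereas the two approximate measures give a graded result that can still recognize a word containing a misread character.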

You can add more than one format definition to a single format locator to increase flexibility. This is useful because the format locator searches a document and collects all items that match any of the format definitions. Each item is assigned a confidence level that depends on whether the format matches an entire word or only part of one. The calculation of the final confidence also takes keywords into account.
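The behavior described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the locator's actual algorithm: the `locate` function, the confidence values (0.8, 0.5, and a 0.15 keyword bonus), and the 20-character keyword window are all hypothetical and chosen only to show the idea.

```python
import re

# Two hypothetical format definitions for the same kind of data.
FORMATS = [r"\d{2}/\d{2}/\d{4}", r"\d{4}-\d{2}-\d{2}"]
KEYWORDS = ["invoice date"]


def locate(text, formats, keywords, window=20):
    """Collect every match of any format and score each one.

    Scores are illustrative assumptions, not the product's values.
    """
    alternatives = []
    for pattern in formats:
        for m in re.finditer(pattern, text):
            before = text[m.start() - 1] if m.start() > 0 else " "
            after = text[m.end()] if m.end() < len(text) else " "
            # A whole-word match is not glued to other letters or digits.
            whole_word = not before.isalnum() and not after.isalnum()
            confidence = 0.8 if whole_word else 0.5
            # A keyword shortly before the match raises the confidence.
            context = text[max(0, m.start() - window):m.start()].lower()
            if any(k in context for k in keywords):
                confidence = min(1.0, confidence + 0.15)
            alternatives.append((m.group(), round(confidence, 2)))
    # Best alternative first.
    return sorted(alternatives, key=lambda alt: -alt[1])


sample = "Invoice date: 12/03/2021, ref X2020-01-15Y, due 2021-04-12"
results = locate(sample, FORMATS, KEYWORDS)
```

In this sample, all three dates are collected as alternatives. The one preceded by the keyword ranks first, and the one embedded in a longer token ranks last because it matches only part of a word.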

To configure a format locator, first add it to the selected class. Before you modify the locator properties, open a document set that contains documents for that class. You can then configure the format locator by opening its properties.

Manage a Format Locator as follows from the General tab:

  • Use results from another locator

  • Use OCR substitution for regular expressions

  • Configure settings for non-regular expressions

Manage a Format Locator as follows from the Format Definitions tab:

Manage a Format Locator as follows from the Evaluation Settings tab:

The Properties of Format Locator window has the following tabs: