Fuzzy Searches

The Kofax Search and Matching Server provides a fuzzy index for structured data that is imported from a relational database. The index is created on the server and provides a flexible and efficient way to "search and match" specific data during document transformation with Kofax Transformation Modules. The fuzzy index supports non-exact searching at both the character level and the word level. Non-exact search at the character level is typically needed to compensate for OCR errors, where single characters of a word are missing or wrong.
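To illustrate what non-exact matching at the character level means, the following minimal sketch computes the Levenshtein edit distance between two words and derives a similarity value from it. This is a generic textbook technique chosen for illustration only; the source does not state which algorithm the server uses, and the names edit_distance and similarity are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions that turn a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb (or keep it)
            ))
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Scale the distance to a value between 0.0 and 1.0 (1.0 = identical)."""
    if not a and not b:
        return 1.0
    return 1.0 - edit_distance(a, b) / max(len(a), len(b))


# Typical OCR errors: a wrong character and a missing character.
wrong_char = edit_distance("Invoice", "lnvoice")
missing_char = edit_distance("Invoice", "Invoce")
```

In both example calls a single edit separates the OCR result from the intended word, so a search that tolerates a small edit distance would still find it.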

The non-exact search at the word level allows using the fuzzy index in a search dialog, where the user only needs to type in one or two words to find a specific record. This allows a user to run a fuzzy search in a large database with several million records on any column that was included in the fuzzy index.

It is also possible to search for a specific record using multiple words. The result is the best-matching record, that is, the record with which most of the words in the query can be matched. For example, the Database Locator of Kofax Transformation Modules uses the complete content of a page as the query. In this specific case, the query contains many more words than the record that is being searched for. Typical use cases are invoices that are matched against a vendor database to find the vendor that sent the invoice, and mailroom documents that are matched against a customer database to identify the customer who wrote the letter or who is referenced in it.
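The page-as-query scenario can be sketched as follows. This is an illustrative approximation, not the algorithm of the Database Locator: it assumes that a record is scored by the fraction of its words that have a sufficiently similar word somewhere on the page, and it uses Python's difflib for word similarity. The function names, the 0.8 threshold, and the sample vendor data are all hypothetical.

```python
import re
from difflib import SequenceMatcher


def tokens(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def word_matches(word, query_words, threshold=0.8):
    """True if any query word is similar enough to tolerate OCR errors."""
    return any(
        SequenceMatcher(None, word, q).ratio() >= threshold
        for q in query_words
    )


def score_record(record_fields, query_words):
    """Fraction of the record's words that can be found in the query."""
    record_words = [w for field in record_fields for w in tokens(field)]
    if not record_words:
        return 0.0
    hits = sum(word_matches(w, query_words) for w in record_words)
    return hits / len(record_words)


def best_record(records, page_text):
    """Return the record that the page content matches best."""
    query_words = set(tokens(page_text))
    return max(records, key=lambda r: score_record(r, query_words))


# Hypothetical vendor table and OCR output of an invoice page.
vendors = [
    ("Northwind Traders", "Seattle"),
    ("Acme Industrial Supply", "Springfield"),
]
page = "INVOICE 4711 Acme lndustrial Supp1y 12 Main Street Springfield Total 199.00"
match = best_record(vendors, page)
```

Normalizing by the number of words in the record rather than in the query keeps the many unrelated words on the page (amounts, street names, labels) from lowering the score of the correct vendor.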

Finally, it is also possible to search for a record and require that the query is identical to the matched record. For example, this feature is used in the Database Evaluator of Kofax Transformation Modules and allows matching several fields with potential OCR errors against a set of columns in a fuzzy index.
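One possible reading of this stricter mode can be sketched as follows: every extracted field must match its assigned column, apart from a small tolerance for OCR errors, before a record is accepted. This is an assumption made for illustration and not the documented behavior of the Database Evaluator; the function names, the field-to-column mapping, the 0.75 threshold, and the sample records are hypothetical.

```python
from difflib import SequenceMatcher


def field_similarity(a, b):
    """Case-insensitive similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def evaluate(fields, records, threshold=0.75):
    """Return (record, score) for the best record in which EVERY field
    matches its column, or None if no record satisfies all fields.

    fields maps a column name to the extracted (possibly misread) value.
    """
    best = None
    for record in records:
        scores = [
            field_similarity(value, record.get(column, ""))
            for column, value in fields.items()
        ]
        # A single badly matching field rejects the whole record.
        if scores and min(scores) >= threshold:
            score = sum(scores) / len(scores)
            if best is None or score > best[1]:
                best = (record, score)
    return best


# Hypothetical index rows: same vendor name, different locations.
records = [
    {"name": "Acme Industrial Supply", "city": "Springfield", "zip": "62701"},
    {"name": "Acme Industrial Supply", "city": "Shelbyville", "zip": "62565"},
]
# Extracted fields with OCR errors ("l" for "I", "O" for "0").
fields = {"name": "Acme lndustrial Supply", "zip": "627O1"}
result = evaluate(fields, records)
```

Unlike a best-match search over a whole page, this sketch returns nothing when one of the fields cannot be matched, which is the distinction the paragraph above draws.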