Perform an extraction benchmark

Before you can generate extraction benchmarks, you need to open your golden files or a benchmark document set in the Documents window.

After running an extraction benchmark, you might want to change your extraction configuration settings. The following are common mistakes that you should avoid:

  • Lowering the minimum confidence thresholds just to get more green valid fields. You may end up missing invalid data as a result.

  • Increasing the minimum confidence thresholds just to eliminate red incorrect valid fields. This likely masks the problem that is causing the incorrect valid fields rather than fixing it, so these errors may still occur in production.

  • Modifying recognition profile settings to improve a specific field. This may adversely affect other locator methods and fields.
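The effect of the first two mistakes can be sketched with a simplified model. The following Python example is an illustration only, not how the benchmark computes its results: the `FieldResult` type, the field names, the sample values, and the 0.0 to 1.0 confidence scale are all assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class FieldResult:
    """One extracted field compared against its golden-file value (hypothetical model)."""
    name: str
    extracted: str
    truth: str
    confidence: float  # assumed scale: 0.0 to 1.0


def tally(results, threshold):
    """Count outcomes for one minimum confidence threshold.

    A field below the threshold is treated as invalid. A field at or above
    it is valid, and is either correct or incorrect against the truth value.
    """
    counts = {"valid_correct": 0, "valid_incorrect": 0, "invalid": 0}
    for r in results:
        if r.confidence < threshold:
            counts["invalid"] += 1
        elif r.extracted == r.truth:
            counts["valid_correct"] += 1
        else:
            counts["valid_incorrect"] += 1
    return counts


# Made-up sample data for illustration.
SAMPLE = [
    FieldResult("InvoiceNumber", "INV-1001", "INV-1001", 0.95),
    FieldResult("InvoiceDate", "2023-01-05", "2023-01-05", 0.72),
    FieldResult("Total", "108.00", "100.00", 0.81),
    FieldResult("VendorName", "Acme", "Acme Ltd", 0.55),
]

if __name__ == "__main__":
    for threshold in (0.5, 0.8, 0.9):
        print(threshold, tally(SAMPLE, threshold))
```

In this made-up sample, raising the threshold from 0.8 to 0.9 removes the one incorrect valid field, but the wrong `Total` value is still being extracted, and a correct `InvoiceDate` is already pushed to invalid at 0.8. Lowering the threshold to 0.5 produces more valid fields, but half of them are incorrect.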

Instead, you should concentrate on the following:

  • Confirm that the recognition engine you are using is the most effective for your documents.

  • Improve your classification results so they do not negatively affect your extraction results.

  • Improve the regular expressions used by your Format Locators.

  • Review and update zone settings such as image cleanup, anchoring and registration for Advanced Zone Locators.

  • Improve or update the knowledge bases used for any group locators.

  • Review and improve any scripts used at both project and class level.

  • Verify that all databases are up to date and that your queries return the correct data.

  • Review and improve Table Locator settings for regular and trainable tables.

  • Update any other locator-specific settings that could improve extraction results.
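As one way to approach the regular expression item above, a candidate pattern can be checked against known good and known bad values before it is placed in a locator. The sketch below uses Python's `re` module, whose syntax may differ from the syntax your Format Locators accept. The patterns, the sample strings, and the `check_pattern` helper are hypothetical.

```python
import re


def check_pattern(pattern, should_match, should_not_match):
    """Return (missed, false_hits) for a pattern tested against sample strings.

    missed:     strings that should match in full but do not
    false_hits: strings that should be rejected but match in full
    """
    compiled = re.compile(pattern)
    missed = [s for s in should_match if not compiled.fullmatch(s)]
    false_hits = [s for s in should_not_match if compiled.fullmatch(s)]
    return missed, false_hits


# Hypothetical invoice number formats for illustration.
LOOSE = r"INV-?\d+"       # optional hyphen, any number of digits
TIGHT = r"INV-\d{4,6}"    # required hyphen, four to six digits

GOOD = ["INV-1001", "INV-204518"]
BAD = ["INV-12", "INV1001", "INV-12345678"]

if __name__ == "__main__":
    print("loose:", check_pattern(LOOSE, GOOD, BAD))
    print("tight:", check_pattern(TIGHT, GOOD, BAD))
```

Here the loose pattern accepts every bad sample, while the tighter pattern accepts only the good ones. Rerunning the extraction benchmark after such a change shows whether the improvement holds on real documents.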

You can perform an extraction benchmark by following these steps:

  1. Open the Documents window if it is not already open.
  2. If a different view is in use, switch to the List view.
  3. Open the golden files you created for extraction testing.

    If you do not have a set of golden files, create a set for extraction or an all-purpose set.

  4. On the Process tab, in the Benchmark group, click Extraction Benchmark and select one of the following options from the submenu:
    • Selected Class Only

    • Selected Class and its Child Classes

    • All Classes

    The Extraction Benchmark window is displayed and the selected extraction benchmark runs.

  5. If you made changes outside of this window, you can run another benchmark by clicking Start.

    The extraction benchmark results are displayed in the Summary and Details tables.

  6. Optionally, click Save to store the benchmark results for comparison later.
  7. Optionally, click Export to save your benchmark results as a .csv file that can be opened in another application.
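Exported results from two benchmark runs can also be compared outside the application. The following Python sketch shows one possible approach. The column names `Field`, `Correct`, and `Total` are assumptions made for illustration and are unlikely to match the actual export layout, so adjust them to the headers in your own .csv file.

```python
import csv
import io


def load_accuracy(csv_text):
    """Return {field name: correct / total} from CSV text.

    Assumes hypothetical columns named Field, Correct and Total.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["Field"]: int(row["Correct"]) / int(row["Total"]) for row in reader}


def compare_runs(before, after):
    """Return {field: change in accuracy} for fields present in both runs."""
    return {f: round(after[f] - before[f], 4) for f in before if f in after}


# Made-up data standing in for two exported files.
BEFORE = "Field,Correct,Total\nInvoiceNumber,90,100\nTotal,70,100\n"
AFTER = "Field,Correct,Total\nInvoiceNumber,88,100\nTotal,85,100\n"

if __name__ == "__main__":
    changes = compare_runs(load_accuracy(BEFORE), load_accuracy(AFTER))
    for field, delta in changes.items():
        print(f"{field}: {delta:+.2%}")
```

A per-field comparison like this makes it easier to notice when a change that improves one field lowers the results for another. To read a real export, replace the in-memory strings with the contents of the exported files.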