Extraction benchmark comparison

After you save two or more extraction benchmarks, compare them to see whether your modifications improve or degrade extraction quality. The comparison tab has two parts: a comparison of the summary statistics and a comparison of the detailed statistics.

Together, these comparisons show the differences between the benchmarks, so you can see where you gained or lost extraction quality.

If you have only two saved benchmarks, they are loaded into the comparison tab automatically. If one is set as the baseline, it appears on the left and the other appears on the right. If you have more than two saved benchmarks, the baseline is loaded automatically on the left, and you select the benchmark to compare it with from the list.

The Summary table shows the overall results for each field and each extraction result shown in the legend. Each cell contains the result from the non-baseline extraction, and the difference in percentage between it and the baseline is displayed in brackets.

The following color scheme explains the results:

  • If the number of correct valid fields (A) decreases, the field has red highlighting.

  • If the number of incorrect valid fields (D) increases, the field has red highlighting.

  • If the number of correct valid fields (A) increases, the field has green highlighting.

  • If the number of incorrect valid fields (D) decreases, the field has green highlighting.

  • If the values between the benchmark results are the same, the field has no highlighting.

  • Changes in correct invalid fields (B) and incorrect invalid fields (C) are displayed in the cell, but the field has no highlighting.
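The color rules above can be summarized as a small decision function. The following is a minimal sketch for illustration only, not the product's implementation. The names `FieldCounts` and `highlight` are hypothetical. The list does not say what happens when one rule calls for red and another for green in the same comparison, so the sketch reports that case as "unspecified" rather than guessing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldCounts:
    """Hypothetical container for the four result categories of one field."""
    a: float  # correct valid fields (A)
    b: float  # correct invalid fields (B)
    c: float  # incorrect invalid fields (C)
    d: float  # incorrect valid fields (D)


def highlight(baseline: FieldCounts, other: FieldCounts) -> str:
    """Return the highlight color for the non-baseline result.

    Only A and D drive the color. Changes in B and C are shown in the
    cell but never cause highlighting.
    """
    worse = other.a < baseline.a or other.d > baseline.d
    better = other.a > baseline.a or other.d < baseline.d
    if worse and better:
        # Assumption: the documented rules do not cover this conflict.
        return "unspecified"
    if worse:
        return "red"
    if better:
        return "green"
    return "none"


baseline = FieldCounts(a=10, b=2, c=1, d=3)
print(highlight(baseline, FieldCounts(a=8, b=2, c=1, d=5)))   # fewer A, more D
print(highlight(baseline, FieldCounts(a=12, b=2, c=1, d=1)))  # more A, fewer D
print(highlight(baseline, FieldCounts(a=10, b=3, c=0, d=3)))  # only B and C changed
```

The three calls cover a degraded result, an improved result, and a result where only B and C changed.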

The Details Statistics Comparison table compares information about each field and document. In each summary column cell (A, B, C, and D), the non-baseline extraction result is shown, and the difference in percentage between it and the baseline is displayed in brackets.
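To illustrate how such a cell can be read, the sketch below builds the cell text from two values. It is an assumption-laden example, not the product's formatting logic: it assumes both results are percentages, that the bracketed difference is the non-baseline value minus the baseline value, and that one decimal place is shown. The function name `format_cell` is hypothetical.

```python
def format_cell(baseline_pct: float, other_pct: float) -> str:
    """Build a comparison cell: non-baseline result, then the difference in brackets.

    Assumption: the difference is other minus baseline, so a negative
    number means the non-baseline extraction scored lower than the baseline.
    """
    diff = other_pct - baseline_pct
    return f"{other_pct:.1f}% ({diff:+.1f}%)"


print(format_cell(90.0, 85.5))  # non-baseline is lower than the baseline
print(format_cell(70.0, 82.5))  # non-baseline is higher than the baseline
```

Under these assumptions, a negative value in brackets means the non-baseline extraction scored lower than the baseline for that category.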

The following example shows how modifications made to the extraction settings can lower extraction results when compared with a golden file baseline.


An image that shows the summary statistics comparison for two benchmark files.