Import extraction new samples

After you review the extraction new samples in the New Samples document set, you can add them to your Extraction Set if they are likely to improve your extraction results. If a sample is not suitable for extraction training, remove it from the Extraction document subset of the New Samples document set before you import.

You can import one or more extraction new samples by following these steps:

  1. Open the Documents window if it is not already open.
  2. Select the New Samples document subset.

    The document set expands, the Problems document subset is selected by default, and the problem documents are displayed.

  3. Select the Extraction document subset.

    A list of extraction samples is displayed.

  4. Because all extraction samples are imported at once, the best practice is to review and remove any documents that are not suitable for extraction training.
  5. On the Documents toolbar, click Import Documents from Extraction Online Learning.

    The Import Extraction Online Learning Data window is displayed and shows the current path to the online learning files for the project.

  6. Enter a name in the Import into training subset field.

    The best practice is to create a new document subset for each imported set of new sample documents. Keeping each import in its own subset lets you differentiate, test, and benchmark your training set to determine whether a specific set of new samples improves or hinders your extraction results.

  7. Click OK to save your settings and close the Import Extraction Online Learning Data window.

    The Extraction subset in the New Samples database is emptied, and all of the extraction samples are moved to the Extraction Set under the document subset specified in the previous step.

  8. Train your project for extraction.
  9. Optionally, perform an extraction benchmark and compare it to a previous benchmark that does not include the newly added sample documents.
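
The comparison in the last step can also be done outside the product once you have per-field results from both benchmarks. The following is a minimal sketch only, assuming you have copied the per-field accuracy percentages by hand into two dictionaries. The function name compare_benchmarks, the field names, and the numbers are all hypothetical and do not come from the product or from a real benchmark.

```python
from typing import Dict


def compare_benchmarks(
    baseline: Dict[str, float],
    candidate: Dict[str, float],
    tolerance: float = 0.0,
) -> Dict[str, str]:
    """Label each field as improved, regressed, unchanged, or missing.

    baseline  -- per-field accuracy (percent) before importing the new samples
    candidate -- per-field accuracy (percent) after importing and retraining
    tolerance -- differences no larger than this count as unchanged
    """
    verdicts: Dict[str, str] = {}
    for field in sorted(set(baseline) | set(candidate)):
        # A field present in only one benchmark cannot be compared.
        if field not in baseline or field not in candidate:
            verdicts[field] = "missing"
            continue
        delta = candidate[field] - baseline[field]
        if delta > tolerance:
            verdicts[field] = "improved"
        elif delta < -tolerance:
            verdicts[field] = "regressed"
        else:
            verdicts[field] = "unchanged"
    return verdicts


# Hypothetical, hand-entered values for illustration only.
before = {"InvoiceNumber": 90.0, "TotalAmount": 80.0, "VendorName": 70.0}
after = {"InvoiceNumber": 95.0, "TotalAmount": 75.0, "VendorName": 70.0}

for field, verdict in compare_benchmarks(before, after, tolerance=1.0).items():
    print(f"{field}: {verdict}")
```

If any field regresses, keeping each import in its own document subset makes it straightforward to exclude that subset and retrain.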