Exclude training document sets from database

You can exclude training samples from the database if you have security concerns, or concerns about the size of the documents in the database or about downloading the project for editing.

You can exclude a training set from the database by following these steps:

  1. Open a project.
  2. Open the temporary location where your project is stored on disk.

    Right-click on any training document and then select Open in Windows Explorer from the context menu.

  3. Copy the required folders (ClassificationTraining or ExtractionTraining) to a safe location.

    A persistent UNC path that is available to all project developers is recommended.

  4. Open the copied folders as new Test Sets from their new location.
  5. Right-click each newly created Test Set and select the appropriate option from the context menu:
    • Use as Classification Training Set

    • Use as Extraction Training Set

    This swaps the sets: the original training sets become normal test sets, and the new test sets become the training sets.

    The original training sets, which are now test sets, are still stored in and synchronized with the database. You can change this by following these steps:

    1. In one of the old training sets that are now test sets, click the <All Documents> subset.
    2. Delete all documents.

      This removes the documents from disk. When the project is saved, the change is synchronized with the database, so that the corresponding set in the database is empty.

    3. Repeat for the other training set if needed.

    The training sets are now stored outside of the database. You can train and maintain your project as needed without storing training documents in the database.
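
If you prefer to script the copy in step 3 rather than use Windows Explorer, the following is a minimal Python sketch of one way to do it. It is not part of the product. The function name copy_training_folders and both path arguments are hypothetical; only the folder names ClassificationTraining and ExtractionTraining come from the steps above. The sketch assumes that these folders sit directly under the temporary project location you opened in step 2.

```python
import shutil
from pathlib import Path

# Folder names taken from step 3 of the procedure.
TRAINING_FOLDERS = ("ClassificationTraining", "ExtractionTraining")


def copy_training_folders(project_dir, target_dir, names=TRAINING_FOLDERS):
    """Copy each training folder found under project_dir into target_dir.

    Hypothetical helper: project_dir is assumed to be the temporary project
    location on disk, target_dir the safe location (for example a UNC path).
    Returns the list of destination folders that were created.
    """
    project_dir = Path(project_dir)
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)

    copied = []
    for name in names:
        source = project_dir / name
        if not source.is_dir():
            # A project may contain only one of the two training folders.
            continue
        destination = target_dir / name
        # copytree raises FileExistsError if the destination already exists,
        # so an earlier copy is never overwritten silently.
        shutil.copytree(source, destination)
        copied.append(destination)
    return copied
```

You could call it as copy_training_folders(project_dir, target_dir), passing the temporary project location and the shared location as the two arguments. The script only copies files; steps 4 and 5 still have to be carried out in the project itself.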