Custom document test sets

In addition to the preconfigured document sets that are created automatically for each project, you can also add custom test sets. Once a custom test set is added to your project, a reference to that document set is available from the "Recent Documents" list each time the project is opened.

You can also permanently keep a document set reference attached to your project. This means that the document set is always available in the Documents window when the project is opened, for the lifetime of the project. If the document set is no longer needed, you can remove it from your project at any time.

A custom test set can contain a mixture of source files, including images (*.tif, *.jpg, *.png), text files (*.txt), PDF files (*.pdf), and XDocuments (*.xdc). Document sets that contain these types of files are typically used for extraction, classification, and separation training and testing, and for various types of benchmarks. These document sets can contain a directory hierarchy; the documents are stored as individual files in a directory structure that matches their physical layout.
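As a rough illustration of the file types listed above, the following Python sketch inventories a folder before it is added as a custom test set. It is not part of the product: the function name inventory_test_set and the category labels are hypothetical, and only the file extensions come from this topic.

```python
from collections import defaultdict
from pathlib import Path

# Extensions are the ones listed in this topic; the category labels
# on the right are illustrative names, not product terminology.
SUPPORTED_TYPES = {
    ".tif": "image",
    ".jpg": "image",
    ".png": "image",
    ".txt": "text",
    ".pdf": "pdf",
    ".xdc": "xdocument",
}


def inventory_test_set(root):
    """Group supported files under root by category.

    Hypothetical helper: paths are returned relative to root so the
    directory hierarchy of the candidate test set stays visible.
    """
    root = Path(root)
    found = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        category = SUPPORTED_TYPES.get(path.suffix.lower())
        if category is not None:
            found[category].append(path.relative_to(root).as_posix())
    return dict(found)
```

Files with other extensions are skipped, so the result gives a quick view of which documents in a folder are candidates for the test set and how they are arranged in subfolders.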

It is also possible to create a document set from documents that are stored in an XFolder (*.xfd). This type of document set contains the document set hierarchy, and is used to test batch events, batch editing, autofoldering, and other structural behavior.

These custom test sets can be used as additional training sets, test sets, and even benchmark sets, and are more commonly known as golden files.

When you create a document set, you select the File Type. The file type determines the purpose of the document set.

For example, a document set that is created from an XFolder (Folder.xfd) whose documents are arranged in a hierarchy that matches the project hierarchy for classes and folders is typically used to test foldering, autofoldering, and batch operations.

Document sets that use XDocuments, images, text files, or PDFs can also be arranged in a hierarchy that matches the project layout. These document sets are typically used to test extraction, classification, and separation, and to run benchmarks.
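To make the idea of a hierarchy that matches the project layout more concrete, here is a minimal Python sketch that compares the subfolders of a document set with a list of expected class paths. The function name compare_layout and the class_paths argument are assumptions made for illustration; the list is written by hand and is not read from a project file or through any product API.

```python
from pathlib import Path


def compare_layout(doc_set_root, class_paths):
    """Compare subfolders of a document set with expected class paths.

    class_paths is a hypothetical, hand-written collection such as
    {"Invoices", "Invoices/Credit Notes"}.
    """
    root = Path(doc_set_root)
    # Collect every subfolder, expressed relative to the document set root.
    on_disk = {
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_dir()
    }
    expected = set(class_paths)
    return {
        "missing_folders": sorted(expected - on_disk),
        "unexpected_folders": sorted(on_disk - expected),
    }
```

An empty list for both keys suggests that the folder layout mirrors the expected hierarchy; any entries point to folders that would need to be added or renamed.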

A custom test set can be used for the following purposes:

  • Testing separation, classification, and extraction. The test document set is the default document set type when a custom test set is initially added to your project.

  • Generating benchmarks for separation, classification, and extraction. You can convert a test document set into a benchmark document set using the document set shortcut menu.