Classification Training Set conversion

The converted Classification Set no longer uses the old folder hierarchy of views, classes, and subclasses. Instead, a document subset is created for each view, and the class folders beneath it are removed. If the old training hierarchy has a separate folder for each view and for each class within that view, the class folders of a view are merged into a single document subset in the new document set format. Each view that has one or more training documents is converted into its own document subset with the same name. For example, if your project has both content and layout classification and each of these classification types has training documents, one document subset is created for each of the two views when the project is converted.

In the following image, the old classification training set, called Learn, lists separate training folders for each view, class, and subclass. This hierarchy was needed to keep the training documents organized by class. After conversion, the hierarchy is no longer needed because the class or subclass is stored with the document itself. You can see this in the blue-highlighted document, which displays its class name.

The top-level view folders for Content, Layout, and Text are converted into document subsets with the same name in the Classification Set. These subsets contain all of the documents that used to be separated by class or subclass, and their class is now displayed as part of the document information.
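The mapping described above can be illustrated with a short Python sketch. This is only a conceptual model of the conversion, not the product's implementation: the function name convert_training_set, the path layout view/class/subclass/document, and the sample document names are all hypothetical and chosen for illustration.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def convert_training_set(old_paths):
    """Flatten view/class[/subclass]/document paths into per-view subsets.

    Hypothetical model: each old path is assumed to look like
    "<view>/<class>[/<subclass>...]/<document>".
    """
    subsets = defaultdict(list)
    for raw in old_paths:
        parts = PurePosixPath(raw).parts
        if len(parts) < 3:
            raise ValueError(f"expected view/class/document, got: {raw!r}")
        view, *class_path, document = parts
        # The class folders disappear; the most specific class or subclass
        # is recorded on the document instead.
        subsets[view].append({"document": document, "class": class_path[-1]})
    # Only views that actually contain training documents get a subset.
    return dict(subsets)


# Hypothetical old "Learn" hierarchy.
old_training_set = [
    "Content/Invoice/doc1",
    "Content/Invoice/Credit Note/doc2",
    "Layout/Order/doc3",
]

for view, documents in convert_training_set(old_training_set).items():
    print(view, documents)
```

In this sketch, a view such as Text receives no subset unless at least one training document exists beneath it, which mirrors the rule that only views with training documents are converted.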


An image showing the old training set hierarchy and the new document set and document subset structure