Train the Text Content Locator

Before training the Text Content Locator, add Text Content Locator subfields and class fields to your project. Once you finish adding fields, add training documents and train your project.

You can train the Text Content Locator by following these steps:

  1. Open the Project Tree window if it is not already open.
  2. Expand the Project Tree and select the class.
  3. Optionally, view the class contents if they are not already displayed.

    The hidden class contents are displayed.

  4. Open the Documents window if it is not already open.
  5. If a different view is in use, switch to the List view.
  6. Add or open a document set that contains the documents to use to train the Text Content Locator.

    A list of documents is displayed.

  7. Double-click a document.

    The Document Viewer is displayed showing the selected document.

  8. If the document is suitable for training, select the document in the list view of the Documents window and select Train for Extraction from the menu.

    The Edit Document window is displayed showing the first of the selected documents.

  9. In the Edit Document window, select a field and lasso the corresponding content in the document.

    The lassoed data is entered into the field.

  10. Repeat lassoing for each field and, when finished, click Add to Training Folder.

    The document is added to your Extraction Set and the next document in your document set is loaded automatically.

  11. Continue to add training documents until you are ready to test your training set, then close the Edit Document window.

    For the first iteration of testing, add a minimum of 3 to 5 training documents per class before training and testing your project. A small training set can already return reliable extraction results, in which case you do not need to add more documents.

  12. On the Process Ribbon tab, in the Train group, click Extraction.

    The project is trained on the documents in your training set, and a progress bar shows the training progress.

  13. In your test document set, right-click one or more selected documents, and click Process.

    Each selected document is classified and extracted.

  14. Open the Extraction Results window if it is not already open.

    The extraction results are displayed. Invalid fields have a blue question mark and valid fields have a green check mark.

  15. In the Extraction Results window, view the Text Content Locator results based on your training documents.

    If the results are not satisfactory, add training documents to your Extraction Set by repeating steps 7 through 14.