Add documents to Table Extraction Set

Add training documents to the Table Extraction Set to teach your project which table is which. Like all training documents, these should be clean yet typical examples of the documents you expect to process. Document images that contain heavy noise or distortion are likely to produce poor extraction results.

For the best results, ensure that all documents used for table extraction are already classified and recognized before you add them to the Table Extraction Set.

You can add a document to the Table Extraction Set for the selected class by following these steps:

  1. Open a Test Set that contains the documents you want to use for training, if not already open.

    For the best results, ensure that these documents are classified and have recognition results.

  2. Select a class in the Project Tree.

    Ensure that the Enable Table Detection setting is enabled. If it is not enabled, you are prompted to enable it for this class.

  3. Right-click one or more documents in the Test Set and select Add to Training Set of Selected Class (Table Extraction).

    The documents are added to the Table Extraction Set.

    When documents are first added to this document set, they are excluded from training. This is indicated by the Exclude from Training icon in the Use column. A document is included in table training only if it has at least one table label. Once a table label is added to a document, its icon changes to the Include for Training icon.

  4. Repeat these steps for other documents and classes.
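
The inclusion rule described in step 3 can be sketched as a small model. This is an illustration only, not the product's API: the names TrainingDocument, table_labels, included_in_training, and add_to_table_extraction_set are hypothetical, and the label name "LineItems" is made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class TrainingDocument:
    """Hypothetical stand-in for a document in the Table Extraction Set."""
    name: str
    table_labels: list = field(default_factory=list)

    @property
    def included_in_training(self) -> bool:
        # Assumed rule from the text: at least one table label is
        # required before a document takes part in table training.
        return len(self.table_labels) > 0


def add_to_table_extraction_set(extraction_set: list, document: TrainingDocument) -> None:
    """Add a document to the set; it stays excluded until it is labeled."""
    extraction_set.append(document)


extraction_set = []
invoice = TrainingDocument("invoice_001")
add_to_table_extraction_set(extraction_set, invoice)
print(invoice.included_in_training)  # False: no table label yet

invoice.table_labels.append("LineItems")
print(invoice.included_in_training)  # True: one label is enough
```

In this sketch the training status is derived from the labels instead of being stored separately, which mirrors how the icon in the Use column changes once a label exists.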

What next?

Because table detection can find more than one table on a document, teach your project which tables are of interest by editing a table training document.
