Trainable Document Separation (TDS)

Document_BeforeSeparatePages( _
    ByVal pXDoc As CscXDocument, _
    ByRef bSkip As Boolean
    )

This is the first event of the document separation process. Setting bSkip to TRUE skips separation of the given document pXDoc completely.
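As a minimal sketch of how this event might be used, the handler below skips separation for documents that contain only one page. The single-page rule is a hypothetical condition chosen for illustration, and the sketch assumes that setting bSkip to True is what skips separation.

```vb
Private Sub Document_BeforeSeparatePages( _
    ByVal pXDoc As CASCADELib.CscXDocument, _
    ByRef bSkip As Boolean _
    )
   ' Hypothetical rule: a document with a single page has nothing to split,
   ' so separation is skipped for it.
   If pXDoc.CDoc.Pages.Count <= 1 Then
      bSkip = True
   End If
End Sub
```

If bSkip is left unchanged, separation proceeds as described in the rest of this section.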

If the separation process proceeds, all pages are classified first. This classification can raise an additional event:

Document_XDocPageRotated( _
    ByVal RotationBy As CASCADELib.CscAutoRotation, _
    ByVal pXDoc As CASCADELib.CscXDocument, _
    ByVal PageNr As Long, _
    ByVal Rotation As CASCADELib.CscXDocRotationTypeEnum, _
    ByRef bCancel As Boolean
    )

If a page cannot be classified, it is rotated clockwise in steps of 90° and classification is re-executed after each step. If classification succeeds for a rotation step, the rotation event is fired. If the script cancels the rotation by setting bCancel to TRUE, the remaining rotation directions are applied and the page is classified again for each of them. This continues through all rotation directions for which the page either cannot be classified or the rotation is canceled by script. For these events, the parameter RotationBy is set to CscAutoRotationBySeparation.

Next, all unclassified pages are checked to determine whether content classification is required and whether OCR has to be prepared. This may raise another Document_XDocPageRotated event, with the parameter RotationBy set to CscAutoRotationByOCR. In this case the rotation is the one suggested by the OCR. If this event is canceled, the OCR is re-executed without rotation.

Note: If the Document_XDocPageRotated event is executed in Project Builder, you cannot access the CscXFolder object. To ensure that your implementation does not terminate the application abnormally, evaluate the script execution mode (Project.ScriptExecutionMode=CscScriptModeServerDesign).
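The following sketch illustrates one way a Document_XDocPageRotated handler could combine the script execution mode check with a decision based on RotationBy. Rejecting OCR-suggested rotations is a hypothetical policy used only for illustration. The enumeration members are written unqualified, as in the text above, which assumes they resolve in the script environment.

```vb
Private Sub Document_XDocPageRotated( _
    ByVal RotationBy As CASCADELib.CscAutoRotation, _
    ByVal pXDoc As CASCADELib.CscXDocument, _
    ByVal PageNr As Long, _
    ByVal Rotation As CASCADELib.CscXDocRotationTypeEnum, _
    ByRef bCancel As Boolean _
    )
   ' In Project Builder the CscXFolder object is not available, so leave
   ' the handler before any folder-dependent logic runs.
   If Project.ScriptExecutionMode = CscScriptModeServerDesign Then
      Exit Sub
   End If

   Select Case RotationBy
      Case CscAutoRotationBySeparation
         ' Rotation found while classifying the page: accept it.
         bCancel = False
      Case CscAutoRotationByOCR
         ' Hypothetical policy: reject rotations suggested by the OCR,
         ' so the OCR is re-executed without rotation.
         bCancel = True
   End Select
End Sub
```

Logic that does not depend on the folder can be placed before the mode check if it should also run in Project Builder.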

Once all of this page classification information is available, the following event is executed:

Document_BeforeTDS( _
    ByVal pXDoc As CASCADELib.CscXDocument, _
    ByRef bSkip As Boolean
    )

Each document pXDoc that is processed by TDS is passed to this event together with the page-level PageClassificationConfidence that TDS uses. This information can be modified before the TDS algorithm runs. The algorithm then sets the SplitPage flag on each page before which the document is cut.

Document_AfterSeparatePages( _
    ByVal pXDoc As CscXDocument
    )

This is the final event of the trainable document separation for the given document pXDoc. The complete document is passed to the event again before the separation is applied. In principle, the pXDoc.CDoc.Pages(...).SplitPage property can still be modified here to flag the pages before which the document is cut.
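As a hedged illustration of adjusting the SplitPage flags in this event, the sketch below forces a cut before every second page. The fixed length of two pages per document and the constant PAGES_PER_DOC are hypothetical, and the loop assumes that the Pages collection is indexed from 0.

```vb
Private Sub Document_AfterSeparatePages( _
    ByVal pXDoc As CASCADELib.CscXDocument _
    )
   ' Hypothetical fixed document length, used only for illustration.
   Const PAGES_PER_DOC As Long = 2
   Dim i As Long

   ' Assumes 0-based page indexes. The first page is left untouched;
   ' every later page that starts a new block of PAGES_PER_DOC pages
   ' is flagged so that the document is cut before it.
   For i = 1 To pXDoc.CDoc.Pages.Count - 1
      If i Mod PAGES_PER_DOC = 0 Then
         pXDoc.CDoc.Pages(i).SplitPage = True
      End If
   Next i
End Sub
```

Flags set by the TDS algorithm on other pages are left unchanged by this sketch. Resetting them would require assigning False explicitly.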

The general chronology of the document separation events is shown in the following figure:


[Figure: A visual representation of Trainable Document Separation (TDS) events.]