Process documents

Documents are processed in two main steps. First, the application attempts to classify the document. Then the settings of the assigned class are applied to the document to extract its data.
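The two steps can be pictured with a minimal Python sketch. It is only a conceptual model and does not use the product's scripting API. The class names, the `CLASS_SETTINGS` table, and the functions `classify`, `extract`, and `process` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class Document:
    text: str
    class_name: Optional[str] = None
    fields: Dict[str, str] = field(default_factory=dict)


# Hypothetical per-class settings: each class maps field names to an extractor.
CLASS_SETTINGS: Dict[str, Dict[str, Callable[[str], str]]] = {
    "Invoice": {"InvoiceNumber": lambda text: text.split("INV-")[1].split()[0]},
    "Letter": {"Greeting": lambda text: text.split(",")[0]},
}


def classify(doc: Document) -> None:
    # Step 1: decide which class the unknown document belongs to.
    doc.class_name = "Invoice" if "INV-" in doc.text else "Letter"


def extract(doc: Document) -> None:
    # Step 2: apply only the settings of the assigned class.
    for name, extractor in CLASS_SETTINGS[doc.class_name].items():
        doc.fields[name] = extractor(doc.text)


def process(doc: Document) -> Document:
    classify(doc)
    extract(doc)
    return doc
```

The point of the sketch is the ordering: extraction depends on the classification result, so a document cannot receive class-specific field results before it has a class.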

Processing of each document begins with the Document_BeforeProcessXDoc event in the project script, because the incoming document is still completely unknown at this point. This event can be used to control OCR, as shown in the following example.

Important You cannot call page recognition from a script when using the Kofax Clarity recognition engine.

If project fields are defined, their extraction events are executed first. The classification events, which are also part of the project script, follow. The results of the project field extraction can be used to classify the document. For example, a bar code result can be used to assign documents to a specific class.
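The bar code case can be illustrated with a short Python sketch. It models the idea only and is not product script code. The field name `Barcode`, the prefix convention, and the mapping `BARCODE_PREFIX_TO_CLASS` are assumptions made for this example.

```python
from typing import Dict, Optional

# Hypothetical mapping from a bar code prefix to a class name.
BARCODE_PREFIX_TO_CLASS: Dict[str, str] = {
    "INV": "Invoice",
    "PO": "PurchaseOrder",
}


def classify_by_barcode(project_fields: Dict[str, str]) -> Optional[str]:
    """Return a class name derived from the project-level bar code field.

    Returns None when no bar code was extracted or the prefix is unknown,
    so that another classification method can take over.
    """
    barcode = project_fields.get("Barcode", "")
    prefix = barcode.split("-", 1)[0]
    return BARCODE_PREFIX_TO_CLASS.get(prefix)
```

Because the project fields are extracted before the classification events run, their values are already available when the class is chosen.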

Once a document is classified, document extraction is performed. All locator methods are executed, and the document receives its field results through the assignment of locators to fields. Extraction is accompanied by extraction events that follow the defined class hierarchy and take field and locator inheritance into account. When the extracted document belongs to a child class, the extraction events for the inherited locators and inherited fields are also fired for all parent classes.
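The effect of inheritance on event firing can be modeled with a small Python sketch. This is one possible reading rather than a description of the engine: the class names, the locator names, and the tables `PARENT_OF` and `LOCATORS_DEFINED_IN` are hypothetical, and the order in which the events are listed (root class first) is an assumption that the text above does not specify.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical class hierarchy: child -> parent.
PARENT_OF: Dict[str, Optional[str]] = {
    "Base": None,
    "Invoice": "Base",
    "UtilityInvoice": "Invoice",
}

# Hypothetical locators, keyed by the class that defines them.
LOCATORS_DEFINED_IN: Dict[str, List[str]] = {
    "Base": ["DateLocator"],
    "Invoice": ["AmountLocator"],
    "UtilityInvoice": ["MeterLocator"],
}


def class_chain(class_name: str) -> List[str]:
    """Return the hierarchy from the root class down to class_name."""
    chain: List[str] = []
    current: Optional[str] = class_name
    while current is not None:
        chain.append(current)
        current = PARENT_OF[current]
    chain.reverse()
    return chain


def fired_events(class_name: str) -> List[Tuple[str, str]]:
    """List (class, locator) pairs whose extraction events would fire.

    An inherited locator fires its event in the class that defines it
    and in every class below it, down to the document's own class.
    """
    chain = class_chain(class_name)
    events: List[Tuple[str, str]] = []
    for index, defining_class in enumerate(chain):
        for locator in LOCATORS_DEFINED_IN.get(defining_class, []):
            for handler_class in chain[index:]:
                events.append((handler_class, locator))
    return events
```

In this model, a document classified as `UtilityInvoice` triggers the `DateLocator` event in `Base`, `Invoice`, and `UtilityInvoice`, which mirrors the statement that events for inherited locators are also fired for the parent classes.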

Important If foldering is enabled and folder fields are defined, you must not change any folder field in any of the document processing events, such as DocumentValidated. Multiple documents are processed in parallel, so such changes cannot be saved to the root Xfolder object.