Process documents
Documents are processed in two main steps. First, the project attempts to classify the document. Then the settings of the resulting class are applied to the document to extract its data.
Processing of each document starts with the Document_BeforeProcessXDoc event. This event belongs to the project script because the incoming document is still completely unknown at this point. It can be used to control OCR, as shown in the following example.
If project fields are defined, their extraction events are executed first. The classification events, which are also part of the project script, follow. The results of the project field extraction can be used to classify the document; for example, a bar code result can be used to assign documents to a specific class.
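The bar code case can be illustrated with a small, product-independent sketch. It is written in Python rather than in the project script language, and every name in it (the prefix table, the class names, the function) is hypothetical and only stands in for whatever a classification event would actually do with a project field result.

```python
# Hypothetical mapping from bar code prefix to class name.
# Neither the prefixes nor the class names come from the product.
BARCODE_PREFIX_TO_CLASS = {
    "INV": "Invoice",
    "ORD": "Order",
}


def classify_by_barcode(barcode_value, default_class=None):
    """Return the class whose prefix matches the bar code, else default_class."""
    if not barcode_value:
        # No bar code was extracted, so this rule cannot decide the class.
        return default_class
    for prefix, class_name in BARCODE_PREFIX_TO_CLASS.items():
        if barcode_value.startswith(prefix):
            return class_name
    return default_class
```

Under these assumptions, a bar code value such as "INV-00123" would select the Invoice class, while an empty or unrecognized value leaves the decision to the default.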
Once a document is classified, the document extraction is performed. All locator methods are executed, and the document receives its field results through the assignment of locators to fields. Extraction is accompanied by extraction events that follow the defined class hierarchy and take field and locator inheritance into account. When the extracted document belongs to a child class, the extraction events for the inherited locators and inherited fields are also fired for all parent classes.
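The effect of inheritance on the events can be pictured with a minimal sketch. It is plain Python, not project script code, and the class names and the hierarchy table are hypothetical. It only collects the classes whose scripts would see events for inherited locators and fields; the order in which the product actually fires them is not stated here, so the list order below reflects nothing more than the traversal from child to root.

```python
# Hypothetical class hierarchy: child -> parent (None marks the root).
PARENT_OF = {
    "Project": None,
    "Invoice": "Project",
    "InvoiceGermany": "Invoice",
}


def classes_receiving_events(document_class):
    """Return the document's class followed by all of its parent classes."""
    chain = []
    current = document_class
    while current is not None:
        chain.append(current)
        # Walk one level up the hierarchy.
        current = PARENT_OF[current]
    return chain
```

With this assumed hierarchy, a document classified as InvoiceGermany involves the scripts of InvoiceGermany, Invoice, and Project.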