Workflows

Workflows are defined in XML files (WorkflowXMLs). Each file can contain one or more job items. Multiple job items can run in sequence (dependent) or in parallel. The CSDK implements a set of predefined job items, and you can create custom job items in C++ or .NET. WorkflowXMLs can be created in the Designer as predefined workflows, in which case your application changes only the input and output file names, or your application can construct them programmatically. The Designer is a good starting point.

WorkflowXMLs are executed by the Intelligent Workflow Runner (IWR, OCRService). IWR is implemented as a pair of out-of-process COM servers, OCRService.exe and OCRServer.exe. They must be registered either by the Installer or by running OCRServer /regserver and OCRService /regserver at an administrator command prompt. With this default registration, the interactive user runs the two COM servers. Alternatively, you can register IWR as a service using the /service switch. In this case, the system account runs the processes.

A workflow is a structured group of job items. A job item starts to run when its parent job item has finished. Job items without a parent start to run when the workflow is started.
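The parent rule above can be illustrated with a small Python sketch. It is not CSDK code: the job item names and the `start_waves` helper are hypothetical, and grouping by depth is a simplification, since in a real run each job item starts as soon as its own parent has finished.

```python
from collections import defaultdict

def start_waves(parents):
    """Group job items by depth in the parent tree.

    `parents` maps a job item name to its parent name (None for items
    without a parent). Wave 0 starts with the workflow; an item in wave n
    cannot start before its parent in wave n-1 has finished.
    """
    children = defaultdict(list)
    for item, parent in parents.items():
        children[parent].append(item)

    waves = []
    current = sorted(children[None])  # items without a parent
    while current:
        waves.append(current)
        following = []
        for item in current:
            following.extend(children[item])
        current = sorted(following)
    return waves

# Hypothetical job item names, for illustration only.
workflow = {
    "load": None,
    "recognize": "load",
    "export_pdf": "recognize",
    "export_txt": "recognize",
}

print(start_waves(workflow))
# [['load'], ['recognize'], ['export_pdf', 'export_txt']]
```

Here the two export items share a parent, so they correspond to parallel job items, while `load` and `recognize` form a sequence.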

Job items use the native IproPlus interface and the interface of the document used by the parent job item. Usually the root job item creates the document. However, each job item can create a new document or modify the document received from its parent.

The .NET interface of the API is available to developers through the Kofax.OmniPageCSDK.IproPlus.JobService.dll assembly. The native C++ interface of the API is available through a type library embedded in the OCRService.exe file.

IWR has a very simple programming interface that handles WorkflowXML execution and management. There is no dependency between the CSDK and the API. The Run() method of the interface is asynchronous: applications receive the execution results in the Done event. The job ID of the Done event is the same as the job ID of the corresponding Run command. OCRService terminates any OCRServer for which the Done event reports a start or runtime error. For the parameters that provide information on the process, see Parameters of Done event.
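The asynchronous pattern described above, where a call returns a job ID immediately and the result arrives later in an event carrying the same ID, can be sketched in Python as follows. `FakeRunner`, its `run` method, and the `on_done` callback are hypothetical stand-ins used only to show the correlation by job ID. They are not the IWR interface, whose actual signatures are described in the API reference.

```python
import itertools
import threading
from concurrent.futures import ThreadPoolExecutor

class FakeRunner:
    """Hypothetical stand-in for an asynchronous runner; NOT the IWR API."""

    def __init__(self, on_done):
        self._ids = itertools.count(1)
        self._pool = ThreadPoolExecutor(max_workers=2)
        self._on_done = on_done

    def run(self, workflow_xml):
        job_id = next(self._ids)
        self._pool.submit(self._execute, job_id, workflow_xml)
        return job_id  # returns immediately, before the job has finished

    def _execute(self, job_id, workflow_xml):
        # Pretend that an empty workflow fails and anything else succeeds.
        status = "ok" if workflow_xml.strip() else "error"
        self._on_done(job_id, status)

    def close(self):
        self._pool.shutdown(wait=True)  # wait for all pending jobs

results = {}
results_lock = threading.Lock()

def on_done(job_id, status):
    # Called on a worker thread; store the result under the job ID.
    with results_lock:
        results[job_id] = status

runner = FakeRunner(on_done)
inputs = [("a.xml", "<workflow/>"), ("b.xml", "   ")]
submitted = {runner.run(xml): name for name, xml in inputs}
runner.close()

# Match each result back to its submission through the shared job ID.
report = {submitted[job_id]: status for job_id, status in results.items()}
print(report)
```

The results are matched to submissions only after all jobs have completed, because a Done notification may arrive before the caller has recorded the job ID returned by the run call.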

If logging is switched on for OCRService, start errors appear in the log, including each source code line of OCRService and OCRServer where the error was detected.

The OCRService part of IWR is also responsible for scaling up: it manages multiple OCRServer processes (CSDK processes). The application can control the number of workers. By default, OCRService starts as many workers as there are virtual CPUs available. OCRService is also responsible for managing the life cycle of the OCRServer processes to achieve 24x7 operation. OCRServer is the host of the CSDK binaries.
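As a rough illustration of the default described above, the following Python sketch picks a worker count equal to the number of logical (virtual) CPUs unless the application asks for a specific number. The `worker_count` helper is hypothetical and only mirrors the documented default. It does not show how OCRService is actually configured.

```python
import os

def worker_count(requested=None):
    """Return the requested number of workers, or one per logical CPU."""
    available = os.cpu_count() or 1  # cpu_count() may return None
    if requested is None:
        return available
    if requested < 1:
        raise ValueError("at least one worker is required")
    return requested

print(worker_count())   # one worker per logical CPU on this machine
print(worker_count(2))  # explicit choice made by the application
```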

The OCRService object exports the Run() method and the properties used for organizing the processing of a workflow. These properties can be set through the OCRService.ini file. The OCRService object creates the OCRService.ini file at the end of the first run and saves it in the %programdata%\Kofax\OmniPage OCR Service\ folder. If the OCRService.ini file exists, the OCRService object reads its content at the beginning of the process and overwrites it at the end of the process. Therefore, modify the OCRService.ini file only when no OCRService process is running. For the parameters of OCRService.ini, see Parameters of OCRService.ini.
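The read-at-start, overwrite-at-end behavior explains why an edit made while the process is running is lost. The Python sketch below simulates that life cycle on a temporary file. The section name `Settings` and the key `Workers` are hypothetical placeholders (the real parameter names are listed in Parameters of OCRService.ini), and the sketch assumes the file can be parsed by Python's `configparser`.

```python
import configparser
import os
import tempfile

def read_ini(path):
    parser = configparser.ConfigParser()
    parser.read(path)  # a missing file simply yields an empty parser
    return parser

def write_ini(parser, path):
    with open(path, "w") as handle:
        parser.write(handle)

def set_value(path, section, key, value):
    parser = read_ini(path)
    if not parser.has_section(section):
        parser.add_section(section)
    parser.set(section, key, value)
    write_ini(parser, path)

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "OCRService.ini")
    set_value(path, "Settings", "Workers", "4")  # hypothetical key

    # Case 1: edit while the (simulated) process is running.
    in_memory = read_ini(path)                   # process start: read
    set_value(path, "Settings", "Workers", "8")  # edit during the run
    write_ini(in_memory, path)                   # process end: overwrite
    lost_edit = read_ini(path).get("Settings", "Workers")

    # Case 2: edit while no process is running.
    set_value(path, "Settings", "Workers", "8")
    in_memory = read_ini(path)                   # next start reads the edit
    write_ini(in_memory, path)                   # and writes it back at the end
    kept_edit = read_ini(path).get("Settings", "Workers")

print(lost_edit, kept_edit)  # 4 8
```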

OCRService can be hosted in a Windows Service. You can operate multiple nodes with OCRService to implement an application with scale-out capabilities. In this model, you need to distribute the WorkflowXMLs to the OCRService nodes using queues, for instance, ServiceBus.
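The queue-based distribution can be sketched with an in-process queue and threads standing in for nodes. This is only a model of the pattern: in a real deployment the queue would be an external service and each node would pass the received WorkflowXML to its local OCRService. The node names and file names below are hypothetical.

```python
import queue
import threading

def node(name, work_queue, processed, lock):
    """Simulated node: take WorkflowXMLs from the queue until told to stop."""
    while True:
        workflow_xml = work_queue.get()
        if workflow_xml is None:  # sentinel: no more work for this node
            break
        # A real node would hand the WorkflowXML to its local OCRService here.
        with lock:
            processed.append((name, workflow_xml))

work_queue = queue.Queue()
processed = []
lock = threading.Lock()

nodes = [
    threading.Thread(target=node, args=(f"node-{i}", work_queue, processed, lock))
    for i in range(3)
]
for thread in nodes:
    thread.start()

workflows = [f"workflow-{n}.xml" for n in range(6)]
for workflow_xml in workflows:
    work_queue.put(workflow_xml)
for _ in nodes:
    work_queue.put(None)  # one sentinel per node
for thread in nodes:
    thread.join()

handled = sorted(workflow_xml for _, workflow_xml in processed)
print(handled)
```

Each WorkflowXML is taken from the queue by exactly one node, so adding nodes increases throughput without changing how the workflows are submitted.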