Zone handling module

A zone is a rectangular area of the page, or a union of rectangular areas arranged in a specific way. A zone can be at most the size of the full page. It contains a feature that is of interest to the user. A union of rectangles must have a "pizza box" shape: the top of each rectangle in the union, except the topmost one, must touch the bottom of the rectangle above it. A pizza box-shaped zone is a compound, irregular zone.
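The pizza box rule can be checked mechanically. The following is a minimal sketch in Python, not part of the module's API: the Rect type, the is_pizza_box function and the page-size parameters are hypothetical names introduced for illustration. It assumes image coordinates in which y grows downward, and it assumes that "touch" also implies some horizontal overlap between neighbouring rectangles, which the description above does not state explicitly.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Rect:
    """Hypothetical rectangle type; y grows downward, so bottom > top."""
    left: int
    top: int
    right: int
    bottom: int


def is_pizza_box(rects: Sequence[Rect], page_width: int, page_height: int) -> bool:
    """Return True if the rectangles form a valid zone under the rule above."""
    if not rects:
        return False

    # Every rectangle must be non-empty and lie within the page.
    for r in rects:
        inside_x = 0 <= r.left < r.right <= page_width
        inside_y = 0 <= r.top < r.bottom <= page_height
        if not (inside_x and inside_y):
            return False

    # Walk the rectangles from top to bottom and compare neighbours.
    ordered = sorted(rects, key=lambda r: r.top)
    for upper, lower in zip(ordered, ordered[1:]):
        # The top of each rectangle must touch the bottom of the one above it.
        if lower.top != upper.bottom:
            return False
        # Assumption: touching also requires overlapping horizontal extents.
        if lower.left >= upper.right or lower.right <= upper.left:
            return False
    return True
```

A single rectangle passes trivially, since it has no neighbour to touch; that is the ordinary rectangular zone.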

The image data covered by each zone is processed separately, according to zone-specific parameters.

This module handles two types of zones, kept in separate zone lists for each page: user zones and OCR zones. Zones can be added to the appropriate zone list of any given HPAGE in the following ways:

  • OCR zones: add zones automatically (auto-zoning)

  • User zones:

    • add zones manually (by specifying the zone coordinates and attributes)

    • add zones from a zone file (created by the user through API functions)

    • add zones from a template library (created by Form Template Editor)

You can start auto-zoning directly, or let it run automatically at the beginning of the recognition process. The automatic run happens only if no OCR zones are defined yet.

You can also adjust the method and performance of auto-zoning through various settings. When you use user zones, you can specify whether auto-zoning should also run. Each user zone is transformed into one or more OCR zones prior to recognition. Recognition is always carried out on OCR zones, and post-processing can also change the zones.
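The conditions under which auto-zoning runs at the start of recognition can be summarized as a small decision function. This is only one possible reading of the rules above, written as a Python sketch: the function name and all three parameters are hypothetical and do not correspond to actual settings or API calls of the module.

```python
def should_auto_zone(ocr_zone_count: int,
                     user_zone_count: int,
                     auto_zone_with_user_zones: bool) -> bool:
    """Decide whether auto-zoning runs when recognition starts.

    All names here are illustrative assumptions, not module settings.
    """
    # OCR zones already defined: auto-zoning does not start by itself.
    if ocr_zone_count > 0:
        return False
    # User zones present: the caller chooses whether auto-zoning also runs.
    if user_zone_count > 0:
        return auto_zone_with_user_zones
    # No zones at all: auto-zoning supplies the OCR zones.
    return True
```

In each case recognition itself still operates on OCR zones; the function only models whether the automatic zoning step is triggered.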

Based on their content, zones can be classified as follows:

  • Flowing text (horizontal or vertical with three rotations)

  • Table (handled by the Table Recognition Module)

  • Graphics (no recognition performed)

  • Form (handled by the Form Recognition Module)
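
The classification above maps naturally onto an enumeration with a lookup table for the component that handles each content type. The sketch below is illustrative Python: ZoneContent, HANDLERS and handler_for are hypothetical names, and the entry for flowing text is an assumption, since the list above does not name the component that handles it.

```python
from enum import Enum
from typing import Optional


class ZoneContent(Enum):
    """Hypothetical enumeration of the content types listed above."""
    FLOWING_TEXT = "flowing text"
    TABLE = "table"
    GRAPHICS = "graphics"
    FORM = "form"


# Which component handles each content type; None means no recognition.
HANDLERS = {
    ZoneContent.FLOWING_TEXT: "text recognition",  # assumption: not named above
    ZoneContent.TABLE: "Table Recognition Module",
    ZoneContent.GRAPHICS: None,
    ZoneContent.FORM: "Form Recognition Module",
}


def handler_for(content: ZoneContent) -> Optional[str]:
    """Return the name of the handling component, or None for graphics."""
    return HANDLERS[content]
```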