Sender identification: Overview

The ability to automatically identify the sender of a document is a key factor in processing invoices successfully and efficiently.

To facilitate this in Kofax ReadSoft Entrance, the information found on a document to identify it is included in the process log by default. This helps explain why documents are or are not identified properly.

Kofax ReadSoft Entrance can now also change the identifiers automatically when the operator selects them in Verify. As a result, the document may not need to be sent to Optimize for manual optimization.

The identification process

Kofax ReadSoft Entrance first tries to match each incoming document with an existing document definition. It uses heuristics to reduce the number of document definitions that need to be checked. The results are then listed in the process log (if used):

  • If no definition candidates are found, Kofax ReadSoft Entrance does not continue to look for identifiers, and the document is considered unidentified.
  • If unambiguous identifiers are found for a single definition candidate, that definition is connected with the document.
  • If definition candidates are found, but unambiguous identifiers cannot be matched for a single definition, the document cannot be connected with a definition, and it is considered unidentified. The matching definitions are listed in the process log, which indicates how well the heuristics are working. This includes situations where AND is used in the identification settings (meaning that more than one identifier is required for a definition to match).
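The three outcomes above can be sketched in a few lines of Python. This is only an illustration of the decision logic, not the product's implementation: the DefinitionCandidate class, the require_all flag (standing in for the AND setting), and the plain substring test are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DefinitionCandidate:
    # Hypothetical stand-in for a document definition found by the heuristics.
    name: str
    identifiers: List[str]
    require_all: bool = False  # True models AND: every identifier must be present


def matches(candidate: DefinitionCandidate, text: str) -> bool:
    """Assumed matching rule: an identifier matches if it occurs in the text."""
    hits = [identifier in text for identifier in candidate.identifiers]
    if not hits:
        return False
    return all(hits) if candidate.require_all else any(hits)


def connect_definition(candidates: List[DefinitionCandidate], text: str) -> Optional[str]:
    """Return the one definition to connect, or None if the document is unidentified."""
    matched = [c.name for c in candidates if matches(c, text)]
    # No candidates, no match, or more than one match all leave the document unidentified.
    return matched[0] if len(matched) == 1 else None
```

With this sketch, a document is connected only when exactly one candidate matches; an empty candidate list and an ambiguous result both return None.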

If a document cannot be connected to a document definition, Interpret continues to search the document for sender information connected with master data if:

The SupplierNumber, CorporateGroupId, and Location fields are currently used as the identifiers needed to qualify a unique match with a sender in master data. Consequently, these fields must be included in the master data, and the master data must be imported before importing EHIX files with document definitions that include these fields, so that the fields can be mapped.

The way senders are identified differs for image-based and XML documents:

How senders are identified for image-based documents

Kofax ReadSoft Entrance takes all of the text in the document and compares it to the following fields in the sender registry to find a match. The fields are searched in this specific order:

  • VAT registration number (shown in the Senders dialog)
  • Tax number 1 (shown in the Senders dialog)
  • Tax number 2 (shown in the Senders dialog)
  • Bank account number (shown in the View sender dialog)
  • IBAN (shown in the View sender dialog)
  • Telephone number (shown in the View sender dialog)
  • Fax number (shown in the View sender dialog)

If an exact match is found using these fields, the sender is considered to have been identified, and the program stops searching.

If an exact match is not found using these fields, the program continues to look at the name and address fields. If a match is found for a single candidate, the sender is considered to have been identified. However, if an exact match is still not found using the additional information, the sender cannot be identified.
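The ordered search over the registry fields can be illustrated with a short Python sketch. The field keys, the dictionary-based registry, and the substring comparison are assumptions chosen for the example; the sketch covers only the first stage (the seven fields listed above) and leaves the name and address stage out.

```python
from typing import Dict, List, Optional, Tuple

# Search order taken from the list above; the key names themselves are hypothetical.
FIELD_ORDER = [
    "vat_registration_number",
    "tax_number_1",
    "tax_number_2",
    "bank_account_number",
    "iban",
    "telephone_number",
    "fax_number",
]

Sender = Dict[str, str]


def find_by_ordered_fields(text: str, registry: List[Sender]) -> Tuple[Optional[str], List[Sender]]:
    """Return the first field that produces a match, and the senders it matched."""
    for field in FIELD_ORDER:
        hits = [s for s in registry if s.get(field) and s[field] in text]
        if hits:
            # Stop at the first field that yields a match; later fields are not tried.
            return field, hits
    return None, []
```

If the function returns no field, the search would continue with the name and address fields as described above.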

If multiple candidates are found, the search will only continue if the candidates are considered to be the same company. The set of candidates is then narrowed by successively intersecting matches found on Name, Address (combined field), Street, City, Postal code, and Bank code. When multiple candidates are found with the same name, one of them is selected as a match, but all matches are displayed in Verify for the operator to pick from.
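The narrowing step can be pictured as repeated filtering. The following Python sketch is a simplified model under stated assumptions: the field keys are hypothetical, a field "matches" when its value occurs in the document text, and a field that matches none of the remaining candidates is skipped rather than emptying the set. How the product handles that last case is not documented here.

```python
from typing import Dict, List

# Narrowing order taken from the paragraph above; the key names are hypothetical.
NARROWING_ORDER = ["name", "address", "street", "city", "postal_code", "bank_code"]

Sender = Dict[str, str]


def narrow_candidates(text: str, candidates: List[Sender]) -> List[Sender]:
    """Successively keep only the candidates that also match the next field."""
    remaining = list(candidates)
    for field in NARROWING_ORDER:
        if len(remaining) <= 1:
            break  # nothing left to disambiguate
        hits = [c for c in remaining if c.get(field) and c[field] in text]
        if hits:
            # Intersect: only candidates matching this field as well stay in the set.
            remaining = hits
    return remaining
```

If more than one candidate remains at the end, all of them would be offered to the operator in Verify, as described above.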

When the sender is identified successfully:

  • The document is automatically connected to the correct document definition. If no definition is currently found in the system for the sender, a new one will be created in Verify.
  • If invoice sorting is used, the sender's country code is used to connect the document to the correct country profile.

The fields that are used to identify senders for documents can be specified in the [OcrSupplierIdentification] and [OcrSupplierBankIdentification] sections of eiglobal.ini. However, the order in which the fields are searched cannot be changed. For more information about these settings, see the INI file help.

How to prevent receivers from being identified as senders

In some situations, receivers can be mistaken for senders during identification (for example, when one subsidiary sends an invoice to another subsidiary of the same company). Before a field from a sender is matched against the OCR data in a document, it is matched against the receiver data. If a match is found there, the field is not used to identify the document, because the match could be caused by receiver data being present on the document.

To help avoid this:

  • Include receiver information in the receiver registry. Corresponding information that is found in the sender registry is then automatically excluded from the search for senders.
  • Include information in the receiver and sender registries (VAT registration numbers or account numbers, for example) that allows Kofax ReadSoft Entrance to make the connection between receivers and senders that are very similar to each other. When matching information is found in both places, the corresponding receivers are removed from consideration as senders.

For more information, see the [OcrBuyerSuppression] section and the CheckIfSupplierIsBuyer flag in the [OcrSupplierIdentification] section in the INI file help.
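The receiver check described in this section can be sketched as a filter over a sender's fields. This Python example is an illustration only; the record layout and the exact-equality comparison are assumptions, and the real behavior is governed by the INI settings mentioned above.

```python
from typing import Dict, List

Record = Dict[str, str]


def usable_sender_fields(sender: Record, receivers: List[Record]) -> Record:
    """Drop every sender field whose value also appears in the receiver data."""
    receiver_values = {value for r in receivers for value in r.values() if value}
    # A value shared with a receiver could stem from receiver data printed on
    # the document, so it is not used to identify the sender.
    return {
        field: value
        for field, value in sender.items()
        if value and value not in receiver_values
    }
```

In this model, a telephone number shared by two subsidiaries is ignored, while a VAT registration number that appears only in the sender record is still used.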

How senders are identified in connection with XML invoices

Sender data registry information is not used to identify senders for XML invoices. XML invoices are identified using the Supplier ID. If it is not found, the program tries to find the sender using the Sender name, number, and description.
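The fallback order for XML invoices can be sketched as follows. This Python example rests on several assumptions: the dictionary keys, the known_senders list (a hypothetical collection of sender records to match against), and the rule that a fallback field counts only when it matches exactly one record.

```python
from typing import Dict, List, Optional

Record = Dict[str, str]

# Fallback fields named in the paragraph above; the key names are hypothetical.
FALLBACK_FIELDS = ("name", "number", "description")


def identify_xml_sender(invoice: Record, known_senders: List[Record]) -> Optional[Record]:
    """Try the Supplier ID first, then fall back to name, number, and description."""
    supplier_id = invoice.get("supplier_id")
    if supplier_id:
        for sender in known_senders:
            if sender.get("supplier_id") == supplier_id:
                return sender
    for field in FALLBACK_FIELDS:
        value = invoice.get(field)
        if not value:
            continue
        hits = [s for s in known_senders if s.get(field) == value]
        if len(hits) == 1:  # assumed: only an unambiguous hit identifies the sender
            return hits[0]
    return None
```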

Both the fields that are used and the order in which they are used can be specified in the [SupplierIdentification] section in eiglobal.ini. Buyer information can also be used for identification by means of the [BuyerIdentification] section in eiglobal.ini. For more information about these settings, see the INI file help.