Regular Expressions

Regular expressions are used to recognize patterns in textual data. They evaluate text and match an expression against the text in the document. In Kofax TotalAgility, regular expressions are used in format locators, validation methods, and formatters to identify and normalize items on a document.

Regular expressions describe data in an abstract way. The following table lists some common syntax elements:

Table 1. Regular Expression Syntax
Format             Description                                              Example         Matches               Does Not Match
-----------------  -------------------------------------------------------  --------------  --------------------  ---------------------
c                  One character                                            a               a                     b, A
. (period)         Any single character                                     b.g             bug, bag, big, bbg    bg, baag
\d                 Any single digit                                         a\d             a5, a8, a0            aA, ab, a
[c1c2c3]           One character out of a set                               [abc]           a, b, c               1, 2, d, D, A, ab, bc
[c1-cn]            One character out of a range                             [a-z]           b, g, x               1, 2, D, A
? (question mark)  The previous term is optional                            x\d?            x, x7, x1             xx, xq
+ (plus sign)      The previous term can be repeated one or more times      \d+             4, 2323, 100          A112, 2b, X
* (asterisk)       The previous term can be repeated zero or more times     x\d*            x6, x, x100           100x, xx
{n}                The previous term can be repeated exactly n times        y{3}            yyy                   yy, yyyy
{m,n}              The previous term can be repeated between m and n times  \d{5,9}         12345, 999999999      1234, 999999999999
\                  Escape special characters                                \$ \\ \- \? \.  $ \ - ? .             !%
()                 Group characters                                         a(\$\$)?b       a$$b, ab              a$b, a$$
(e1|e2)            Choice                                                   (abc|ABC)       abc, ABC              aBC, AbC
\n                 Back reference: the nth group must match the same text   (\d)x\1         1x1, 2x2, 3x3, 4x4... 1x2, 6x7...
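
The table entries can be checked mechanically. The following is a minimal sketch that uses Python's standard re module as a stand-in; it is not the Kofax TotalAgility engine, whose behavior may differ in details. It assumes that "Matches" means the whole string must match the expression, which is why re.fullmatch is used. The names TABLE_ROWS and check_row are hypothetical and exist only for this illustration.

```python
import re

# Each entry: (expression, strings expected to match, strings expected not to match).
# The values are copied from the syntax table; the structure itself is illustrative.
TABLE_ROWS = [
    (r"a", ["a"], ["b", "A"]),
    (r"b.g", ["bug", "bag", "big", "bbg"], ["bg", "baag"]),
    (r"a\d", ["a5", "a8", "a0"], ["aA", "ab", "a"]),
    (r"[abc]", ["a", "b", "c"], ["1", "2", "d", "D", "A", "ab", "bc"]),
    (r"[a-z]", ["b", "g", "x"], ["1", "2", "D", "A"]),
    (r"x\d?", ["x", "x7", "x1"], ["xx", "xq"]),
    (r"\d+", ["4", "2323", "100"], ["A112", "2b", "X"]),
    (r"x\d*", ["x6", "x", "x100"], ["100x", "xx"]),
    (r"y{3}", ["yyy"], ["yy", "yyyy"]),
    (r"\d{5,9}", ["12345", "999999999"], ["1234", "999999999999"]),
    # Escaped special characters match themselves literally.
    (r"\$", ["$"], ["!", "%"]),
    (r"\\", ["\\"], ["!", "%"]),
    (r"\-", ["-"], ["!", "%"]),
    (r"\?", ["?"], ["!", "%"]),
    (r"\.", ["."], ["!", "%"]),
    (r"a(\$\$)?b", ["a$$b", "ab"], ["a$b", "a$$"]),
    (r"(abc|ABC)", ["abc", "ABC"], ["aBC", "AbC"]),
    (r"(\d)x\1", ["1x1", "2x2", "3x3", "4x4"], ["1x2", "6x7"]),
]


def check_row(pattern, should_match, should_not_match):
    """Return True if every positive example matches in full and no negative one does."""
    compiled = re.compile(pattern)
    accepts_all = all(compiled.fullmatch(text) for text in should_match)
    rejects_all = not any(compiled.fullmatch(text) for text in should_not_match)
    return accepts_all and rejects_all


for pattern, good, bad in TABLE_ROWS:
    status = "ok" if check_row(pattern, good, bad) else "MISMATCH"
    print(f"{pattern:12} {status}")
```

Whole-string matching matters here: with a substring search, \d+ would find the digits inside A112, so that entry only belongs in the "Does Not Match" column when the entire input has to fit the expression.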

You can find many third-party resources on the internet about regular expressions. In many cases, however, extensive knowledge of regular expressions is not needed because Kofax TotalAgility provides a set of commonly used, predefined templates.

Note: You can also use dictionaries in regular expressions for format locators. If you know the name of the dictionary, you can edit the input box of the format locator directly and type the dictionary name in the form "§ + dictionary name + §". Because most keyboards have no key for the § symbol, you can enter it on Windows by holding Alt and typing 0167 on the numeric keypad.
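
As a small illustration of the dictionary notation, the sketch below builds such a reference as a string. The § character is code point 167 (U+00A7), the same number entered with the Alt code. The function name dictionary_reference and the dictionary name "Months" are hypothetical; only the "§ + dictionary name + §" form comes from the note above.

```python
# Code point 167 (U+00A7) is the section sign, the character produced by Alt + 0167.
SECTION_SIGN = chr(167)


def dictionary_reference(name: str) -> str:
    """Wrap a dictionary name in section signs (hypothetical helper for illustration)."""
    return f"{SECTION_SIGN}{name}{SECTION_SIGN}"


# "Months" is an assumed dictionary name, not one defined by the product.
print(dictionary_reference("Months"))
```

The resulting text is what would be typed or pasted into the input box of the format locator.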