Regular Expression Syntax

The component uses ECMAScript regular expression syntax.

Special characters

Characters Description Matches
. Not newline Any character except line terminators (LF, CR, LS, PS).
\t Tab (HT) A horizontal tab character (same as \u0009).
\n Newline (LF) A newline (line feed) character (same as \u000A).
\v Vertical tab (VT) A vertical tab character (same as \u000B).
\f Form feed (FF) A form feed character (same as \u000C).
\r Carriage return (CR) A carriage return character (same as \u000D).
\cletter Control code A control code character whose code unit value is the same as the remainder of dividing the code unit value of letter by 32. For example, \ca is the same as \u0001, \cb the same as \u0002, and so on.
\xhh ASCII character A character whose code unit value equals the two-digit hexadecimal number hh. For example, \x4c is the same as L, and \x23 is the same as #.
\uhhhh Unicode character A character whose code unit value equals the four-digit hexadecimal number hhhh.
\0 Null A null character (same as \u0000).
\int Backreference The text matched by the int-th submatch, counting opening parentheses from the left (int must begin with a digit other than 0). See groups below for more info.
\d Digit A decimal digit character (same as [[:digit:]]).
\D Not digit Any character that is not a decimal digit character (same as [^[:digit:]]).
\s Whitespace A whitespace character (same as [[:space:]]).
\S Not whitespace Any character that is not a whitespace character (same as [^[:space:]]).
\w Word An alphanumeric or underscore character (same as [_[:alnum:]]).
\W Not word Any character that is not an alphanumeric or underscore character (same as [^_[:alnum:]]).
\character Character The character itself, taken literally rather than with its special meaning in a regular expression. Any character can be escaped except those that form one of the special character sequences above. Escaping is needed for: ^ $ \ . * + ? ( ) [ ] { } |
[class] Character class The target character is one of the characters in the class.
[^class] Negated character class The target character is not one of the characters in the class.
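
The escapes and character classes in the table can be tried out with a short sketch. It assumes that JavaScript's built-in RegExp, which also follows ECMAScript syntax, behaves like the component described here; the variable names are illustrative only, and the POSIX-style classes such as [[:digit:]] are left out because JavaScript does not accept them.

```typescript
// Assumption: JavaScript's built-in RegExp stands in for "the component".

// \x4c (hex escape) and \u004c (Unicode escape) both denote the letter "L".
const hexEscapeMatchesL: boolean = /^\x4c$/.test("L");
const unicodeEscapeMatchesL: boolean = /^\u004c$/.test("L");

// \ca denotes the control character U+0001 (code unit of "a" modulo 32).
const controlEscapeMatches: boolean = /^\ca$/.test("\u0001");

// \d picks out decimal digits; \D picks out everything else.
const digitsFound: string[] = "a1b22c".match(/\d/g) || [];
const nonDigitsFound: string[] = "a1b22c".match(/\D/g) || [];

// An unescaped dot matches any character; \. matches only a literal dot.
const looseDot: boolean = /^1.5$/.test("1x5");
const literalDot: boolean = /^1\.5$/.test("1x5");

// [class] matches one character from the set; [^class] one character outside it.
const vowelsFound: string[] = "regex".match(/[aeiou]/g) || [];
const othersFound: string[] = "regex".match(/[^aeiou]/g) || [];

// \1 refers back to whatever the first parenthesized submatch captured.
const hasDoubledLetter: boolean = /(\w)\1/.test("hello");

console.log(hexEscapeMatchesL, unicodeEscapeMatchesL, controlEscapeMatches);
console.log(digitsFound, nonDigitsFound);
console.log(looseDot, literalDot);
console.log(vowelsFound, othersFound, hasDoubledLetter);
```

The pair looseDot and literalDot shows why escaping matters: the same input passes with the bare dot and fails once the dot is escaped.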

Quantifiers

Quantifiers follow a character or a special pattern character and specify how many times it may be repeated in the match:

Characters Times Effects
* 0 or more The preceding atom is matched 0 or more times.
+ 1 or more The preceding atom is matched 1 or more times.
? 0 or 1 The preceding atom is optional (matched either 0 times or once).
{int} int The preceding atom is matched exactly int times.
{int,} int or more The preceding atom is matched int or more times.
{min,max} Between min and max The preceding atom is matched at least min times, but not more than max.

By default, all these quantifiers are greedy: they take as many characters that meet the condition as possible. Adding a question mark (?) after the quantifier makes it non-greedy (lazy), so that it takes as few characters that meet the condition as possible.
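
The difference between greedy and non-greedy matching, and the counted forms, can be illustrated with a minimal sketch. It assumes JavaScript's built-in RegExp as a stand-in for the component; firstMatch is a hypothetical helper written for this example, not part of any library.

```typescript
// Assumption: JavaScript's built-in RegExp stands in for "the component".

// Hypothetical helper: returns the text of the first match, or null if none.
function firstMatch(pattern: RegExp, text: string): string | null {
  const result = pattern.exec(text);
  return result === null ? null : result[0];
}

const markupSample = "<b>bold</b>";

// Greedy: .+ runs to the end and backtracks only as far as the last ">".
const greedyMatch = firstMatch(/<.+>/, markupSample); // "<b>bold</b>"

// Non-greedy: .+? stops at the first ">" that lets the pattern succeed.
const lazyMatch = firstMatch(/<.+?>/, markupSample); // "<b>"

// ? makes the preceding atom optional.
const optionalU: boolean[] = ["color", "colour"].map(
  (word: string) => /^colou?r$/.test(word)
);

// {int}, {int,} and {min,max} bound the repetition count.
const exactlyThree: boolean = /^a{3}$/.test("aaa");
const atLeastTwo: boolean = /^a{2,}$/.test("aaaaaa");
const twoToFour: boolean[] = ["a", "aa", "aaaa", "aaaaa"].map(
  (run: string) => /^a{2,4}$/.test(run)
);

console.log(greedyMatch, lazyMatch);
console.log(optionalU, exactlyThree, atLeastTwo, twoToFour);
```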

Assertions

Assertions are conditions that do not consume characters in the target sequence: they do not describe a character, but a condition that must be fulfilled before or after a character.

Characters Description Condition for match
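^ Beginning of line Either the beginning of the target sequence or the position right after a line terminator (in ECMAScript, the latter applies only when multiline matching is enabled).
$ End of line Either the end of the target sequence or the position right before a line terminator (in ECMAScript, the latter applies only when multiline matching is enabled).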
\b Word boundary The previous character is a word character and the next is a non-word character (or vice versa). The beginning and the end of the target sequence are considered here as non-word characters.
\B Not a word boundary The previous and next characters are both word characters or both non-word characters. The beginning and the end of the target sequence are considered here as non-word characters.
(?=subpattern) Positive lookahead The characters following the assertion must match subpattern, but no characters are consumed.
(?!subpattern) Negative lookahead The characters following the assertion must not match subpattern, but no characters are consumed.
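
A short sketch can make the "no characters are consumed" property concrete. It assumes JavaScript's built-in RegExp as a stand-in for the component, and the sample strings are invented for illustration.

```typescript
// Assumption: JavaScript's built-in RegExp stands in for "the component".

// ^ and $ pin the pattern to the ends of the target sequence.
const anchoredExact: boolean = /^abc$/.test("abc");
const anchoredWithPrefix: boolean = /^abc$/.test("xabc");

// \b requires a word/non-word transition on each side, so the "cat"
// inside "concat" is skipped.
const wholeWords: string[] = "cat concat".match(/\bcat\b/g) || [];

// \B requires the absence of such a transition.
const catInsideWord: boolean = /\Bcat/.test("concat");
const catAtStart: boolean = /\Bcat/.test("cat");

// Positive lookahead: digits must be followed by "px", but "px" is not
// part of the match.
const pixelValues: string[] = "100px 200em".match(/\d+(?=px)/g) || [];

// Because the lookahead consumes nothing, "px" survives the replacement.
const replacedNumber: string = "100px".replace(/\d+(?=px)/, "N");

// Negative lookahead: reject any input that starts with "draft".
const notDraft: boolean[] = ["draft1", "final1"].map(
  (name: string) => /^(?!draft)\w+$/.test(name)
);

console.log(anchoredExact, anchoredWithPrefix, wholeWords);
console.log(catInsideWord, catAtStart);
console.log(pixelValues, replacedNumber, notDraft);
```

The replacement result is the clearest evidence that an assertion only checks a condition: the text it inspected is still present afterwards.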

Alternatives

A pattern can include different alternatives:

Characters Description Effects
| Separator Separates two alternative patterns or subpatterns.
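
As a brief illustration of the separator, the sketch below again assumes JavaScript's built-in RegExp as a stand-in for the component. It also shows that, without parentheses, | divides the entire pattern, anchors included.

```typescript
// Assumption: JavaScript's built-in RegExp stands in for "the component".

// Either alternative may match at any position in the target sequence.
const petsFound: string[] = "a cat and a dog".match(/cat|dog/g) || [];

// Inside parentheses, | separates alternative subpatterns only.
const graySpellings: boolean[] = ["gray", "grey", "groy"].map(
  (word: string) => /^gr(a|e)y$/.test(word)
);

// Without parentheses the alternatives are "^ab" and "cd$",
// so a target that merely starts with "ab" is accepted.
const ungroupedAnchors: boolean = /^ab|cd$/.test("abX");

// Grouping the alternatives applies both anchors to each of them.
const groupedAnchors: boolean = /^(ab|cd)$/.test("abX");

console.log(petsFound, graySpellings, ungroupedAnchors, groupedAnchors);
```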