A regular expression specifies a set of character strings. A member of this set of strings is said to be matched by the regular expression. Some characters have special meaning when used in a regular expression; other characters stand for themselves.

The regular expressions available for use with the regexp functions are constructed as follows:

Expression    Meaning

c             the character c where c is not a special character.
\c            the character c where c is any character, except a digit in the range 1-9.
^             the beginning of the line being compared.
$             the end of the line being compared.
.             any character in the input.
[s]           any character in the set s, where s is a sequence of characters and/or a range of characters, for example, [c-c].
[^s]          any character not in the set s, where s is defined as above.
r*            zero or more successive occurrences of the regular expression r. The longest leftmost match is chosen.
rx            the occurrence of regular expression r followed by the occurrence of regular expression x. (Concatenation)
r\{m,n\}      any number of m through n successive occurrences of the regular expression r. The regular expression r\{m\} matches exactly m occurrences; r\{m,\} matches at least m occurrences.
\(r\)         the regular expression r. When \n (where n is a number greater than zero) appears in a constructed regular expression, it stands for the regular expression x where x is the nth regular expression enclosed in \( and \) that appeared earlier in the constructed regular expression. For example, \(r\)x\(y\)z\2 is the concatenation of regular expressions rxyzy.
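
The constructs listed above can be tried out with a short sketch. It uses Python's re module as a stand-in, which is an assumption and not the regexp functions described here: Python writes r\{m,n\} as r{m,n} and \(r\) as (r), while the other constructs shown look the same. The helper name matched is hypothetical.

```python
import re

# Assumption: Python's re module stands in for the regexp functions.
# Its syntax writes r\{m,n\} as r{m,n} and \(r\) as (r).

def matched(pattern, text):
    """Return the first substring of text matched by pattern, or None."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

star = matched(r"ab*", "xabbbz")             # r*: 'a' followed by zero or more 'b'
char_set = matched(r"[0-9]", "ab7c")         # [s]: any character in the set
complement = matched(r"[^a-z]", "ab7c")      # [^s]: any character not in the set
interval = matched(r"a{2,3}", "caaaat")      # r{m,n}: 2 through 3 occurrences
backref = matched(r"(ab)c\1", "xabcabx")     # \1 refers back to the first group
no_backref = matched(r"(ab)c\1", "xabcbax")  # "ba" does not satisfy \1

print(star, char_set, complement, interval, backref, no_backref)
# abbb 7 7 aaa abcab None
```

Note that the interval example returns three characters, not two or four: the match takes as many occurrences as the upper bound allows.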

Characters that have special meaning except when they appear within square brackets ([]) or are preceded by \ are: ., *, [, \. Other special characters, such as $, have special meaning only in more restricted contexts.

The character ^ at the beginning of an expression permits a successful match only immediately after a newline, and the character $ at the end of an expression requires a trailing newline.
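
The effect of the two anchors can be illustrated loosely with Python's re module, again as an assumed stand-in for the regexp functions. With the re.MULTILINE flag, ^ matches just after each newline and $ matches just before each newline; unlike the rule stated above, Python also lets ^ match at the very start of the text and $ at its very end.

```python
import re

text = "first line\nsecond line\n"

# ^ anchors the match to the start of a line.
starts = re.findall(r"^[a-z][a-z]*", text, flags=re.MULTILINE)

# $ anchors the match to the end of a line, just before the newline.
ends = re.findall(r"[a-z][a-z]*$", text, flags=re.MULTILINE)

# "line" never begins a line in this text, so the anchored search finds nothing.
anchored = re.findall(r"^line", text, flags=re.MULTILINE)

print(starts)    # ['first', 'second']
print(ends)      # ['line', 'line']
print(anchored)  # []
```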

Two characters have special meaning only when used within square brackets. The character - denotes a range, [c-c], unless it is just after the open bracket or just before the closing bracket, [-c] or [c-], in which case it has no special meaning. When used within brackets, the character ^ has the meaning "complement of" if it immediately follows the open bracket (example: [^c]); elsewhere between brackets (example: [c^]) it stands for the ordinary character ^.
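
The bracket rules above can be checked one character at a time. The sketch below assumes Python's re module, which treats - and ^ inside brackets the same way for these cases; the helper chars_matching and the sample string are hypothetical.

```python
import re

def chars_matching(bracket_expr, candidates):
    """Return the characters of candidates that bracket_expr matches."""
    return "".join(ch for ch in candidates if re.fullmatch(bracket_expr, ch))

candidates = "abc-^"

in_range = chars_matching(r"[a-c]", candidates)      # - between two characters: a range
leading_dash = chars_matching(r"[-c]", candidates)   # - just after [ : ordinary character
trailing_dash = chars_matching(r"[c-]", candidates)  # - just before ] : ordinary character
complement = chars_matching(r"[^c]", candidates)     # ^ just after [ : complement of the set
literal_caret = chars_matching(r"[c^]", candidates)  # ^ elsewhere: ordinary character

print(in_range, leading_dash, trailing_dash, complement, literal_caret)
# abc c- c- ab-^ c^
```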

The special meaning of the \ operator can be escaped only by preceding it with another \, for example \\.
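
A brief sketch of escaping, assuming Python's re module as a stand-in for the regexp functions. Raw strings (r"...") are used so that each backslash in the pattern reaches the regular expression unchanged; the variable names are hypothetical.

```python
import re

# The pattern a\\b holds two backslashes: the first escapes the second,
# so it matches the three characters a, \, b.
literal_backslash = re.fullmatch(r"a\\b", "a\\b") is not None

# \. removes the special meaning of '.', so only a real dot matches.
escaped_dot = re.fullmatch(r"a\.b", "a.b") is not None
escaped_dot_rejects = re.fullmatch(r"a\.b", "axb") is None

# Without the backslash, '.' matches any character.
unescaped_dot = re.fullmatch(r"a.b", "axb") is not None

# \* removes the special meaning of '*'.
escaped_star = re.fullmatch(r"a\*", "a*") is not None

print(literal_backslash, escaped_dot, escaped_dot_rejects,
      unescaped_dot, escaped_star)
# True True True True True
```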