Regular Expressions

The table below lists the regular expression symbols that can be used in a field of the Regular expression type (see Field Properties Dialog Box).

Item name                 Conventional symbol  Usage examples and explanations
Any character             .                    c.t denotes "cat," "cot," etc.
Character from group      []                   [b-d]ell denotes "bell," "cell," and "dell."
                                               [ty]ell denotes "tell" and "yell."
Character not from group  [^]                  [^y]ell allows "dell," "cell," and "tell," but forbids "yell."
                                               [^n-s]ell allows "bell" and "cell," but forbids "nell," "oell," "pell," "qell," "rell," and "sell."
Or                        |                    c(a|u)t denotes "cat" and "cut."
0 or more matches         *                    10* denotes numbers 1, 10, 100, 1000, etc.
1 or more matches         +                    10+ allows numbers 10, 100, 1000, etc., but forbids 1.
Letter or digit           [0-9a-zA-Zа-яА-Я]    [0-9a-zA-Zа-яА-Я] allows any single letter or digit.
                                               [0-9a-zA-Zа-яА-Я]+ allows any word made up of letters and digits.
Capital Latin letter      [A-Z]
Small Latin letter        [a-z]
Capital Cyrillic letter   [А-Я]
Small Cyrillic letter     [а-я]
Digit                     [0-9]
Space                     \s
                          @                    Reserved.
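
The symbols in the table can be tried out before they are entered into a field. The following is a minimal sketch in Python, assuming that Python's re module treats these constructs the same way as the table describes and that a field value is accepted only when the whole value matches; the product's own engine may differ in details. The helper name is_valid is hypothetical.

```python
import re


def is_valid(pattern: str, value: str) -> bool:
    # Assumption: a value is accepted only if the WHOLE value matches.
    return re.fullmatch(pattern, value) is not None


# (pattern, value, expected result) triples taken from the table rows.
CHECKS = [
    (r"c.t", "cot", True),                    # any character
    (r"c.t", "ct", False),                    # "." needs exactly one character
    (r"[b-d]ell", "dell", True),              # character from group (range)
    (r"[ty]ell", "bell", False),              # character from group (list)
    (r"[^y]ell", "tell", True),               # character not from group
    (r"[^y]ell", "yell", False),
    (r"[^n-s]ell", "sell", False),
    (r"c(a|u)t", "cut", True),                # or
    (r"10*", "1", True),                      # 0 or more matches
    (r"10+", "1", False),                     # 1 or more matches
    (r"10+", "1000", True),
    (r"[0-9a-zA-Zа-яА-Я]+", "Слово42", True),  # letters and digits
    (r"[0-9a-zA-Zа-яА-Я]", "?", False),
    (r"one\stwo", "one two", True),           # space
]

for pattern, value, expected in CHECKS:
    result = is_valid(pattern, value)
    status = "ok" if result == expected else "MISMATCH"
    print(f"{pattern:22} {value:10} {result!s:6} {status}")
```

Note that in Python \s also matches tabs and line breaks, which may be broader than what the table means by "Space."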

Note. To use a regular expression symbol as a normal character, precede it with a backslash. For example, [t-v]x+ stands for tx, txx, txxx, ..., ux, uxx, etc., but \[t-v\]x+ stands for [t-v]x, [t-v]xx, [t-v]xxx, etc.
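
The effect of the backslash can be illustrated with a short Python sketch. It assumes that Python's re module handles escaping the same way as described in the note; the variable names are chosen for illustration only.

```python
import re

# Without backslashes, [t-v] is a character group: t, u, or v.
unescaped = re.compile(r"[t-v]x+")
# With backslashes, the square brackets are ordinary characters.
escaped = re.compile(r"\[t-v\]x+")

for value in ["tx", "uxx", "[t-v]x", "[t-v]xxx"]:
    print(
        f"{value:10}",
        "unescaped:", unescaped.fullmatch(value) is not None,
        "escaped:", escaped.fullmatch(value) is not None,
    )
```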

Note. To group regular expression elements, use parentheses. For example, (a|b)+|c stands for c or any combination like abbbaaabbb, ababab, etc. (a word of any non-zero length made up of a's and b's in any order), while a|b+|c stands for a, c, and b, bb, bbb, etc.
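
The difference that grouping makes can be seen by running both expressions from the note against the same values. This is a sketch in Python, assuming Python's re module gives | and + the same precedence as described above; the names grouped and ungrouped are illustrative.

```python
import re

grouped = re.compile(r"(a|b)+|c")    # + applies to the whole group (a|b)
ungrouped = re.compile(r"a|b+|c")    # + applies to b only

for value in ["c", "a", "bbb", "ab", "ababab", "abbbaaabbb"]:
    print(
        f"{value:12}",
        "grouped:", grouped.fullmatch(value) is not None,
        "ungrouped:", ungrouped.fullmatch(value) is not None,
    )
```

Mixed words such as ab are accepted only by the grouped expression, because without the group the + repeats the single character b.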

Examples

Regular expression for dates

The number denoting a day may consist of one digit (1, 2, etc.) or two digits (02, 12), but it cannot be zero (00 or 0). The alternatives below cover 1 to 9 with an optional leading zero, 10 to 29, 30, and 31. The regular expression for the day should then look like this:

((|0)[1-9])|([12][0-9])|(30)|(31)

The regular expression for the month should look like this:

((|0)[1-9])|(10)|(11)|(12)

The regular expression for the year, which accepts either a four-digit year from 1900 to 2099 or a two-digit year, should look like this:

(((19)|(20))[0-9][0-9])|([0-9][0-9])

What is left is to combine the three parts and separate the numbers by periods (as in 1.03.1999). The period is a regular expression symbol, so you must put a backslash (\) before it. The regular expression for the full date should then look like this:

(((|0)[1-9])|([12][0-9])|(30)|(31))\.(((|0)[1-9])|(10)|(11)|(12))\.((((19)|(20))[0-9][0-9])|([0-9][0-9]))
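
The date expression can be assembled from its three parts and tested before use. The sketch below is in Python and assumes that Python's re module behaves like the field's engine for these constructs. The tens digit of the day is written as the group [12], because inside square brackets Python would treat a | as a literal character. The names DAY, MONTH, YEAR, DATE, and is_date are illustrative.

```python
import re

DAY = r"((|0)[1-9])|([12][0-9])|(30)|(31)"
MONTH = r"((|0)[1-9])|(10)|(11)|(12)"
YEAR = r"(((19)|(20))[0-9][0-9])|([0-9][0-9])"

# Each part is wrapped in parentheses so that "|" stays inside its own part;
# the period is escaped because it is a regular expression symbol.
DATE = re.compile(rf"({DAY})\.({MONTH})\.({YEAR})")


def is_date(value: str) -> bool:
    # Assumption: the whole value must match the expression.
    return DATE.fullmatch(value) is not None


for value in ["1.03.1999", "31.12.99", "00.03.1999", "32.01.2024", "1.13.2024"]:
    print(f"{value:12} {is_date(value)}")
```

The expression checks only the shape of the value, not the calendar: a value such as 31.02.2024 is still accepted.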

Regular expression for e-mail addresses

[a-zA-Z0-9_\-\.]+\@[a-zA-Z0-9\.\-]+\.[a-zA-Z]+
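
The e-mail expression can be tried on sample values in the same way. This is a sketch in Python, assuming Python's re module interprets the expression as the field's engine does; the names EMAIL and is_email are illustrative. The @ sign is escaped in the expression because the table lists it as reserved; Python simply treats \@ as a literal @.

```python
import re

EMAIL = re.compile(r"[a-zA-Z0-9_\-\.]+\@[a-zA-Z0-9\.\-]+\.[a-zA-Z]+")


def is_email(value: str) -> bool:
    # Assumption: the whole value must match the expression.
    return EMAIL.fullmatch(value) is not None


samples = [
    "john.doe@example.com",
    "user@localhost",           # no period after the @ part
    "a+b@example.com",          # "+" is not in the first character group
    "john..doe@example.com",    # consecutive periods are not excluded
]
for value in samples:
    print(f"{value:24} {is_email(value)}")
```

The expression is deliberately simple: it requires letters, digits, underscores, hyphens, or periods before the @, and a final part made of letters only, but it does not rule out oddities such as consecutive periods.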

See also

Field Properties Dialog Box

26.03.2024 13:49:49
