References
For more information on Perl regexps and other syntaxes,
refer to the O'Reilly book Mastering Regular Expressions.
Examples:
The following sentence will be used in all our examples:
The ID sp:UBP5_RAT is similar to the rabbit AC tr:Q12345
Options
e
evaluate REPLACE as an expression
g
global matches (matches all occurrences)
i
case insensitive
m
multiline, allow ^ and $ to match at embedded newlines (\n)
o
compile MOTIF only once
s
single line, dot . matches newline (\n)
x
ignore whitespace and allow comments (#) in MOTIF
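The options listed above can be tried on the example sentence. This is a minimal sketch, assuming the options apply to the s/MOTIF/REPLACE/ and m/MOTIF/ operators; the variable names are illustrative only.

```perl
use strict;
use warnings;

# The example sentence used throughout this card.
my $text = "The ID sp:UBP5_RAT is similar to the rabbit AC tr:Q12345";

# g: replace every occurrence, not only the first one.
(my $global = $text) =~ s/(sp|tr):/db=/g;

# i: ignore case, so "the" also matches the leading "The".
my $the_count = () = $text =~ /the/gi;

# e: evaluate REPLACE as Perl code (here, lowercase the identifier).
(my $lowered = $text) =~ s/(\w+_RAT)/lc($1)/e;

# x: whitespace and # comments inside MOTIF are ignored.
my ($db, $id) = $text =~ /
    (sp|tr)    # database prefix
    :          # literal colon
    (\w+)      # identifier
/x;

print "$global\n";     # The ID db=UBP5_RAT is similar to the rabbit AC db=Q12345
print "$the_count\n";  # 2
print "$lowered\n";    # The ID sp:ubp5_rat is similar to the rabbit AC tr:Q12345
print "$db $id\n";     # sp UBP5_RAT
```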
Character classes
[...]
Match any one character of a class
[^...]
Match any one character not in the bracket
.
Match any character except newline ([^\n]); in single-line
mode (/s) it matches newline too
\d
Any digit. Equivalent to [0-9] or [[:digit:]]
\D
Any non-digit.
\s
Any whitespace. [ \t\n\r\f\v] or [[:space:]]
\S
Any non-whitespace.
\w
Any word character. [a-zA-Z0-9_] or [[:alnum:]_]
\W
Any non-word character.
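A short sketch of the character classes above, applied to the example sentence. The variable names are assumptions made for illustration.

```perl
use strict;
use warnings;

my $text = "The ID sp:UBP5_RAT is similar to the rabbit AC tr:Q12345";

# \d+ : every run of digits.
my @digits = $text =~ /(\d+)/g;

# [A-Z0-9_]+ : a custom class built from ranges and a literal underscore.
my ($sp_id) = $text =~ /sp:([A-Z0-9_]+)/;

# \S+ : runs of non-whitespace, i.e. the space-separated tokens.
my @tokens = $text =~ /(\S+)/g;

# [^\s:]+ : a negated class, anything except whitespace and colon.
my ($prefix) = $text =~ /([^\s:]+):/;

print "@digits\n";            # 5 12345
print "$sp_id\n";             # UBP5_RAT
print scalar(@tokens), "\n";  # 10
print "$prefix\n";            # sp
```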
Special characters
\a
alert (bell)
\b
backspace (in a regexp, only inside a character class)
\e
escape
\f
form feed
\n
newline
\r
carriage return
\t
horizontal tabulation
\nnn
octal nnn
\xnn
hexadecimal nn
\cX
control character X
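The escapes above work in double-quoted strings as well as in MOTIF. The following sketch assumes an ASCII platform; the record string is a made-up example.

```perl
use strict;
use warnings;

# \xnn and \nnn name a character by its code: hex 41 = octal 101 = "A".
my $hex_ok   = ("\x41" eq "A")   ? 1 : 0;
my $octal_ok = ("\101" eq "A")   ? 1 : 0;

# \cI is Control-I, the same character as \t; \e is ESC (hex 1B).
my $ctrl_ok  = ("\cI" eq "\t")   ? 1 : 0;
my $esc_ok   = ("\e" eq "\x1B")  ? 1 : 0;

# The same escapes inside MOTIF: split a tab-separated record.
my $record = "sp\tUBP5_RAT\n";
my ($db, $id) = $record =~ /^(\w+)\t(\w+)\n\z/;

print "$hex_ok $octal_ok $ctrl_ok $esc_ok\n";  # 1 1 1 1
print "$db $id\n";                             # sp UBP5_RAT
```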
Repetitions
?
Zero or one occurrence of the previous item.
*
Zero or more occurrences of the previous item.
+
One or more occurrences of the previous item.
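{n,m}
Between n and m occurrences of the previous item.
{n,}
n or more occurrences of the previous item.
{n}
Exactly n occurrences of the previous item.
{}?
Non-greedy match: a ? after any of the repetitions above
makes it match as few occurrences as possible.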
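The difference between greedy and non-greedy repetition is easiest to see side by side. A minimal sketch on the example sentence, with illustrative variable names:

```perl
use strict;
use warnings;

my $text = "The ID sp:UBP5_RAT is similar to the rabbit AC tr:Q12345";

# Greedy: .* runs from the first "s" to the LAST colon.
my ($greedy) = $text =~ /(s.*:)/;

# Non-greedy: .*? stops at the FIRST colon.
my ($lazy) = $text =~ /(s.*?:)/;

# {n,m}: two to three digits (greedy, so three are taken).
my ($some_digits) = $text =~ /(\d{2,3})/;

# {n}: exactly five digits after "Q".
my ($all_digits) = $text =~ /Q(\d{5})/;

# ?: the second "b" is optional, so both spellings match.
my $both = ("rabit" =~ /^rabb?it$/ && "rabbit" =~ /^rabb?it$/) ? 1 : 0;

print "$greedy\n";       # sp:UBP5_RAT is similar to the rabbit AC tr:
print "$lazy\n";         # sp:
print "$some_digits\n";  # 123
print "$all_digits\n";   # 12345
print "$both\n";         # 1
```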
Anchors
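^ or \A
Match at the beginning of the string (with /m, ^ also
matches after each newline)
$ or \Z
Match at the end of the string or before a final newline
(with /m, $ also matches before each newline)
\z
Match only at the very end of the string
\b
Match at a word boundary (between \w and \W)
\B
Match anywhere except at a word boundary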
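A small sketch of how the anchors behave. The two-line string $lines is a hypothetical input built from the identifiers in the example sentence.

```perl
use strict;
use warnings;

my $text = "The ID sp:UBP5_RAT is similar to the rabbit AC tr:Q12345";

my $starts_with_the  = $text =~ /^The/ ? 1 : 0;   # anchored at the start
my $ends_with_digits = $text =~ /\d+$/ ? 1 : 0;   # anchored at the end

# "_" is a word character, so there is no word boundary before "RAT".
my $rat_word   = $text =~ /\bRAT\b/ ? 1 : 0;
my $rat_inside = $text =~ /\BRAT\b/ ? 1 : 0;

# With /m, ^ matches at the start of every line.
my $lines = "sp:UBP5_RAT\ntr:Q12345\n";
my @first_only = $lines =~ /^(\w+):/g;
my @every_line = $lines =~ /^(\w+):/mg;

# \Z tolerates a final newline, \z does not.
my $cap_z   = $lines =~ /Q12345\Z/ ? 1 : 0;
my $small_z = $lines =~ /Q12345\z/ ? 1 : 0;

print "$starts_with_the $ends_with_digits\n";  # 1 1
print "$rat_word $rat_inside\n";               # 0 1
print "@first_only | @every_line\n";           # sp | sp tr
print "$cap_z $small_z\n";                     # 1 0
```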
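\n
Backreference to the text captured by the n-th group (...),
used inside MOTIF
$n
Text captured by the n-th group (...), used outside MOTIF
(for example in REPLACE)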
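The \n and $n tokens both refer to text captured by parentheses. The sketch below assumes the usual Perl convention that \1 is used inside MOTIF and $1 in REPLACE or after the match.

```perl
use strict;
use warnings;

my $text = "The ID sp:UBP5_RAT is similar to the rabbit AC tr:Q12345";

# \1 inside MOTIF: a word character immediately followed by itself.
my ($doubled) = $text =~ /(\w)\1/;

# $1 and $2 in REPLACE: swap database prefix and identifier.
(my $swapped = $text) =~ s/(\w+):(\w+)/$2 ($1)/g;

# $1 after a successful match.
my $accession = "";
if ($text =~ /tr:(\w+)/) {
    $accession = $1;
}

print "$doubled\n";    # b
print "$swapped\n";    # The ID UBP5_RAT (sp) is similar to the rabbit AC Q12345 (tr)
print "$accession\n";  # Q12345
```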
Options
c
complement SEARCHLIST
d
delete found characters that have no replacement
s
squeeze runs of the same replaced character into one
Unicode matches
Perl 5.8 supports Unicode 3.2. However, describing all the
properties in detail would take too long here. For more
information see Mastering Regular Expressions.
Text-span modifiers
\Q
Quote following metacharacters until \E or end of
motif (allow the use of scalars in regexp)
\u
Force next character to uppercase
\l
Force next character to lowercase
\U
Force all following characters to uppercase
\L
Force all following characters to lowercase
\E
End a span started with \Q, \U or \L
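The text-span modifiers act on interpolated text, both in MOTIF and in double-quoted strings or REPLACE. A minimal sketch with made-up values:

```perl
use strict;
use warnings;

# \Q...\E quotes metacharacters in an interpolated scalar.
my $needle = "a.c";
my $unquoted = "abc" =~ /\A$needle\z/     ? 1 : 0;  # "." matches any character
my $quoted   = "abc" =~ /\A\Q$needle\E\z/ ? 1 : 0;  # "." must be a literal dot

# Case modifiers in a double-quoted string.
my $name    = "uBP5_rat";
my $ucfirst = "\u$name";        # first character to uppercase
my $upper   = "\U$name\E!";     # uppercase until \E
my $title   = "\u\L$name\E";    # lowercase everything, then capitalize

# The same modifiers in REPLACE.
(my $words = "the rabbit") =~ s/(\w+)/\u$1/g;

print "$unquoted $quoted\n";  # 1 0
print "$ucfirst\n";           # UBP5_rat
print "$upper\n";             # UBP5_RAT!
print "$title\n";             # Ubp5_rat
print "$words\n";             # The Rabbit
```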
Extended Regexp
(?#...)
Substring ... is a comment
(?=...)
Positive lookahead. Match only if ... matches next, without
consuming it (e.g., allows overlapping matches in global mode)
(?!...)
Negative lookahead. Match only if ... does not match next
(?<=...)
Positive lookbehind. Fixed length only.
(?<!...)
Negative lookbehind. Fixed length only.
(?imsx)
Modify matching options
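Lookaround assertions test the surrounding text without consuming it. The following sketch illustrates each extended construct on the example sentence; variable names are illustrative.

```perl
use strict;
use warnings;

my $text = "The ID sp:UBP5_RAT is similar to the rabbit AC tr:Q12345";

# Positive lookahead with /g: overlapping pairs of digits.
my @pairs = "12345" =~ /(?=(\d\d))/g;

# Negative lookahead: prefixes whose colon is not followed by "UBP5".
my @not_ubp5 = $text =~ /(\w+):(?!UBP5)/g;

# Positive lookbehind plus an inline comment: identifier preceded by "tr:".
my ($accession) = $text =~ /(?<=tr:)(?#accession follows)(\w+)/;

# Negative lookbehind: "RAT" not preceded by "_" does not occur.
my $bare_rat = $text =~ /(?<!_)RAT/ ? 1 : 0;

# (?i) switches on case-insensitive matching inside MOTIF.
my $inline_i = $text =~ /(?i)the id/ ? 1 : 0;

print "@pairs\n";      # 12 23 34 45
print "@not_ubp5\n";   # tr
print "$accession\n";  # Q12345
print "$bare_rat\n";   # 0
print "$inline_i\n";   # 1
```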
Transliteration: translate operator tr///
EXPR =~ tr/SEARCHLIST/REPLACELIST/cds
Transliteration is not a regular expression and does not use one,
but it is frequently associated with regexps in Perl. Thus
we decided to include it in this guide.
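A brief sketch of tr/// with and without its c, d and s options. The DNA string is a hypothetical input chosen for illustration.

```perl
use strict;
use warnings;

my $dna = "AACGTTTT";

# Plain transliteration: map each character to its counterpart.
(my $complement = $dna) =~ tr/ACGT/TGCA/;

# With an empty REPLACELIST, tr counts matches and leaves the string unchanged.
my $count_a = ($dna =~ tr/A//);

# d: delete the characters found.
(my $no_t = $dna) =~ tr/T//d;

# s: squeeze runs of the same character into one.
(my $squeezed = $dna) =~ tr/A-Z//s;

# c: complement SEARCHLIST, so everything except A-Z is replaced.
(my $masked = "sp:UBP5_RAT") =~ tr/A-Z/#/c;

print "$complement\n";  # TTGCAAAA
print "$count_a\n";     # 2
print "$no_t\n";        # AACG
print "$squeezed\n";    # ACGT
print "$masked\n";      # ###UBP##RAT
```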