
Regular Expression Symbols

April 17, 2010


“Regular expressions” are combinations of special characters and symbols used for pattern matching: you specify a particular combination of such characters and symbols (a regular expression), and the regex engine searches the text data for strings that match it. The following is a very short, and hopefully easy-to-follow, introduction to some of the most useful regular expression symbols.
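
As a minimal sketch of that idea, assuming Python's built-in re module as the regex engine (the post itself does not name a language), a pattern can be searched for in a piece of text like this. The sample text and pattern are made up for illustration.

```python
import re

# Hypothetical sample text and pattern, chosen only for illustration.
text = "The year 2010 was a good year."
pattern = r"year"

# re.search scans the text and returns the first match (or None).
first = re.search(pattern, text)

# re.findall returns every non-overlapping match as a list of strings.
all_matches = re.findall(pattern, text)

print(first.start(), all_matches)
```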

1. Common matching symbols

Table 1.

Regular Expression    Description
.                     Matches any single character (by default, any character except a line break)
^regex                regex must match at the beginning of the line
regex$                regex must match at the end of the line
[abc]                 Set definition; matches the letter a, b, or c
[abc][vz]             Set definition; matches a, b, or c followed by either v or z
[^abc]                When a “^” appears as the first character inside [], it negates the set; matches any character except a, b, or c
[a-d1-7]              Ranges; matches one letter between a and d or one digit from 1 to 7, but not the two-character string d1
X|Z                   Finds X or Z
XZ                    Finds X directly followed by Z
$                     Checks whether a line end follows
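
The symbols in Table 1 can be tried out in a few lines. This is only a sketch that assumes Python's built-in re module, which the post does not name, and every sample string is made up for illustration.

```python
import re

# "." matches any single character except a line break (Python's default).
dot_hits = re.findall(r"c.t", "cat cot c\nt")

# "^" and "$" anchor the pattern to the start and end of the string.
at_start = re.search(r"^cat", "cat nap") is not None
not_at_start = re.search(r"^cat", "a cat") is not None
at_end = re.search(r"cat$", "a cat") is not None

# A set matches exactly ONE character; "^" inside [] negates it.
set_hits = re.findall(r"[abc]", "cabbage")
negated_hits = re.findall(r"[^abc]", "cabbage")

# Two sets in sequence: one of a, b, c directly followed by v or z.
pair_hits = re.findall(r"[abc][vz]", "av bz cx")

# Ranges still match one character at a time, so "d1" yields two
# separate one-character matches rather than a single match.
range_hits = re.findall(r"[a-d1-7]", "d1 e8")

# "X|Z" is alternation; "XZ" requires Z directly after X.
alt_hits = re.findall(r"X|Z", "XYZ")
seq_hit = re.search(r"XZ", "XYZ")

print(dot_hits, set_hits, negated_hits, pair_hits, range_hits, alt_hits, seq_hit)
```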

2. Metacharacters

The following metacharacters have a predefined meaning and make certain common patterns easier to use, e.g. \d instead of [0-9].

Table 2.

Regular Expression    Description
\d                    Any digit, short for [0-9]
\D                    A non-digit, short for [^0-9]
\s                    A whitespace character, short for [ \t\n\x0b\r\f]
\S                    A non-whitespace character, short for [^\s]
\w                    A word character, short for [a-zA-Z_0-9]
\W                    A non-word character, short for [^\w]
\S+                   One or more non-whitespace characters
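
A short sketch of these shorthands in use, again assuming Python's built-in re module (an assumption, since the post names no language) and a made-up sample string. In Python, \d, \w and \s also accept non-ASCII digits, letters and whitespace unless the re.ASCII flag is given, so the flag is used here to match the ASCII definitions listed above.

```python
import re

# Hypothetical sample text: words, numbers, a tab and a date.
text = "Order 66 shipped\ton 2010-04-17"

# \d+ : runs of digits.
digits = re.findall(r"\d+", text, flags=re.ASCII)

# \w+ : runs of word characters (letters, digits, underscore).
words = re.findall(r"\w+", text, flags=re.ASCII)

# \S+ : runs of non-whitespace, so the hyphenated date stays in one piece.
tokens = re.findall(r"\S+", text, flags=re.ASCII)

# \D : every non-digit; replacing each with "" keeps only the digits.
only_digits = re.sub(r"\D", "", text, flags=re.ASCII)

print(digits, words, tokens, only_digits)
```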

3. Quantifiers

A quantifier defines how often an element can occur. The symbols ?, *, + and {} specify how many times the preceding element may be repeated.

Table 3.

Regular Expression    Description
*                     Occurs zero or more times
+                     Occurs one or more times
?                     Occurs zero or one time; ? is short for {0,1}
{X}                   Occurs exactly X times; {} specifies the repetition count of the preceding element
{X,Y}                 Occurs between X and Y times
*?                    A ? after a quantifier makes it a “reluctant quantifier”; it tries to find the smallest match
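
The quantifiers above, and the difference between a greedy and a reluctant match, can be sketched as follows. This assumes Python's built-in re module (not named in the post), whose documentation calls reluctant quantifiers "non-greedy"; the sample strings are made up for illustration.

```python
import re

# "*" allows zero occurrences of b, "+" requires at least one.
star_ok = re.fullmatch(r"ab*", "a") is not None
plus_ok = re.fullmatch(r"ab+", "a") is not None

# "?" makes the preceding element optional.
optional_ok = re.fullmatch(r"colou?r", "color") is not None

# {X} is an exact count, {X,Y} a range of counts.
exact_ok = re.fullmatch(r"\d{4}", "2010") is not None
range_ok = re.fullmatch(r"\d{2,3}", "2010") is not None

# Greedy "*" takes as much as possible; reluctant "*?" as little as possible.
markup = "<b>bold</b>"
greedy = re.search(r"<.*>", markup).group()
reluctant = re.search(r"<.*?>", markup).group()

print(star_ok, plus_ok, optional_ok, exact_ok, range_ok, greedy, reluctant)
```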