Regular Expression Symbols
April 17, 2010
Introduction:
“Regular expressions” are combinations of special characters and symbols used for pattern matching: you specify a pattern built from such characters and symbols (the regular expression), and the regex engine searches the text for strings that match it. The following is a very short, and hopefully easy-to-follow, introduction to some of the most useful regular expression symbols.
1. Common matching symbols
Table 1.
| Regular Expression | Description |
| . | Matches any single character |
| ^regex | regex must match at the beginning of the line |
| regex$ | regex must match at the end of the line |
| [abc] | Set definition, can match the letter a or b or c |
| [abc][vz] | Two set definitions in a row, can match a or b or c followed by either v or z |
| [^abc] | When “^” appears as the first character inside [], it negates the set. This matches any single character except a, b, or c |
| [a-d1-7] | Ranges: matches one letter from a to d or one digit from 1 to 7. It matches a single character, so it will not match the two-character string d1 |
| X|Z | Finds X or Z |
| XZ | Finds X directly followed by Z |
| $ | Checks if a line end follows |
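The rows of Table 1 can be tried out with a few lines of code. The sketch below uses Python's standard `re` module, which is an assumption, since this post does not name a language; the helper names `found` and `whole` are made up for illustration. The symbols shown here behave the same way in most common regex flavors.

```python
import re

def found(pattern, text):
    """Return True if the pattern matches somewhere in the text."""
    return re.search(pattern, text) is not None

def whole(pattern, text):
    """Return True if the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None

# "." stands for exactly one character.
print(whole(r"c.t", "cat"))               # True
print(whole(r"c.t", "ct"))                # False

# "^" and "$" anchor the match to the start and end of the line.
print(found(r"^Hello", "Hello world"))    # True
print(found(r"^world", "Hello world"))    # False
print(found(r"world$", "Hello world"))    # True

# A set matches one character out of those listed; "^" negates it.
print(re.findall(r"[abc]", "a1b2z3c"))    # ['a', 'b', 'c']
print(re.findall(r"[^abc]", "a1b2c"))     # ['1', '2']

# Two sets in a row match two characters, one from each set.
print(whole(r"[abc][vz]", "bz"))          # True

# A range set still matches a single character only.
print(whole(r"[a-d1-7]", "d"))            # True
print(whole(r"[a-d1-7]", "d1"))           # False

# Alternation (X or Z) versus concatenation (X directly followed by Z).
print(re.findall(r"X|Z", "XYZ"))          # ['X', 'Z']
print(found(r"XZ", "aXZb"))               # True
print(found(r"XZ", "aXYZb"))              # False
```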
2. Metacharacters
The following metacharacters have a predefined meaning and make certain common patterns easier to write, e.g. \d instead of [0-9].
Table 2.
| Regular Expression | Description |
| \d | Any digit, short for [0-9] |
| \D | A non-digit, short for [^0-9] |
| \s | A whitespace character, short for [ \t\n\x0b\r\f] |
| \S | A non-whitespace character, short for [^\s] |
| \w | A word character, short for [a-zA-Z_0-9] |
| \W | A non-word character, short for [^\w] |
| \S+ | One or more non-whitespace characters |
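As a quick check of Table 2, here is a minimal sketch, again assuming Python's `re` module (the post itself does not name a language). The sample string is invented for illustration. One Python-specific caveat: for text patterns, `\d`, `\s` and `\w` also match non-ASCII digits, whitespace and letters by default, so the `re.ASCII` flag is passed to get exactly the character sets listed in the table.

```python
import re

# Invented sample text: a space, a comma and a tab separate the tokens.
text = "Room 42,\tbed_3"

digits = re.findall(r"\d", text, flags=re.ASCII)
no_digits_removed = re.sub(r"\D", "", text, flags=re.ASCII)
spaces = re.findall(r"\s", text, flags=re.ASCII)
chunks = re.findall(r"\S+", text, flags=re.ASCII)
words = re.findall(r"\w+", text, flags=re.ASCII)
non_word = re.findall(r"\W", text, flags=re.ASCII)

print(digits)              # ['4', '2', '3']
print(no_digits_removed)   # 423  (every non-digit was deleted)
print(spaces)              # [' ', '\t']
print(chunks)              # ['Room', '42,', 'bed_3']
print(words)               # ['Room', '42', 'bed_3']
print(non_word)            # [' ', ',', '\t']
```

Note how `\S+` keeps the comma attached to `42,` because a comma is not whitespace, while `\w+` drops it because a comma is not a word character.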
3. Quantifiers
A quantifier defines how often an element can occur. The symbols ?, *, + and {} specify how many times the preceding element may repeat.
Table 3.
| Regular Expression | Description |
| * | Occurs zero or more times |
| + | Occurs one or more times |
| ? | Occurs zero or one time, ? is short for {0,1} |
| {X} | Occurs exactly X times, {} applies to the preceding element |
| {X,Y} | Occurs between X and Y times |
| *? | ? after a quantifier makes it a “reluctant quantifier”, it tries to find the smallest match. |
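The quantifiers in Table 3 are easiest to understand side by side. The following sketch assumes Python's `re` module (not named in this post); the helper `whole` and the sample strings are made up for illustration. Python's documentation calls the reluctant form "non-greedy", but the `*?` syntax is the same.

```python
import re

def whole(pattern, text):
    """Return True if the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None

# "*": zero or more of the preceding element.
print(whole(r"ab*c", "ac"), whole(r"ab*c", "abbbc"))          # True True

# "+": one or more, so at least one "b" is required.
print(whole(r"ab+c", "ac"), whole(r"ab+c", "abc"))            # False True

# "?": zero or one, which makes the "u" optional.
print(whole(r"colou?r", "color"), whole(r"colou?r", "colour"))  # True True

# "{X}": exactly X times.
print(whole(r"\d{4}", "2010"), whole(r"\d{4}", "201"))        # True False

# "{X,Y}": between X and Y times, inclusive.
print(whole(r"\d{2,3}", "123"), whole(r"\d{2,3}", "1234"))    # True False

# Greedy "*" takes as much as it can; reluctant "*?" takes as little as it can.
markup = "<b>one</b><b>two</b>"
greedy = re.findall(r"<b>.*</b>", markup)
reluctant = re.findall(r"<b>.*?</b>", markup)
print(greedy)      # ['<b>one</b><b>two</b>']
print(reluctant)   # ['<b>one</b>', '<b>two</b>']
```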