Regular Expression Symbols
Introduction:
“Regular expressions” are combinations of special characters and symbols used for pattern matching: you specify a particular combination of such characters and symbols (the regular expression), and the regex engine searches the text data for strings that match it. The following is a very short, and hopefully easy-to-follow, introduction to some of the most useful regular expression symbols.
1. Common matching symbols
Table 1.
Regular Expression | Description |
. | Matches any single character (by default, except line terminators) |
^regex | regex must match at the beginning of the line |
regex$ | regex must match at the end of the line |
[abc] | Set definition, can match the letter a or b or c |
[abc[vz]] | Set definition with a nested set; in Java this is a union and matches a single character: a, b, c, v or z. To match a or b or c followed by either v or z, write [abc][vz] |
[^abc] | When “^” appears as the first character inside [], it negates the set. This matches any single character except a, b or c |
[a-d1-7] | Ranges: matches a single letter between a and d or a single digit from 1 to 7; it does not match the two-character string d1 |
X|Z | Finds X or Z |
XZ | Finds X directly followed by Z |
$ | Checks if a line end follows |
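The rows of Table 1 can be tried out directly in Java. The following is a minimal sketch, not a definitive reference: the class name CommonSymbolsDemo and the helper found() are made up for illustration, and the helper simply reports whether the pattern occurs anywhere in the input.

```java
import java.util.regex.Pattern;

class CommonSymbolsDemo {
    // Hypothetical helper: true if regex matches somewhere in input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(found("c.t", "cut"));        // true: . matches the "u"
        System.out.println(found("^cat", "a cat"));     // false: "cat" is not at the start
        System.out.println(found("nap$", "cat nap"));   // true: "nap" is at the end
        System.out.println(found("[^abc]", "abc"));     // false: every character is a, b or c
        System.out.println(found("[a-d1-7]", "z9d"));   // true: "d" is in the range a-d
        System.out.println(found("X|Z", "xyZ"));        // true: "Z" is present
        System.out.println(found("^[abc[vz]]$", "v"));  // true: nested sets form a union
    }
}
```

Note that each bracketed set consumes exactly one character, which is why the last line accepts the single character "v".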
2. Metacharacters
The following metacharacters have a predefined meaning and make certain common patterns easier to use, e.g. \d instead of [0-9].
Table 2.
Regular Expression | Description |
\d | Any digit, short for [0-9] |
\D | A non-digit, short for [^0-9] |
\s | A whitespace character, short for [ \t\n\x0b\r\f] |
\S | A non-whitespace character, short for [^\s] |
\w | A word character, short for [a-zA-Z_0-9] |
\W | A non-word character, short for [^\w] |
\S+ | One or more non-whitespace characters |
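As an illustration of Table 2, the sketch below collects every match of a metacharacter pattern in a sample sentence. The class MetacharacterDemo, the helper findAll() and the sample text are assumptions made for this example only. Note that each backslash is doubled inside the Java string literal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MetacharacterDemo {
    // Hypothetical helper: returns every substring matched by regex, in order.
    static List<String> findAll(String regex, String input) {
        List<String> hits = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        String text = "Order 66 shipped_to Bay 7";
        System.out.println(findAll("\\d+", text));    // [66, 7]
        System.out.println(findAll("\\S+", text));    // [Order, 66, shipped_to, Bay, 7]
        System.out.println(findAll("\\w+", text));    // [Order, 66, shipped_to, Bay, 7]
        System.out.println(findAll("\\W", "a-b c"));  // [-,  ] (a hyphen and a space)
    }
}
```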
3. Quantifiers
A quantifier defines how often an element can occur. The symbols ?, *, + and {} specify how many times the preceding element may repeat.
Table 3.
Regular Expression | Description |
* | Occurs zero or more times |
+ | Occurs one or more times |
? | Occurs zero or one time; ? is short for {0,1} |
{X} | Occurs exactly X times; {} applies to the preceding element |
{X,Y} | Occurs between X and Y times |
*? | ? after a quantifier makes it a “reluctant quantifier”; it tries to find the smallest match |
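The difference between the quantifiers in Table 3, and in particular between a greedy and a reluctant quantifier, is easiest to see on concrete input. The following sketch assumes a made-up helper firstMatch() that returns the first match or null; the class name and the sample strings are illustrative only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    // Hypothetical helper: returns the first match of regex in input, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstMatch("ab*", "a"));               // a    (zero b's is allowed)
        System.out.println(firstMatch("ab+", "a"));               // null (at least one b required)
        System.out.println(firstMatch("colou?r", "color"));       // color
        System.out.println(firstMatch("\\d{2}", "12345"));        // 12
        System.out.println(firstMatch("\\d{2,3}", "12345"));      // 123
        System.out.println(firstMatch("<.+>", "<b>bold</b>"));    // <b>bold</b> (greedy)
        System.out.println(firstMatch("<.+?>", "<b>bold</b>"));   // <b>         (reluctant)
    }
}
```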
Pattern class of Regular Expression in Java
Introduction:
This article gives you a bit more knowledge of Java regular expressions. To manage regular expressions, Java has three classes in the java.util.regex package, but this article focuses only on the Pattern class.
A regular expression is a pattern of characters that describes a set of strings. We use regular expressions to find, display, or modify some or all of the occurrences of a pattern in an input sequence.
java.util.regex classes:
The package java.util.regex contains three classes:
- Pattern class
- Matcher class
- PatternSyntaxException
Let us take a look at Pattern class.
Pattern Class:
A regular expression, which is specified as a string, must first be compiled into an instance of the Pattern class. The resulting pattern can then be used to create an instance of the Matcher class, which contains various built-in methods that help in performing a match against the regular expression. Many Matcher objects can share the same Pattern object.
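One way to picture the relationship between the two classes: the Pattern is compiled once and kept, while a fresh Matcher is created for each input. The sketch below assumes a hypothetical class SharedPatternDemo with a helper countWords(); the pattern [A-Za-z]+ is chosen only to keep the example free of backslashes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SharedPatternDemo {
    // Compiled once; Pattern instances are immutable and can be reused.
    private static final Pattern WORD = Pattern.compile("[A-Za-z]+");

    // Hypothetical helper: counts runs of letters using a new Matcher per call.
    static int countWords(String text) {
        Matcher m = WORD.matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countWords("Welcome to java world")); // 4
        System.out.println(countWords("one, two"));              // 2
    }
}
```

Compiling once and reusing the Pattern avoids recompiling the same expression on every call.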
Create Pattern using compile():
The Pattern class doesn’t have a public constructor, so we create a pattern with the static method compile().
Pattern p = Pattern.compile("my regexp");
For the regular expression symbols, see Tables 1 to 3 at the beginning of this article.
Important Note:
The backslash is an escape character in Java strings, i.e., the backslash has a predefined meaning in Java. You have to write “\\” in a string literal to get a single “\” in the regular expression.
If you want to define “\w”, you must write “\\w” in your regex, like this:
Pattern r = Pattern.compile("\\w+"); // Place your pattern here
“\w” represents a word character, i.e., short for [a-zA-Z_0-9].
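The same doubling rule applies when a regex metacharacter such as the dot or the backslash itself has to be matched literally. The following is a small sketch under stated assumptions: EscapeDemo and fullMatch() are made-up names, and Pattern.quote() is shown as one standard way to escape a whole literal string.

```java
import java.util.regex.Pattern;

class EscapeDemo {
    // Hypothetical helper: true if the whole input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("a.c", "abc"));      // true: unescaped . matches any character
        System.out.println(fullMatch("a\\.c", "abc"));    // false: \. matches only a literal dot
        System.out.println(fullMatch("a\\.c", "a.c"));    // true
        // Four backslashes in source = regex \\ = one literal backslash in the input.
        System.out.println(fullMatch("\\\\", "\\"));      // true
        // Pattern.quote() escapes an entire string so it is matched literally.
        System.out.println(fullMatch(Pattern.quote("1+1"), "1+1")); // true
    }
}
```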
We can also create the Pattern with flags.
Syntax:
Pattern pattern = Pattern.compile(regex, flags);
For example, to ignore case when matching, we can use:
Pattern pattern = Pattern.compile("\\w+", Pattern.CASE_INSENSITIVE);
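To see what a flag changes, it helps to run the same expression with and without it. The sketch below assumes a hypothetical helper found() that takes the flags as an int; several flags can be combined with the bitwise OR operator, shown here with Pattern.MULTILINE as a second flag.

```java
import java.util.regex.Pattern;

class FlagsDemo {
    // Hypothetical helper: compiles regex with the given flags and searches input.
    static boolean found(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).find();
    }

    public static void main(String[] args) {
        // Without flags the match is case-sensitive.
        System.out.println(found("java", 0, "I like JAVA"));                        // false
        System.out.println(found("java", Pattern.CASE_INSENSITIVE, "I like JAVA")); // true

        // Flags are bit masks, so they are combined with |.
        int both = Pattern.CASE_INSENSITIVE | Pattern.MULTILINE;
        System.out.println(found("^world$", both, "Hello\nWORLD"));                     // true
        System.out.println(found("^world$", Pattern.CASE_INSENSITIVE, "Hello\nWORLD")); // false
    }
}
```

With MULTILINE, ^ and $ also match at the start and end of each line rather than only at the start and end of the whole input.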
Validate pattern using matches():
The matches() method checks whether the given input matches the pattern. It returns true only if the entire input text matches the pattern.
boolean isMatch = Pattern.matches("\\w+", "Welcome to java world"); // false: the input contains spaces, so \w+ cannot match all of it
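Because matches() requires the whole input to match, it behaves differently from a search for the pattern somewhere inside the input. The following sketch contrasts the two; the class MatchesVsFindDemo and its helpers wholeInput() and anywhere() are names invented for this example, and anywhere() uses Matcher.find().

```java
import java.util.regex.Pattern;

class MatchesVsFindDemo {
    // Hypothetical helper: true only if the entire input matches the regex.
    static boolean wholeInput(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    // Hypothetical helper: true if any part of the input matches the regex.
    static boolean anywhere(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String text = "Welcome to java world";
        System.out.println(wholeInput("\\w+", text));       // false: spaces are not word characters
        System.out.println(anywhere("\\w+", text));         // true: "Welcome" matches
        System.out.println(wholeInput("\\w+", "Welcome"));  // true
        System.out.println(wholeInput("[\\w ]+", text));    // true: the set now allows spaces
    }
}
```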
Get the Pattern using pattern():
The pattern() method returns, as a string, the regular expression from which this pattern was compiled.
String regex = p.pattern();
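A short sketch of reading back the source of a compiled pattern follows. The class PatternSourceDemo and the helper sourceOf() are hypothetical; the example also reads the flags back with flags(), which is a separate method of the Pattern class.

```java
import java.util.regex.Pattern;

class PatternSourceDemo {
    // Hypothetical helper: returns the regex string a Pattern was compiled from.
    static String sourceOf(Pattern p) {
        return p.pattern();
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("\\w+", Pattern.CASE_INSENSITIVE);
        System.out.println(sourceOf(p));   // prints \w+
        System.out.println(p.toString());  // toString() returns the same string
        // flags() reports the flags given at compile time.
        System.out.println((p.flags() & Pattern.CASE_INSENSITIVE) != 0); // true
    }
}
```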
Split input using split():
The split() method splits the given input text around matches of the pattern and returns a String array. There are two forms of the split() method:
- split(CharSequence input)
- split(CharSequence input, int limit)
In the second form, the limit argument caps the number of resulting strings: with a positive limit n, the pattern is applied at most n - 1 times, so the array has at most n elements and the last element holds the rest of the input.
String[] str = pattern.split(input,3);
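The effect of the limit argument can be checked with a comma-separated string. In the sketch below, SplitDemo and splitOnComma() are illustrative names; the last two calls show that a limit of 0 (the behaviour of the one-argument form) drops trailing empty strings, while a negative limit keeps them.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitDemo {
    private static final Pattern COMMA = Pattern.compile(",");

    // Hypothetical helper: splits input on commas with the given limit.
    static String[] splitOnComma(String input, int limit) {
        return COMMA.split(input, limit);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(COMMA.split("a,b,c,d")));         // [a, b, c, d]
        System.out.println(Arrays.toString(splitOnComma("a,b,c,d", 3)));     // [a, b, c,d]
        System.out.println(Arrays.toString(splitOnComma("a,b,,", 0)));       // [a, b]
        System.out.println(splitOnComma("a,b,,", -1).length);                // 4
    }
}
```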
Sample code for Pattern class and methods:
MailIDValidation.java:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 *
 * @author Home
 */
class MailIDValidation {
    public static void main(String args[]) {
        // Input the string for validation
        String email = "mymail@gmail.com";
        // Set the email pattern string
        Pattern p = Pattern.compile(".+@.+\\.[a-z]+");
        // Match the given string with the pattern
        Matcher m = p.matcher(email);
        // Check whether the whole input matches
        boolean matchFound = m.matches();
        if (matchFound)
            System.out.println("Valid Email Id.");
        else
            System.out.println("Invalid Email Id.");
    }
}