Java Regular Expressions

What is a Regular Expression?

A regular expression (regex or regexp) is a sequence of characters that defines a search pattern. It is used to match and manipulate strings, and it offers a concise, flexible way to describe text patterns. Regular expressions are widely used in programming, text processing, and data validation.

Java has no regular expression literal syntax. Instead, regex support is provided by the java.util.regex package of the standard library, which we import to work with regular expressions. The package includes the following classes:

Pattern Class - Defines a compiled pattern (to be used in a search)
Matcher Class - Used to search an input string for the pattern
PatternSyntaxException Class - Indicates a syntax error in a regular expression pattern

Example

    
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class EmailValidator {

        public static void main(String[] args) {
            // Example email addresses
            String email1 = "user@example.com";
            String email2 = "invalid.email";        // no '@'
            String email3 = "another_user@domain";  // no '.' after the domain name

            // Regular expression for a simple email validation
            String regex = "^[a-zA-Z0-9_]+@[a-zA-Z0-9]+\\.[a-zA-Z]{2,}$";

            // Compile the regular expression
            Pattern pattern = Pattern.compile(regex);

            // Create Matcher objects
            Matcher matcher1 = pattern.matcher(email1);
            Matcher matcher2 = pattern.matcher(email2);
            Matcher matcher3 = pattern.matcher(email3);

            // matches() returns true only if the entire input matches the pattern
            System.out.println("Email 1 is valid: " + matcher1.matches()); // true
            System.out.println("Email 2 is valid: " + matcher2.matches()); // false
            System.out.println("Email 3 is valid: " + matcher3.matches()); // false
        }
    }
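
The example above uses matches(), which succeeds only when the whole input matches the pattern. To locate occurrences inside a longer string, Matcher.find() can be used instead, and an invalid pattern can be detected by catching PatternSyntaxException. The following is a minimal sketch of both ideas. The class name RegexSearchSketch and the helper methods findAll and isValidRegex are hypothetical names chosen for illustration, not part of java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class RegexSearchSketch {

    // Hypothetical helper: collects every substring of text that matches regex,
    // scanning from left to right with Matcher.find().
    public static List<String> findAll(String regex, String text) {
        List<String> hits = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(text);
        while (matcher.find()) {
            hits.add(matcher.group()); // group() returns the text of the current match
        }
        return hits;
    }

    // Hypothetical helper: reports whether regex compiles without a syntax error.
    public static boolean isValidRegex(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(findAll("\\d+", "order 12, item 345")); // [12, 345]
        System.out.println(isValidRegex("\\d+"));                  // true
        System.out.println(isValidRegex("(abc"));                  // false: unclosed group
    }
}
```

PatternSyntaxException is an unchecked exception, so catching it is optional. It is mainly useful when the pattern comes from user input or configuration rather than from a string literal in the source.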

Regular Expression Syntax

The following table lists commonly used regular expression metacharacters supported in Java:

Subexpression	Matches
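^	Matches the beginning of the input. With the MULTILINE flag (?m), it also matches at the beginning of each line.
$	Matches the end of the input, or just before a final line terminator. With the MULTILINE flag (?m), it also matches at the end of each line.
.	Matches any single character except a line terminator. With the DOTALL flag (?s), it matches line terminators as well.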
[...]	Matches any single character in brackets.
[^...]	Matches any single character not in brackets.
\A	Matches the beginning of the entire input.
\z	Matches the end of the entire input.
\Z	Matches the end of the input, or just before a final line terminator if one exists.
re*	Matches 0 or more occurrences of the preceding expression.
re+	Matches 1 or more occurrences of the preceding expression.
re?	Matches 0 or 1 occurrence of the preceding expression.
re{n}	Matches exactly n occurrences of the preceding expression.
re{n,}	Matches n or more occurrences of the preceding expression.
re{n,m}	Matches at least n and at most m occurrences of the preceding expression.
a|b	Matches either a or b.
(re)	Groups regular expressions and remembers the matched text.
(?:re)	Groups regular expressions without remembering the matched text.
(?>re)	Matches the independent pattern without backtracking.
\w	Matches a word character. Equivalent to [a-zA-Z_0-9].
\W	Matches a nonword character.
\s	Matches a whitespace character. Equivalent to [ \t\n\x0B\f\r].
\S	Matches a nonwhitespace character.
\d	Matches a digit. Equivalent to [0-9].
\D	Matches a nondigit.
\G	Matches the point where the last match finished.
\n	Back-reference to capturing group number n, where n is a digit (for example, \1 refers to the text matched by the first group).
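
A few of the entries in the table are easier to see in running code. The sketch below illustrates the line anchors with and without Pattern.MULTILINE, the dot with and without Pattern.DOTALL, a capturing group with a back-reference, and a bounded quantifier. The class name RegexSyntaxSketch, the helper methods countMatches and firstRepeatedWord, and the sample strings are assumptions made for illustration only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexSyntaxSketch {

    // Hypothetical helper: counts non-overlapping matches of pattern in text.
    public static int countMatches(Pattern pattern, String text) {
        Matcher matcher = pattern.matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    // Hypothetical helper: returns the first word that appears twice in a row
    // (separated by one space), or null if there is none.
    public static String firstRepeatedWord(String text) {
        // (\w+) captures a word; \1 must then match the same text again.
        Matcher matcher = Pattern.compile("\\b(\\w+) \\1\\b").matcher(text);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        String lines = "alpha\nbeta\ngamma";

        // ^ matches only at the start of the input unless MULTILINE is set.
        System.out.println(countMatches(Pattern.compile("^\\w+"), lines));                    // 1
        System.out.println(countMatches(Pattern.compile("^\\w+", Pattern.MULTILINE), lines)); // 3

        // . does not match a line terminator unless DOTALL is set.
        System.out.println(Pattern.compile("alpha.beta").matcher(lines).find());                 // false
        System.out.println(Pattern.compile("alpha.beta", Pattern.DOTALL).matcher(lines).find()); // true

        // Capturing group plus back-reference.
        System.out.println(firstRepeatedWord("this is is a test")); // is

        // Bounded quantifier: between 2 and 4 digits.
        System.out.println("2024".matches("\\d{2,4}"));  // true
        System.out.println("20245".matches("\\d{2,4}")); // false
    }
}
```

The flags can also be embedded in the pattern itself, for example "(?m)^\\w+" in place of passing Pattern.MULTILINE to Pattern.compile.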