Regular Expression in Java

June 11, 2012

Quick Study of Regex:

Common syntax to remember:

Regular Expression	Description
`.`	Matches any single character (by default, anything except a line terminator)
`^regex`	regex must match at the beginning of the line
`regex$`	regex must match at the end of the line
`[abc]`	Set definition, can match the letter a or b or c
`[abc][vz]`	Set definition, can match a or b or c followed by either v or z
`[^abc]`	A "^" as the first character inside [] negates the set, so this matches any character except a, b, or c
`[a-d1-7]`	Ranges: matches a single letter from a to d or a single digit from 1 to 7; it does not match the two-character string d1
`X|Z`	Finds X or Z
`XZ`	Finds X directly followed by Z
`$`	Checks if a line end follows
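
The rows above can be tried out with a small program. This is only a minimal sketch: the class name BasicSyntaxDemo and the helper found() are made up for illustration, and the sample strings are arbitrary.

```java
import java.util.regex.Pattern;

class BasicSyntaxDemo {

    // Hypothetical helper: true if the regex matches somewhere inside the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(found("^Java", "Java regex"));   // true: "Java" is at the beginning
        System.out.println(found("regex$", "Java regex"));  // true: "regex" is at the end
        System.out.println(found("[abc][vz]", "xbz"));      // true: b followed by z
        System.out.println(found("[^abc]", "abc"));         // false: only a, b and c occur
        System.out.println(found("X|Z", "only Z here"));    // true: Z is present
        System.out.println(found("XZ", "X Z"));             // false: X is not directly followed by Z

        // String.matches() must match the whole string, and a set matches one character.
        System.out.println("d".matches("[a-d1-7]"));        // true
        System.out.println("d1".matches("[a-d1-7]"));       // false: two characters
    }
}
```

Note that find() looks for a match anywhere in the input, while String.matches() requires the whole string to match.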

Metacharacters:
Metacharacters are predefined shorthands for common character sets, e.g. use \d instead of [0-9].

Regular Expression	Description
`\d`	Any digit, short for [0-9]
`\D`	A non-digit, short for [^0-9]
`\s`	A whitespace character, short for [ \t\n\x0b\r\f]
`\S`	A non-whitespace character, short for [^\s]
`\w`	A word character, short for [a-zA-Z_0-9]
`\W`	A non-word character, short for [^\w]
`\S+`	Several non-whitespace characters
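
In a Java string literal the backslash itself has to be escaped, so \d is written as "\\d". The following sketch shows a few of the metacharacters in use; the class and method names (MetacharacterDemo, isDigits, splitOnWhitespace, stripNonWord) are hypothetical and only serve the example.

```java
class MetacharacterDemo {

    // True if the whole string consists of one or more digits.
    static boolean isDigits(String s) {
        return s.matches("\\d+");
    }

    // Splits on runs of whitespace (spaces, tabs, line breaks).
    static String[] splitOnWhitespace(String s) {
        return s.trim().split("\\s+");
    }

    // Removes every character that is not a word character.
    static String stripNonWord(String s) {
        return s.replaceAll("\\W", "");
    }

    public static void main(String[] args) {
        System.out.println(isDigits("2012"));                        // true
        System.out.println(isDigits("20a12"));                       // false
        System.out.println(splitOnWhitespace("June  11,\t2012").length); // 3
        System.out.println(stripNonWord("a_b-c 9!"));                // a_bc9
    }
}
```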

Quantifiers:
A quantifier defines how often an element can occur.

Regular Expression	Description	Examples
`*`	Occurs zero or more times, is short for {0,}	X* - Finds zero or more letters X, .* - any character sequence
`+`	Occurs one or more times, is short for {1,}	X+ - Finds one or more letters X
`?`	Occurs zero or one time, is short for {0,1}	X? - Finds zero or exactly one letter X
`{X}`	Occurs exactly X times; {} gives the repetition count of the preceding element	\d{3} - Three digits, .{10} - any character sequence of length 10
`{X,Y}`	Occurs between X and Y times	\d{1,4} - \d must occur at least once and at most four times
`*?`	? after a quantifier makes it a "reluctant quantifier": it tries to find the smallest match	.*? - the shortest character sequence that still lets the match succeed
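
The difference between a greedy and a reluctant quantifier is easiest to see side by side. The sketch below is only an illustration; QuantifierDemo and firstMatch() are made-up names and the input strings are arbitrary.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {

    // Hypothetical helper: returns the first match of the regex in the input, or null.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String html = "<b>bold</b>";
        System.out.println(firstMatch("<.*>", html));    // <b>bold</b>  (greedy: as much as possible)
        System.out.println(firstMatch("<.*?>", html));   // <b>          (reluctant: as little as possible)

        System.out.println(firstMatch("X+", "aXXXb"));   // XXX
        System.out.println("".matches("X*"));            // true: zero occurrences are allowed
        System.out.println("123".matches("\\d{3}"));     // true: exactly three digits
        System.out.println("12345".matches("\\d{1,4}")); // false: more than four digits
    }
}
```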

Grouping and Back reference:
Using (), one can group parts of a regular expression. In a replacement string, one can refer to the groups via $: $1 is the first group, $2 the second, etc.
Let's assume, for example, that you want to remove all whitespace between a word character and a following point (dot) or comma.

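package com.Nur;

public class Testing {

    public static final String EXAMPLE_TEST = "This is my small example , full . nochange."
        + "string which I'm going to " + "use for pattern matching.";

    public static void main(String[] args) {
        // Group 1: the word character, group 2: the whitespace, group 3: the dot or comma.
        String pattern = "(\\w)(\\s+)([\\.,])";
        // Keep groups 1 and 3 and drop the whitespace in group 2.
        // Using only "$3" would also delete the letter before the whitespace.
        System.out.println(EXAMPLE_TEST.replaceAll(pattern, "$1$3"));
    }

}

Output: This is my small example, full. nochange.string which I'm going to use for pattern matching.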
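
Groups can also be read directly from a Matcher instead of being used in a replacement string. The following is a rough sketch under that assumption; GroupDemo, swapWords() and year() are hypothetical names, and the date format is just an example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {

    // Swaps two words separated by a single space, using $1 and $2 in the replacement.
    static String swapWords(String s) {
        return s.replaceAll("(\\w+) (\\w+)", "$2 $1");
    }

    // Extracts the year from a date such as "June 11, 2012" via group 3, or returns null.
    static String year(String date) {
        Matcher m = Pattern.compile("(\\w+) (\\d{1,2}), (\\d{4})").matcher(date);
        return m.matches() ? m.group(3) : null;
    }

    public static void main(String[] args) {
        System.out.println(swapWords("hello world"));  // world hello
        System.out.println(year("June 11, 2012"));     // 2012
        System.out.println(year("no date here"));      // null
    }
}
```

Matcher.group(0) returns the whole match, while group(1), group(2), and so on correspond to $1, $2, etc. in a replacement string.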
