learn-regex: A repository from Rosc

What is a Regular Expression?

A regular expression is a group of characters or symbols which is used to find a specific pattern in a text.

A regular expression is a pattern that is matched against a subject string from left to right. Regular expressions are used to replace text within a string, validate forms, extract a substring based on a pattern match, and much more. The term "regular expression" is a mouthful, so you will usually find it abbreviated to "regex" or "regexp".

Imagine you are writing an application and you want to set the rules for when a user chooses their username. We want to allow the username to contain letters, numbers, underscores and hyphens. We also want to limit the number of characters in the username so it does not look ugly. We can use the following regular expression to validate the username:

The regular expression above accepts the strings john_doe, jo-hn_doe and john12_as. It does not match Jo because that string contains an uppercase letter and is also too short.
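As a minimal sketch of such a username check in Python: the pattern below is one that is consistent with the rules described, and the length bounds of 3 to 15 characters are an assumption made for illustration. The helper name `is_valid_username` is hypothetical.

```python
import re

# Assumed pattern: lowercase letters, digits, underscores and hyphens,
# between 3 and 15 characters long (the bounds are illustrative).
USERNAME_RE = re.compile(r"^[a-z0-9_-]{3,15}$")


def is_valid_username(name: str) -> bool:
    """Return True if the whole string satisfies the assumed username rule."""
    # fullmatch also rejects a trailing newline, which a bare `$` would allow.
    return USERNAME_RE.fullmatch(name) is not None


for candidate in ["john_doe", "jo-hn_doe", "john12_as", "Jo"]:
    print(candidate, is_valid_username(candidate))
```

With these assumed bounds, the first three candidates are accepted and Jo is rejected.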

Basic Matchers
Meta Characters
Shorthand Character Sets
Lookarounds
Flags
Greedy vs Lazy Matching

1. Basic Matchers

A regular expression is just a pattern of characters that we use to perform a search in a text. For example, the regular expression `the` means: the letter `t`, followed by the letter `h`, followed by the letter `e`.

"the" => The fat cat sat on the mat.
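The same literal match can be tried in Python, shown here as a small sketch using the standard `re` module. Literal patterns are case-sensitive by default, so `the` does not match the capitalised `The` at the start of the sentence.

```python
import re

text = "The fat cat sat on the mat."

# "the" skips the capitalised "The" and matches the later lowercase word.
lower_match = re.search(r"the", text)
print(lower_match.span())  # (19, 22)

# "The" matches at the very start of the sentence.
upper_match = re.search(r"The", text)
print(upper_match.span())  # (0, 3)
```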

2. Meta Characters

Meta character	Description
.	Period. Matches any single character except a line break.
[ ]	Character class. Matches any character contained between the square brackets.
[^ ]	Negated character class. Matches any character that is not contained between the square brackets.
*	Matches 0 or more repetitions of the preceding symbol.
+	Matches 1 or more repetitions of the preceding symbol.
?	Makes the preceding symbol optional.
{n,m}	Braces. Matches at least "n" but not more than "m" repetitions of the preceding symbol.
(xyz)	Character group. Matches the characters xyz in that exact order.
|	Alternation. Matches either the characters before or the characters after the symbol.
\	Escapes the next character. This allows you to match reserved characters `[ ] ( ) { } . * + ? ^ $ \ |`
^	Matches the beginning of the input.
$	Matches the end of the input.
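The meta characters in the table can be explored with a short Python sketch. The sample strings other than the cat sentence are made up for illustration, and the behaviour shown is that of Python's `re` module, which may differ in small details from other regex engines.

```python
import re

text = "The fat cat sat on the mat."

any_char = re.findall(r".at", text)        # . matches any single character
char_class = re.findall(r"[Tt]he", text)   # [ ] matches one listed character
negated = re.findall(r"[^c]at", text)      # [^ ] matches one unlisted character
star = re.findall(r"lo*l", "ll lol lool")  # * allows zero or more "o"
plus = re.findall(r"lo+l", "ll lol lool")  # + requires at least one "o"
optional = re.findall(r"colou?r", "color colour")    # ? makes "u" optional
braces = re.findall(r"[0-9]{2,3}", "1 22 333 4444")  # 2 to 3 digits per match
group = re.search(r"(c|f)at", text).group(1)  # ( ) captures the first letter
alternation = re.findall(r"fat|mat", text)    # | matches either side
escaped = re.findall(r"mat\.", text)          # \. matches a literal period
starts = re.search(r"^The", text) is not None    # ^ anchors at the start
ends = re.search(r"mat\.$", text) is not None    # $ anchors at the end

print(any_char)     # ['fat', 'cat', 'sat', 'mat']
print(negated)      # ['fat', 'sat', 'mat']
print(star, plus)   # ['ll', 'lol', 'lool'] ['lol', 'lool']
print(braces)       # ['22', '333', '444']
```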

3. Shorthand Character Sets

Shorthand	Description
.	Any character except a new line.
\w	Matches alphanumeric characters and the underscore: `[a-zA-Z0-9_]`
\W	Matches any character that is not alphanumeric or an underscore: `[^\w]`
\d	Matches digits: `[0-9]`
\D	Matches non-digits: `[^\d]`
\s	Matches whitespace characters: `[\t\n\f\r\p{Z}]`
\S	Matches non-whitespace characters: `[^\s]`
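A brief Python sketch of the shorthand sets follows; the sample strings are invented for illustration. One caveat specific to Python: on `str` patterns, `\w`, `\d` and `\s` are Unicode-aware by default, and passing `re.ASCII` restricts them to the ASCII ranges listed in the table.

```python
import re

sample = "Room 42, floor 7"

digits = re.findall(r"\d+", sample)       # runs of digits
non_digits = re.findall(r"\D+", sample)   # runs of anything but digits
words = re.findall(r"\w+", sample)        # runs of letters, digits, underscore
non_words = re.findall(r"\W+", sample)    # runs of everything else
spaces = re.findall(r"\s", sample)        # individual whitespace characters
non_spaces = re.findall(r"\S+", sample)   # runs of non-whitespace

# Unicode-aware by default; re.ASCII narrows \w to [a-zA-Z0-9_].
unicode_words = re.findall(r"\w+", "na\u00efve")
ascii_words = re.findall(r"\w+", "na\u00efve", re.ASCII)

print(digits)       # ['42', '7']
print(words)        # ['Room', '42', 'floor', '7']
print(ascii_words)  # ['na', 've']
```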

4. Lookarounds

Symbol	Description
?=	Positive Lookahead
?!	Negative Lookahead
?<=	Positive Lookbehind
?<!	Negative Lookbehind
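The four lookarounds can be sketched in Python as follows. Each one checks the text next to a match without including it in the result. Note that Python's `re` module requires lookbehind patterns to have a fixed width, which the `[Tt]he\s` used here satisfies; other engines may be more permissive.

```python
import re

text = "The fat cat sat on the mat."

# Positive lookahead: "The"/"the" only when followed by " fat".
pos_ahead = re.findall(r"[Tt]he(?=\sfat)", text)

# Negative lookahead: "The"/"the" only when NOT followed by " fat".
neg_ahead = re.findall(r"[Tt]he(?!\sfat)", text)

# Positive lookbehind: "fat"/"mat" only when preceded by "The " or "the ".
pos_behind = re.findall(r"(?<=[Tt]he\s)(?:fat|mat)", text)

# Negative lookbehind: "?at" words NOT preceded by "The " or "the ".
neg_behind = re.findall(r"(?<![Tt]he\s)[a-z]at", text)

print(pos_ahead)   # ['The']
print(neg_ahead)   # ['the']
print(pos_behind)  # ['fat', 'mat']
print(neg_behind)  # ['cat', 'sat']
```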

5. Flags

Flag	Description
i	Case insensitive: Sets matching to be case-insensitive.
g	Global search: Matches all instances, not just the first.
m	Multiline: The anchors `^` and `$` match at the start and end of each line.
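A rough Python equivalent of the three flags is sketched below. Python's `re` module has `re.IGNORECASE` and `re.MULTILINE`, but no `g` flag; the closest analogue, used here as a stand-in, is choosing `re.findall` (all matches) over `re.search` (first match only). The multi-line sample string is made up for illustration.

```python
import re

text = "The fat cat sat on the mat."

# i: case-insensitive matching finds both "The" and "the".
case_insensitive = re.findall(r"the", text, re.IGNORECASE)

# g: no flag in Python; search stops at the first match, findall returns all.
first_only = re.search(r".at", text).group()
all_matches = re.findall(r".at", text)

# m: with re.MULTILINE, $ also matches at the end of every line.
lines = "The fat\ncat sat\non the mat."
single_line = re.findall(r".at\.?$", lines)
multi_line = re.findall(r".at\.?$", lines, re.MULTILINE)

print(case_insensitive)  # ['The', 'the']
print(first_only)        # fat
print(single_line)       # ['mat.']
print(multi_line)        # ['fat', 'sat', 'mat.']
```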
