Make something more readable than regex (I hope)
The idea is to make a regular expression language that is more readable than regex. The code only needs to translate Requiem code to regex.
Using the rules below, here are some examples:
Phone number in France:
\+33\d\d\d\d\d\d\d\d\d
or \+33\d{9}
+33|9(digit)|
Phone number in any country:
\+\d{10}
+|10(digit)|
Email:
[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}
|or(words, under(._%+-))||+|@|or(word, under(.-))||+|.|2-4(letter)|
or
|*(or(words, under(._%+-)))|@|*(word)|.|2-4(letter)|
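As a quick sanity check, the regex forms of these examples can be tried with Python's `re` module. This is only a sketch: the sample phone numbers and address are made up, the helper name `matches` is hypothetical, and the leading `+` is written as `\+` because an unescaped `+` at the start of a pattern is a quantifier with nothing to repeat.

```python
import re

# Regex forms of the examples; the leading + is escaped so it is a literal plus.
FRENCH_PHONE = r"\+33\d{9}"
ANY_PHONE = r"\+\d{10}"
EMAIL = r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}"


def matches(pattern: str, text: str) -> bool:
    """Return True when the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None


# Made-up sample values, for illustration only.
print(matches(FRENCH_PHONE, "+33123456789"))  # True
print(matches(FRENCH_PHONE, "+3312345"))      # False (too few digits)
print(matches(ANY_PHONE, "+1234567890"))      # True
print(matches(EMAIL, "someone@example.com"))  # True
```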
- Requiem
- Summary
- RULES
- One digit from 0 to 9 (\d)
- One word character (a-z, A-Z, 0-9, _) (\w)
- Whitespace character (\s)
- Any character except a digit (\D)
- Any non-word character (\W)
- A non-whitespace character (\S)
- One or more (+)
- Match an expression E exactly X times
- Match an expression E between X and Y times
- Match an expression E 0 or more times
- Match an expression E 0 or 1 time
- Any character except line break (.)
- Special character
- X or Y (|)
- Capture group (parentheses)
- Content of group X (\1 \2 \3 ...)
- Non capture group (?:)
- A character in the brackets
- A letter in the range of two characters X and Y
- under with expression
- A letter that is not in the brackets
- A letter that is not in the range of two characters X and Y
- notunder with expression
- Start of the string (^)
- End of the string ($)
- Start of the line (\A)
- End of the line (\Z)
- Start of the word (\b)
- End of the word (\b)
- Start of the string or line (\G)
Write a rule between bars: |rule|
For a literal bar, write ||
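### One digit from 0 to 9 (\d)
digit
### One word character (a-z, A-Z, 0-9, _) (\w)
word
### Whitespace character (\s)
space
### Any character except a digit (\D)
letter
### Any non-word character (\W)
nonword
### A non-whitespace character (\S)
nonspace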
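The character-class keywords can be translated with a plain lookup table. The sketch below is a minimal illustration, not the project's implementation: the pairing of each keyword with its escape is read off the order of the rule list, so treat it as an assumption, and `translate_keyword` is a hypothetical helper name.

```python
import re

# Keyword -> regex escape, assumed from the order of the rule list.
CHAR_CLASSES = {
    "digit": r"\d",
    "word": r"\w",
    "space": r"\s",
    "letter": r"\D",
    "nonword": r"\W",
    "nonspace": r"\S",
}


def translate_keyword(keyword: str) -> str:
    """Translate one character-class keyword to its regex escape."""
    try:
        return CHAR_CLASSES[keyword]
    except KeyError:
        raise ValueError(f"unknown keyword: {keyword!r}") from None


print(translate_keyword("digit"))                           # \d
print(bool(re.fullmatch(translate_keyword("space"), " ")))  # True
```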
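### One or more (+)
+
### Match an expression E exactly X times
X(E)
### Match an expression E between X and Y times
X-Y(E)
### Match an expression E 0 or more times
*(E)
### Match an expression E 0 or 1 time
?(E)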
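The repetition rules map onto regex quantifiers: `X(E)` becomes `E{X}`, `X-Y(E)` becomes `E{X,Y}`, `*(E)` becomes `E*` and `?(E)` becomes `E?`. Here is a minimal sketch of that translation. It assumes E is a single bare keyword, and both the helper name `translate_quantifier` and the small keyword table are hypothetical.

```python
import re

# Shape of a repetition rule: a prefix (X-Y, X, * or ?) followed by (E).
RULE = re.compile(r"(\d+-\d+|\d+|\*|\?)\((.+)\)")

# Small keyword table so the sketch is self-contained (assumed mapping).
KEYWORDS = {"digit": r"\d", "word": r"\w", "space": r"\s"}


def translate_quantifier(rule: str) -> str:
    """Translate a rule such as '9(digit)' into a regex such as '\\d{9}'."""
    m = RULE.fullmatch(rule)
    if m is None:
        raise ValueError(f"not a repetition rule: {rule!r}")
    prefix, expr = m.groups()
    if expr not in KEYWORDS:
        raise ValueError(f"only bare keywords are handled here: {expr!r}")
    atom = KEYWORDS[expr]
    if prefix in ("*", "?"):
        return atom + prefix
    if "-" in prefix:
        low, high = prefix.split("-")
        return f"{atom}{{{low},{high}}}"
    return f"{atom}{{{prefix}}}"


print(translate_quantifier("9(digit)"))   # \d{9}
print(translate_quantifier("2-4(word)"))  # \w{2,4}
print(translate_quantifier("*(word)"))    # \w*
```

A fuller translator would translate E recursively instead of looking it up, since E can itself be a rule.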
### Any character except line break (.)
any
### Special character
Write it as-is, except for |, which is written ||
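### X or Y (|)
or(X, Y)
### Capture group (parentheses)
group(*E)
### Content of group X (\1 \2 \3 ...)
getgroup(X)
### Non capture group (?:)
nogroup(*E)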
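The alternation and group rules can be sketched as small string builders over already-translated regex fragments. The Python function names below are hypothetical (`or_` has a trailing underscore because `or` is a reserved word), and each function takes a single expression for simplicity.

```python
import re


def or_(left: str, right: str) -> str:
    """or(X, Y) -> X|Y"""
    return f"{left}|{right}"


def group(expr: str) -> str:
    """group(E) -> (E), a capturing group"""
    return f"({expr})"


def getgroup(number: int) -> str:
    """getgroup(X) -> backreference to group X, e.g. \\1"""
    return f"\\{number}"


def nogroup(expr: str) -> str:
    """nogroup(E) -> (?:E), a non-capturing group"""
    return f"(?:{expr})"


# A word character followed by the same character again.
doubled = group(r"\w") + getgroup(1)
print(doubled)                               # (\w)\1
print(bool(re.fullmatch(doubled, "aa")))     # True
print(bool(re.fullmatch(doubled, "ab")))     # False

# Alternation wrapped in a non-capturing group so the trailing "s" applies to both.
plural = nogroup(or_("cat", "dog")) + "s"
print(bool(re.fullmatch(plural, "dogs")))    # True
```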
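### A character in the brackets
under(AEIOU)
### A letter in the range of two characters X and Y
under(X-Y)
### under with expression
underExp(*E)
### A letter that is not in the brackets
notunder(AEIOU)
### A letter that is not in the range of two characters X and Y
notunder(X-Y)
### notunder with expression
notunderExp(*E)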
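The `under` and `notunder` rules correspond to regex bracket expressions. Below is a minimal sketch, with assumptions: a `-` is passed through unchanged, so `X-Y` stays a range and a trailing `-` stays literal as in `[._%+-]`. The `underExp` and `notunderExp` variants are left out, and the escaping helper `_escape_set` is hypothetical.

```python
import re


def _escape_set(chars: str) -> str:
    """Escape the characters that are special inside [...]; '-' is left alone."""
    return "".join("\\" + c if c in "\\]^[" else c for c in chars)


def under(chars: str) -> str:
    """under(AEIOU) -> [AEIOU], under(X-Y) -> [X-Y]"""
    return "[" + _escape_set(chars) + "]"


def notunder(chars: str) -> str:
    """notunder(AEIOU) -> [^AEIOU], notunder(X-Y) -> [^X-Y]"""
    return "[^" + _escape_set(chars) + "]"


print(under("AEIOU"))                               # [AEIOU]
print(notunder("a-z"))                              # [^a-z]
print(bool(re.fullmatch(under("a-f"), "c")))        # True
print(bool(re.fullmatch(notunder("AEIOU"), "E")))   # False
```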
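### Start of the string (^)
start
### End of the string ($)
end
### Start of the line (\A)
startline
### End of the line (\Z)
endline
### Start of the word (\b)
startword
### End of the word (\b)
endword(X)
### Not the end of the word (\B)
notendword
### Start of the string or line (\G)
startstring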
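To illustrate the word-boundary keywords, here is a small sketch using Python's `re` module. The mapping of `startword` and `endword` to `\b` and of `notendword` to `\B` is assumed from the rule list, the argument of `endword(X)` is ignored, and the sample text is made up.

```python
import re

# Assumed mapping: startword/endword -> \b, notendword -> \B.
ANCHORS = {"startword": r"\b", "endword": r"\b", "notendword": r"\B"}

text = "first line"

# "line" as a whole word: a boundary on both sides.
whole_word = ANCHORS["startword"] + "line" + ANCHORS["endword"]
# "ine" only when it starts inside a word: no boundary before it.
inside_word = ANCHORS["notendword"] + "ine" + ANCHORS["endword"]

print(re.findall(whole_word, text))   # ['line']
print(re.findall(inside_word, text))  # ['ine']
print(re.findall(ANCHORS["startword"] + "ine", text))  # []
```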
If you want to add a rule, just open an issue or a pull request from a fork.