"Higher-order" refers to the complexity of both the data and the expression. While a classic regex matches sequences of characters, HighRegex matches sequences of any type, for example tokens in text processing or datapoints.
In functional programming, higher-order functions are functions that take other functions as parameters and/or return functions. Similarly, higher-order regular expressions are composed of other regular expressions.
- Classes of arbitrary data types. Matches exactly 1 position. (See IClass)
- Expressions of arbitrary data types. Matches zero or more positions. (See IExpression)
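
The split between classes and expressions can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not HighRegex's API: `from_class`, `sequence_of`, and the two predicates are hypothetical names. A class is modelled as a test on one element, an expression as a function that reports how many positions it consumes, and `sequence_of` shows the "higher-order" step of building an expression out of other expressions.

```python
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")

# A "class" tests exactly one element of the sequence.
ElementClass = Callable[[T], bool]
# An "expression" returns how many positions it consumes starting at `start`
# (possibly zero), or None when it does not match there.
Expression = Callable[[Sequence[T], int], Optional[int]]


def from_class(cls):
    """Lift a one-element class into an expression that consumes one position."""
    def match(seq, start):
        if start < len(seq) and cls(seq[start]):
            return 1
        return None
    return match


def sequence_of(*parts):
    """Build one expression out of other expressions, matched back to back."""
    def match(seq, start):
        pos = start
        for part in parts:
            length = part(seq, pos)
            if length is None:
                return None
            pos += length
        return pos - start
    return match


def is_straight(move):
    return move == "straight"


def is_turn(move):
    return move in {"left", "right"}


# The elements are whole tokens, not characters.
moves = ["straight", "left", "right"]
straight_then_turn = sequence_of(from_class(is_straight), from_class(is_turn))

print(straight_then_turn(moves, 0))  # 2
print(straight_then_turn(moves, 1))  # None
```

Because the predicates receive whole elements, the same combinators would work unchanged on numbers, records, or any other element type.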
- Custom classes `[left straight]`, `[^right]` (See SetClass and NotClass)
- Any class `.` (See AnyClass)
- Repetition `straight+`, `left{3,6}` (See GreedyRepeatExpression)
- Optionals and lazy repetition `straight?`, `straight??`, `left{3,6}?` (See RepeatExpression)
- Alternation `left|right`, `right|straight` (See AlternationExpression)
- Sequential `a b c`, `right left` (See ListExpression)
- Atomic anchors: beginning `^` and end `$` of sequence. (See StartExpression and EndExpression)
- Atomic look ahead `(?=left)` and negatives `(?!right)`. (See LookAheadExpression and NegativeLookAheadExpression)
- Atomic look behind `(?<=left)` and negatives `(?<!right)`. (See LookBackExpression and NegativeLookBackExpression)
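
To make the difference between greedy repetition, lazy repetition, and zero-width look-ahead concrete, here is a small backtracking sketch. It is an illustration in Python under stated assumptions, not how HighRegex implements these features: every name (`one`, `repeat`, `then`, `look_ahead`, `first_end`) is hypothetical. Each expression yields the end positions it can reach in order of preference, which is enough to express "longest first" versus "shortest first".

```python
def one(cls):
    """Consume one element accepted by the predicate `cls`."""
    def match(seq, start):
        if start < len(seq) and cls(seq[start]):
            yield start + 1
    return match


def repeat(expr, lo, hi, greedy=True):
    """Match `expr` between `lo` and `hi` times; longest first when greedy."""
    def match(seq, start, count=0):
        if count >= lo and not greedy:
            yield start  # lazy: offer the shortest match first
        if count < hi:
            for end in expr(seq, start):
                if end > start:  # ignore empty iterations so the loop terminates
                    yield from match(seq, end, count + 1)
        if count >= lo and greedy:
            yield start  # greedy: offer shorter matches only as a fallback
    return match


def then(first, second):
    """Sequential composition that backtracks into `first` when `second` fails."""
    def match(seq, start):
        for mid in first(seq, start):
            yield from second(seq, mid)
    return match


def look_ahead(expr, negate=False):
    """Zero-width assertion: succeeds or fails without consuming a position."""
    def match(seq, start):
        found = any(True for _ in expr(seq, start))
        if found != negate:
            yield start
    return match


def first_end(expr, seq, start=0):
    """Return the preferred end position, or None when nothing matches."""
    return next(expr(seq, start), None)


moves = ["left", "left", "left", "left", "right"]
left = one(lambda m: m == "left")
right = one(lambda m: m == "right")

print(first_end(repeat(left, 2, 3), moves))                # 3, like left{2,3}
print(first_end(repeat(left, 2, 3, greedy=False), moves))  # 2, like left{2,3}?

# Greedy repetition gives back one "left" so the rest of the pattern can match.
print(first_end(then(repeat(left, 2, 4), then(left, right)), moves))  # 5

# left(?!right): the look-ahead rejects position 3 without consuming anything.
not_before_right = then(left, look_ahead(right, negate=True))
print(first_end(not_before_right, moves, 0))  # 1
print(first_end(not_before_right, moves, 3))  # None
```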
- Grouping and back-referencing `(?<lastName>name)`
  - Grouping would be a great feature... Hopefully that is coming soon.
  - Back-referencing `\1` would be a nice feature, but may add complexity. Ideally, if breaking changes are necessary to support this, it would be opt-in.
- Modifiers `(?i)`, etc.
  - Since you have complete control over the classes, modifiers are probably not necessary.
- Atomic Grouping `(?>right|left)` is not supported.
  - This would be a nice feature, and probably not too difficult.
- Continuing matches `\G` is not supported.
  - Indirect support with IExpression.IsMatchAt
- Comments. `(?# not supported)`
  - You'll probably be using an expression composed of self-documenting, named expressions, making comments unnecessary.
- Replacements. `Regex.Replace("ABC", "[AEIOU]", "o")`
  - This would be an interesting way to do substitutions or perhaps delete "bad data" from a sequence.
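
As a rough idea of what replacement over a non-character sequence could look like, the sketch below scans a list, asks a "does it match at this position" function at each index, and resumes from where the previous match ended. It is a Python illustration only: `replace`, `run_of`, and `match_at` are hypothetical names, `match_at` is merely a stand-in for the kind of check that IExpression.IsMatchAt provides (its real signature is not assumed here), and the `-999.0` sentinel is invented sample data.

```python
def replace(seq, match_at, replacement):
    """Return a new list with each non-overlapping match swapped for `replacement`."""
    out = []
    pos = 0
    while pos < len(seq):
        end = match_at(seq, pos)
        if end is not None and end > pos:
            out.extend(replacement)  # an empty replacement deletes the match
            pos = end                # continue from where the match ended
        else:
            out.append(seq[pos])     # no match here: keep the element
            pos += 1
    return out


def run_of(cls):
    """Match one or more consecutive elements accepted by `cls`; return the end index."""
    def match_at(seq, start):
        end = start
        while end < len(seq) and cls(seq[end]):
            end += 1
        return end if end > start else None
    return match_at


def is_bad(reading):
    return reading == -999.0  # made-up sentinel for a missing reading


readings = [20.1, -999.0, -999.0, 20.4, 20.3, -999.0]

cleaned = replace(readings, run_of(is_bad), [])
print(cleaned)  # [20.1, 20.4, 20.3]

flagged = replace(readings, run_of(is_bad), [None])
print(flagged)  # [20.1, None, 20.4, 20.3, None]
```

Returning a new list rather than editing in place keeps the input usable for further matching, which seems a reasonable default for a feature that is still only an idea.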