Regular expressions for Haskell.

licensePlate :: Text -> Maybe Text
licensePlate = match "[A-Z]{3}[0-9]{3,4}"

licensePlates :: Text -> [Text]
licensePlates = match "[A-Z]{3}[0-9]{3,4}"

case "The quick brown fox" of
    [regex|\bbrown\s+(?<animal>[A-Za-z]+)\b|] -> Text.putStrLn animal
    _                                         -> error "nothing brown"

let kv'd = lined . packed . [_regex|(?x)  # Extended PCRE2 syntax
        ^\s*          # Ignore leading whitespace
        ([^=:\s].*?)  # Capture the non-empty key
        \s*           # Ignore trailing whitespace
        [=:]          # Separator
        \s*           # Ignore leading whitespace
        (.*?)         # Capture the possibly-empty value
        \s*$          # Ignore trailing whitespace
    |]

forMOf kv'd file $ execStateT $ do
    k <- gets $ capture @1
    v <- gets $ capture @2
    liftIO $ Text.putStrLn $ "found " <> k <> " set to " <> v

    case myMap ^. at k of
        Just v' | v /= v' -> do
            liftIO $ Text.putStrLn $ "setting " <> k <> " to " <> v'
            _capture @2 .= v'
        _ -> liftIO $ Text.putStrLn "no change"
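
The first teaser uses the same `match` call at both `Maybe Text` and `[Text]`. One way to get that behaviour is to offer results through `Alternative`, so the caller's type decides whether it sees the first match or all of them. The sketch below illustrates the idea without the library: `digitRuns` and `matchDigits` are hypothetical stand-ins written for this example, and a toy "runs of digits" scanner takes the place of a real regex engine.

```haskell
import Control.Applicative (Alternative, empty, (<|>))
import Data.Char (isDigit)

-- Hypothetical stand-in for a regex engine (not part of the library):
-- every maximal run of digits in the subject, left to right.
digitRuns :: String -> [String]
digitRuns s = case dropWhile (not . isDigit) s of
    ""   -> []
    rest -> let (run, rest') = span isDigit rest
            in  run : digitRuns rest'

-- Offer each result via Alternative. At Maybe, (<|>) keeps the first
-- success; at list, it concatenates, so every match is returned.
matchDigits :: Alternative f => String -> f String
matchDigits = foldr (\m acc -> pure m <|> acc) empty . digitRuns

main :: IO ()
main = do
    print (matchDigits "ab12cd345" :: Maybe String)  -- Just "12"
    print (matchDigits "ab12cd345" :: [String])      -- ["12","345"]
    print (matchDigits "no digits" :: Maybe String)  -- Nothing
```

Because laziness stops the fold once `Maybe` has its answer, the first-match case does no more scanning than it needs to in this sketch.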

- No opaque "Regex" object. Instead, quiet functions with simple types: for the most part it's Text (pattern) -> Text (subject) -> result. Use partial application to create performant, compile-once-match-many code.
- No custom typeclasses.
- A single datatype for both compile and match options, the Option monoid.
- Text everywhere.
- Match success expressed via Alternative.
- Opt-in Template Haskell facilities for compile-time verification of patterns, indexing captures, and memoizing inline regexes.
- Opt-in lens support.
- No failure monads to express compile errors, preferring pure functions and throwing imprecise exceptions with pretty Show instances. Write simple code and debug it. Or, don't, and use the Template Haskell features instead. Both are first-class.
- Broad exposure of PCRE2 functionality. We can even register Haskell callbacks to run during matching!
- Zero-copying of substrings where beneficial. Benchmarks show a 10× speedup over pcre-light, and 20× over regex-pcre, for longer captures.
- Few dependencies.
- Bundled, statically-linked UTF-16 build of up-to-date PCRE2 (version 10.39), with a complete, exposed Haskell binding.

Planned work:
- Many performance optimizations. Currently we are as much as 2–3× slower than other libraries for some operations, although things are improving. (We are already faster than regex-base/regex-pcre when working with Text, even without zero-copying.) If regex processing really is the bottleneck, pcre-light/-heavy/lens-regex-pcre are recommended instead of this library for the very best performance.
- Make use of DFA matching and JIT compilation.
- Improve PCRE2 C compile time.
- Add splitting support.
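
The feature list's points about partial application and the `Option` monoid can be sketched roughly as follows. This assumes the module is `Text.Regex.Pcre2` and that it exports `match`, `matchOpt`, and the `Option` constructors `Caseless` and `Multiline`; check the package documentation before relying on these names. `plateAnyCase` and the sample inputs are made up for illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import           Data.Text        (Text)
import qualified Data.Text.IO     as Text
-- Assumed module and export names; verify against the installed version.
import           Text.Regex.Pcre2 (Option (Caseless, Multiline), match, matchOpt)

-- Partially applying `match` to the pattern gives a reusable matcher.
-- Binding it to a name, instead of rebuilding it per call, is the
-- compile-once-match-many style the feature list describes.
licensePlate :: Text -> Maybe Text
licensePlate = match "[A-Z]{3}[0-9]{3,4}"

-- Options are combined with (<>) since Option is a monoid.
-- Here: case-insensitive, with ^ and $ anchoring at line boundaries.
plateAnyCase :: Text -> [Text]
plateAnyCase = matchOpt (Caseless <> Multiline) "^[a-z]{3}[0-9]{3,4}$"

main :: IO ()
main = do
    -- The same compiled matcher is reused across all three subjects.
    print $ map licensePlate ["ABC123", "no plate here", "XYZ9876 parked"]
    -- At list type, every matching line is returned.
    mapM_ Text.putStrLn $ plateAnyCase "abc123\nDEF4567\nnot a plate"
```

Choosing `Maybe Text` versus `[Text]` in the type signature is all it takes to switch between first-match and all-matches behaviour, so no separate "match all" function is needed in this style.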

License: Apache 2.0.
PCRE2 is distributed under the 3-clause BSD license.
©2020–2022 Shlomo Shuck