potential performance issues
BurntSushi opened this issue · 4 comments
BurntSushi commented
I haven't done any benchmarks, but from reading the code, I see two potentially quite large performance problems.
- You're cloning the entire
RegexSet
on each call tohandle
. It's true that cloning aRegex
or aRegexSet
is in and of itself very cheap, but in doing so you will have lost any state built up from previous matching, which will impact performance. (One might argue that this is a problem withRegex
and that it should clone the internal state, but then your clone won't be cheap.) - In
handle
, the code is compiling aRegex
for every match. Regex compilation is not something you should think of as being fast, and in this case, will probably negate many of the perf benefits. Instead, I'd recommend compiling each regex added toRegexSet
as it is built, and then using the index you get back frommatches
to find the right regex to use.
gsquire commented
You're absolutely right, I will have to revisit that part of the code. I can add that state to the structure so lookups can be faster. Thanks for taking the time to point out some flaws!
gsquire commented
So I managed to triple requests per second after this commit with my simple wrk
benchmark. It was around 85K which I was impressed with. I always take those with a grain of salt, but that's much better than before! See here: 6db7c26
gsquire commented
Also, thank you for doing another CR. I really appreciate it.
BurntSushi commented
@gsquire Looks much better! Nice work. :-)