Use case discussion - Many, many offers with eligibility rules
alok87 opened this issue · 7 comments
Use case
We have a use case in which an offer is created with eligibility rules for a customer. These offers are created dynamically by our users, for their own customers, from a dashboard, and are managed there as well. There could be billions of offers.
How can we use this rule engine when eligibility rules change and offers are created in real time? Do we need to keep restarting the service to load the new configuration? Wouldn't loading all the rules into memory at every boot be a bad idea, since there can also be millions of offers?
What are your suggestions here?
Benchmarks of grule
https://github.com/hyperjumptech/grule-rule-engine#benchmark
@alok87 CMIIW (correct me if I'm wrong), but you want dynamic rule updates in real time at scale.
Have you tried using the RemoveRuleEntry and AddRuleEntry functions in knowledgeBase, based on rule name?
You don't need to restart the instance.
Another approach is to create a new knowledgeBase instance when a rule update happens. You can use a cache, or create a pool of knowledgeBase instances built in a separate goroutine to further improve latency (basically precomputing the knowledgeBase instances and serving from the pool as rule-execution requests come in). Be mindful of the pool size, as memory issues might pop up if the rule file is big. Then, whenever you receive a rule update, refresh the pool with new knowledgeBase instances.
I haven't used it myself, but saving rules to a DB and creating a wrapper on top of it might make more sense given your high-scale use case.
Ref: https://github.com/hyperjumptech/grule-rule-engine/blob/master/docs/en/FAQ_en.md#2-saving-rule-entry-to-database
Similar question (for Drools, which is Java-based): https://groups.google.com/g/drools-usage/c/ZDygAFCaqiM
I tried the example from your blog. It has one rule, and loading just that one rule takes ~5ms.
```
$ go run main.go
I0531 11:51:19.775975 33585 main.go:61] started program
I0531 11:51:19.781482 33585 main.go:66] loaded rule engine: 5.188709ms
I0531 11:51:19.781806 33585 main.go:86] evaluated rules: 259.709µs
```
So 99ms for 100 rules looks right (as stated in the benchmark README).
Why does it take so much time to load? How can we optimize the load?
I think creating a blank engine and adding rules (as you said above) as a knowledge base to that engine should be separate tasks. Will this help in reducing the 5ms?
I debugged more; out of the 5ms, almost all the time (~4ms) goes into walking the AST:

```go
antlr.ParseTreeWalkerDefault.Walk(listener, psr.Grl())
```

Can this be optimized or avoided?
So, there is a difference between loading rules and executing rules. As far as I'm aware, most of the time is taken in clone(). That's why I mentioned leveraging caching/pooling to precompute knowledgeBase instances. I'd recommend learning more about how the RETE algorithm works to understand this better.
https://github.com/hyperjumptech/grule-rule-engine/blob/master/docs/en/RETE_en.md
Btw, I use this in production and evaluation takes just 1-2ms. Loading, on the other hand, does take time. Given the scale you're mentioning, keep an eye on memory usage on your instances.
Ack, will read up on the library and the algorithm in detail.
How about this? It should also work at big scale 🤔
- Keep the knowledge base updated in Redis on rule-change events (always ~5ms per update; it would not grow to 100ms as it currently does when the rule set is huge).
- When an evaluate request comes in, first fetch the knowledge base from Redis, then perform the evaluation on that data.
Thanks @mkfeuhrer
@newm4n What are your thoughts on this? Should we first have cache support built in at the pod level, and then support a central cache store?
```
BenchmarkGovaluate   531122   0.002 ms/op
BenchmarkGrule         7118   0.160 ms/op
```
Ran benchmarks against govaluate; huge difference. We really need to solve this.