Panic on workingMemory when remove a rule
vitbaq opened this issue · 18 comments
Describe the bug
Hey guys, I'm working with the grule engine and I came across the following situation.
I created a microservice that uses grule to analyze data over rules. In this microservice I have an API that allows adding and deleting a rule. But the service is crashing when I remove an existing rule and add a new rule. So, I created a test application only using grule to analyze the problem and I came across a panic:
DEBU[0000] Cloning working memory POCRules:0.0.1 lib=grule-rule-engine package=ast DEBU[0000] Cloning 7 expressionSnapshotMap entries lib=grule-rule-engine package=ast panic: expression BMF.Value1 is not on the clone table
Apparently the problem is when removing the rule just on the ruleEntries but the workingMemory remains the same.
I will index the sample code that reproduce this error.
So my question is,
Am I using rule removal correctly? Or is this a bug in the removal?
To Reproduce
Code: https://gist.github.com/vitbaq/35f52f30bdfb442eb7a7a892ddb1d5d3
Log: https://gist.github.com/vitbaq/88a6f00edcadbb8724218a17551d113f
Testing environment:
- Operating System/Platform: Ubuntu 18.04.5 LTS (Bionic Beaver)
- Go version: go1.13.5 linux/amd64
- Grule-rule-engine version: 1.9.0
By simply reading your issue, I can immediately confirm this bug.
The working memory are tightly related to the AST to ensure there are no duplicated expression and expression atoms.
Removing a rule means that some expression and expression atoms in the working memory should be restructured.
Its part of the RETE logic built into Grule.
Removing a RuleEntry from KnowledgeBase sounds trivial but its quite complex underneath to do it properly.
A work around is to rebuild knowledge base each time there's a rule entry removal. I need to think about this and carefully
make it work without Breaking the working memory mechanism on cloning KnowledgeBase.
Thank you for pointing this out.
Thanks for the answer. I will do what was recommended. I will also study more of the library code to be able to help with the development and propose a solution for the bug.
Please keep me in touch if you find a solution to this challenge. Thanks.
Hi @newm4n, did you get a chance to look into this ? we have similar situation in my application like we need to remove a rule entry from the existing knowledgeBase because rebuilding knowledgebase with all the rules from scratch is expensive.
Let me know if you have any algorithm or idea in your mind, i can contribute or help you on this
Hello @vitbaq, I have a workaround(may be you can call it as a hack), reset the working memory and index it again while removing the rule entry, I have modified your example for better understanding.
// Remove the added rule (Rule1)
knowledgeBase.RemoveRuleEntry("Rule1")
//Workaround starts
delete(knowledgeLibrary.Library["POCRules:0.0.1"].RuleEntries, "Rule1")
knowledgeBase = knowledgeLibrary.GetKnowledgeBase("POCRules", "0.0.1")
knowledgeBase.WorkingMemory = ast.NewWorkingMemory("POCRules", "0.0.1")
knowledgeBase.WorkingMemory.IndexVariables()
//Workaround ends
Reason for that panic is even after removing ruleentry the working memory still contains the old data (expressions, variables etc) and this will create a problem while cloning, so i am recreating the working memory again to reflect the new changes. @newm4n, @niallnsec correct me if this leads to any problem
Full Example:
package examples
import (
"fmt"
"github.com/stretchr/testify/assert"
"testing"
"github.com/hyperjumptech/grule-rule-engine/ast"
"github.com/hyperjumptech/grule-rule-engine/builder"
"github.com/hyperjumptech/grule-rule-engine/engine"
"github.com/hyperjumptech/grule-rule-engine/pkg"
)
type POCfact struct {
Value1 int64
Value2 int64
Value3 int64
Value4 int64
}
const rule1 = `
rule Rule1 "Check the value1 value" {
when
BMF.Value1 > 100
then
Retract("Rule1");
}`
const ruleToAdd = `
rule Rule2 "Check the value2 value" {
when
BMF.Value2 > 100
then
Retract("Rule2");
}
`
func TestRuleAdd(t *testing.T) {
bmFact := &POCfact{
Value1: 1350,
Value2: 140,
}
// Create New KnowledgeLibrary
knowledgeLibrary := ast.NewKnowledgeLibrary()
ruleBuilder := builder.NewRuleBuilder(knowledgeLibrary)
dataCtx := ast.NewDataContext()
err := dataCtx.Add("BMF", bmFact)
if err != nil {
panic(err)
}
//Added a new rule
err = ruleBuilder.BuildRuleFromResource("POCRules", "0.0.1", pkg.NewBytesResource([]byte(rule1)))
// Get knowledgeBase and Engine to run FetchMatchingRules
knowledgeBase := knowledgeLibrary.NewKnowledgeBaseInstance("POCRules", "0.0.1")
assert.Equal(t, len(knowledgeBase.RuleEntries), 1)
// Remove the added rule (Rule1)
knowledgeBase.RemoveRuleEntry("Rule1")
//Workaround starts
delete(knowledgeLibrary.Library["POCRules:0.0.1"].RuleEntries, "Rule1")
knowledgeBase = knowledgeLibrary.GetKnowledgeBase("POCRules", "0.0.1")
knowledgeBase.WorkingMemory = ast.NewWorkingMemory("POCRules", "0.0.1")
knowledgeBase.WorkingMemory.IndexVariables()
//Workaround ends
assert.Equal(t, len(knowledgeBase.RuleEntries), 0)
// Adding a new rule (Rule2)
err = ruleBuilder.BuildRuleFromResource("POCRules", "0.0.1", pkg.NewBytesResource([]byte(ruleToAdd)))
knowledgeBase = knowledgeLibrary.NewKnowledgeBaseInstance("POCRules", "0.0.1")
assert.Equal(t, len(knowledgeBase.RuleEntries), 1)
engine := engine.NewGruleEngine()
rulesEntries, err := engine.FetchMatchingRules(dataCtx, knowledgeBase)
if err != nil {
panic(err)
}
for _, rl := range rulesEntries {
fmt.Printf("rules entries: %s\n", rl.RuleName)
}
}
let me know if this doesn't work you
@newm4n can you please check, is this a right way to handle this ? Sorry for asking, its a bit priority thing for me
@jinagamvasubabu, I'm very well aware of this issue.
AFAIK, due to the way RETE is implemented in Grule, where no AST node get duplicated (for speed reason).
Once a Rule get constructed into the Working Memory, its hard to remove any of its expression component from the working memory. At least to my understanding on how it works.
You should read my explanation here https://github.com/hyperjumptech/grule-rule-engine/blob/master/docs/RETE_en.md#what-is-inside-grules-working-memory
"First, it will try its best to make sure that none of the AST (Abstract Syntax Tree) nodes are duplicated."
The working memory stores the linkage of every rule entry's Expression/Expression Atoms the the actual expression stored. ITs to avoid duplication. Removing a rule entry means that we should remove the corresponding Expression/ExpressionAtom from the working memory without causing side effect in case of that Expression/Expression Atom got removed.
Imagine that we have 2 similar rule
rule One {
when A==1
then ...
}
rule Two {
when A== 2
then ...
}
You see the expression atom of variable "A" used in two rule. One and Two.
The working memory who hold the expression atom "A" holds the actual value.
And variable "A" exist in the two rule.
If we remove the rule One, all expression and expression atom should be removed as long as its not used by other rule.
This is the problem, How can we remove the expression atom "A" and expression "A==1" in rule One, without loosing reference of expression atom "A" and expression "A==2" in rule Two.
Its only a simple expression. Imagine if a rule like
Rule One
when
A.b.c == C.d.e.ToUpper() + (( E.f.DoSomething(G.h + J.k) * L.m.n / 0x232.32)
Rule Two
when
A.b.d == C.d.e.ToUpper() + 23
Rule Three
when
E.f.DoSomething(2 + Gh) < ParseInt(C.d.e)
So many expression and expression atom we need to track. Few of them being used on multiple rule entry.
I don't know if removing the RuleEntry from the RuleEntry array would be just fine without side effect.
ok, thanks for explaining, totally makes sense. what if reconstruct the working memory again after removing the ruleentry.
// Remove the added rule (Rule1)
knowledgeBase.RemoveRuleEntry("Rule1")
//Workaround starts
delete(knowledgeLibrary.Library["POCRules:0.0.1"].RuleEntries, "Rule1")
knowledgeBase = knowledgeLibrary.GetKnowledgeBase("POCRules", "0.0.1")
knowledgeBase.WorkingMemory = ast.NewWorkingMemory("POCRules", "0.0.1")
knowledgeBase.WorkingMemory.IndexVariables()
//Workaround ends
@newm4n , if the above doesn't work, can you tell me is there any way i can recreate the working memory from scratch by any chance ?
@jinagamvasubabu The more I'm thinking about it. The worst complexity developed on my mind. However. I might have a work around.
Instead of removing the RuleEntry from RuleEntries, We just flag the RuleEntry as "Deleted". Add a member variable Deleted bool
in RuleEntry with default of false
. As long as this flag is false
this RuleEntry will aways get evaluated. If we decide to remove this rule, we rename the RuleEntry into something bogus... like Deleted_j2h3k24h2kj3h4k2j432hk
. and change the Deleted
flag to True
.
This requires minimum amount of change.
- Change the GruleEngine.go to always ignore a RuleEntry if the flag
Deleted
istrue
. - Modify the
RuleEntry.go
, Add a member variableDeleted bool
, it will automatically be default tofalse
. - Modify the
KnowledgeBase.go
, change the functionRemoveRuleEntry(name)
not to remove theRuleEntry
fromRuleEntries
, but instead, change theDeleted
flag of the corresponding RuleEntry toTrue
. And rename the rule entries map key to some bogus name.
This could work and sounds like a plan. 🤞
Oh and one more....
- Make sure the
Clone
function inRuleEntry.go
to also clone thisDeleted
flag.
Thanks for this, i will quickly implement this and raise a pr.. fingers crossed
@newm4n , one more thing can i add a check to not to add already deleted rule entries in AddRuleEntry
when we are loading from GRB file (binary snapshot)
// AddRuleEntry add ruleentry into this knowledge base.
// return an error if a rule entry with the same name already exist in this knowledge base.
func (e *KnowledgeBase) AddRuleEntry(entry *RuleEntry) error {
e.lock.Lock()
defer e.lock.Unlock()
if entry.Deleted {
return nil
}
if e.ContainsRuleEntry(entry.RuleName) {
return fmt.Errorf("rule entry %s already exist", entry.RuleName)
}
e.RuleEntries[entry.RuleName] = entry
return nil
}
@newm4n , one more thing can i add a check to not to add already deleted rule entries in
AddRuleEntry
when we are loading from GRB file (binary snapshot)// AddRuleEntry add ruleentry into this knowledge base. // return an error if a rule entry with the same name already exist in this knowledge base. func (e *KnowledgeBase) AddRuleEntry(entry *RuleEntry) error { e.lock.Lock() defer e.lock.Unlock() if entry.Deleted { return nil } if e.ContainsRuleEntry(entry.RuleName) { return fmt.Errorf("rule entry %s already exist", entry.RuleName) } e.RuleEntries[entry.RuleName] = entry return nil }
No I don't think that would be necessary. This method only called by the ANTLR Listener once done parsing a GRL.
@newm4n I think i am missing something or not,
I have made the changes but looks like i am getting the deleted rule when i try to load again, I have made a draft PR for this change #252. let me know if i am missing anything
I suspect we marked delete in knowledgebase but knowledgeLibrary still pointing to the old rules
I have added a similar method RemoveRuleEntry
for KnowledgeLibrary as well, Not sure this is correct. Test is passing.
func (lib *KnowledgeLibrary) RemoveRuleEntry(ruleName, name string, version string) {
_, ok := lib.Library[fmt.Sprintf("%s:%s", name, version)]
if ok {
ruleEntry, ok := lib.Library[fmt.Sprintf("%s:%s", name, version)].RuleEntries[ruleName]
if ok {
lib.Library[fmt.Sprintf("%s:%s", name, version)].RuleEntries[ruleName].RuleName = fmt.Sprintf("Deleted_%s", ruleEntry.RuleName)
lib.Library[fmt.Sprintf("%s:%s", name, version)].RuleEntries[ruleName].Deleted = true
delete(lib.Library[fmt.Sprintf("%s:%s", name, version)].RuleEntries, ruleName)
lib.Library[fmt.Sprintf("%s:%s", name, version)].RuleEntries[ruleEntry.RuleName] = ruleEntry
}
}
}
Ah, yes. The KnowledgeBase holds by the KnowledgeLibrary is the actual one.
KnowledgeBase in the Library acts like a blue print, get copied every time we get a new KnowledgeBase instance from the Library.
Changes has been merged and its in tag v1.10.2
. @vitbaq can you try and let us know if its not working