PaesslerAG/gval

getting list of parsed variables before expression is evaluated for validation

aldas opened this issue · 6 comments

aldas commented

I have the expression vtg_ground_speed_unit > 0 ? vtg_ground_speed_knots : vtg_ground_speed_kph.

These variables (vtg_ground_speed_unit, vtg_ground_speed_knots, vtg_ground_speed_kph) can only come from a fixed list (defined in a configuration file). It would be nice to have a way to get all variables from the parsed expression so they can be validated against the current configuration, to check whether they exist.

So far I have tried to create a selector that captures all variables seen during a "test evaluation".

Something like this:

type variableCaptor struct {
	variables []string
}

func (c *variableCaptor) SelectGVal(_ context.Context, variable string) (interface{}, error) {
	c.captureVariable(variable)
	return 1.0, nil
}

func (c *variableCaptor) captureVariable(variable string) {
	for _, v := range c.variables {
		if variable == v {
			return
		}
	}
	c.variables = append(c.variables, variable)
}

func TestCaptureVariables(t *testing.T) {
	var testCases = []struct {
		name       string
		expression string
		expect     []string
	}{
		{
			name:       "capture 3 variables from IF",
			expression: "vtg_ground_speed_unit > 0 ? vtg_ground_speed_knots : vtg_ground_speed_kph",
			expect:     []string{"vtg_ground_speed_unit", "vtg_ground_speed_knots", "vtg_ground_speed_kph"},
		},
	}

	for _, tc := range testCases {
		t.Run(tc.name, func(t *testing.T) {
			eval, err := compile(tc.expression, gval.Full())
			if !assert.NoError(t, err) {
				return
			}
			captor := &variableCaptor{variables: make([]string, 0)}

			_, err = eval.EvalBool(context.Background(), captor)
			if !assert.NoError(t, err) {
				return
			}
			assert.Equal(t, tc.expect, captor.variables)
		})
	}
}

But this has a limitation: vtg_ground_speed_kph is not captured, because only one branch of the ternary is evaluated. Exercising every path of the expression would require knowing which values to provide so that both the true and the false side are reached.

Is there a way to traverse the parsed tree without evaluating it and pick out all parts that are used as variables?

aldas commented

So far, what I have come up with is to add 2 extra functions to the DSL that must be used for those variables in the expression that come from configuration.

  • v(<timeserieCode>) is equivalent to a plain parameter getter
  • n(<timeserieCode>) is for cases where you need to pass the code as an argument to some other function

Then we can do a quite primitive regexp capture to get the list of all referenced configuration variables.

This does not feel elegant, but it seems to work:

var timeserieRegexp = regexp.MustCompile(`(?:^|\W)[vn]\(([a-zA-Z0-9-_]{0,100})\)`)

func TestVariableSelector(t *testing.T) {
	expression := `avg(n(me1_rpm)) + v(me2_rpm) + 0.5`

	// ------------------ related timeserie capture start ---------------------
	result := make([]string, 0)
	matches := timeserieRegexp.FindAllStringSubmatch(expression, -1)
	for _, match := range matches {
		tmp := match[1]
		if !containsString(result, tmp) {
			result = append(result, tmp)
		}
	}
	assert.Equal(t, []string{"me1_rpm", "me2_rpm"}, result)
	// ------------------ related timeserie capture end ---------------------

	value, err := gval.Evaluate(expression,
		nil,
		gval.VariableSelector(func(path gval.Evaluables) gval.Evaluable {
			return func(c context.Context, v interface{}) (interface{}, error) {
				keys, err := path.EvalStrings(c, v)
				if err != nil {
					return nil, err
				}
				return keys[0], nil
			}
		}),
		// `avg(<timeserie>)` calculates average for timeserie from cache/database
		gval.Function("avg", func(timeserieCode string) (interface{}, error) {
			if timeserieCode == "me1_rpm" { // just for testing
				return 5.0, nil
			}
			return 0.0, nil
		}),
		// `v(<timeserie>)` helps to identify all related timeseries used by their value
		gval.Function("v", func(timeserieCode string) (interface{}, error) {
			return 1.0, nil
		}),
		// `n(<timeserie>)` helps to identify all related timeseries used by their code
		gval.Function("n", func(timeserieCode string) (interface{}, error) {
			return timeserieCode, nil
		}),
	)
	assert.NoError(t, err)
	assert.Equal(t, 6.5, value)
}

func containsString(haystack []string, needle string) bool {
	for _, v := range haystack {
		if v == needle {
			return true
		}
	}
	return false
}

@aldas
Maybe you can use the Init language to do what you want:
https://pkg.go.dev/github.com/PaesslerAG/gval#Init

You would probably need to reimplement / copy the ident extension inside gval and pass the variable name through the context of the parser, maybe by storing an AddVariable closure inside the context.

From @aldas's reply, I think he just wants to get all the tokens of his expression.
So could he just use the Init language and scan all tokens with the parser, like this:

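	gval.NewLanguage(
		gval.Full(),
		gval.Init(func(ctx context.Context, parser *gval.Parser) (gval.Evaluable, error) {
			var tokens []string
			// Loop until EOF; a bare break inside a switch would only
			// leave the switch, not the surrounding for loop.
			for tok := parser.Scan(); tok != scanner.EOF; tok = parser.Scan() {
				tokens = append(tokens, parser.TokenText())
			}

			// do some check jobs on tokens

			// Placeholder: scanning consumed the input, so nothing is
			// left to parse into a real Evaluable.
			return nil, nil
		}),
	)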

I'm not sure whether this solves his question.

I have a similar use case: checking whether an entered expression is correct, for validation purposes. For example, if the expression contains a function that is not defined in my language, I have to raise an error saying the expression is incorrect, without evaluating it. The same goes for variables (e.g. foo.bar.random).
I am not able to use this method properly. Could you please help here, @skyf0cker?
Example: expr = lpadding(user.id, 10, '0'), where lpadding is a function that takes 3 arguments and user.id is a variable.
If I could get output telling me that lpadding is the function used and user.id is the variable used, I could validate it against my config, which contains the set of functions and variables allowed in expressions.

Really hope there is a solution