gobwas/glob

Some combinations of wildcards inside alternates fail

hackery opened this issue · 2 comments

This glob library is used for a critical piece of functionality in Telegraf: metric filtering. There appear to be a number of cases where it silently fails to match. One that has caused us much grief recently involves multiple * wildcards inside a { } alternate construct.

Telegraf is configured with a list of patterns for a namepass/namedrop function, and internally it composes the list into a single pattern with alternates. I've reduced our test case to one similar to the samples in glob_test.go; here is the failing test surrounded by variations which all work:

glob(true, "yandex:*.exe:page.*", "yandex:service.exe:page.12345"),
glob(true, "*yandex:*.exe:page.*", "yandex:service.exe:page.12345"),
glob(true, "{*yandex:*.exe:page.*}", "yandex:service.exe:page.12345"),
glob(true, "{google.*,yandex:*.exe:page.*}", "yandex:service.exe:page.12345"),
glob(true, "{google.*,*yandex:*.exe:page.*}", "yandex:service.exe:page.12345"), // FAIL
glob(true, "{google.?,*yandex:*.exe:page.*}", "yandex:service.exe:page.12345"),
glob(true, "{google.*,*yandex:service.exe:page.*}", "yandex:service.exe:page.12345"),
glob(true, "{google.*,*yandex:*.exe:*.12345}", "yandex:service.exe:page.12345"),

The result of running this test on the current master branch is:

--- FAIL: TestGlob (0.00s)
    --- FAIL: TestGlob/#64 (0.00s)
        glob_test.go:190: pattern "{google.*,*yandex:*.exe:page.*}" matching "yandex:service.exe:page.12345" should be true but got false
            <btree:[<nil><-<any_of:[<text:`google.`>,<btree:[<contains:[yandex:]><-<text:`.exe:page.`>-><nil>]>]>-><super>]>
FAIL
exit status 1
FAIL    github.com/gobwas/glob  0.004s

This has become a serious problem for us: large numbers of metrics are being sent to InfluxDB, which over time overwhelm it (and post-hoc deletions are unusably slow).
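
For readers unfamiliar with the setup, the composition step described above can be sketched roughly as follows. This is an illustration only, not Telegraf's actual code: the function name composeAlternates is hypothetical, and the only assumption is that the configured patterns are joined with commas and wrapped in braces.

```go
package main

import (
	"fmt"
	"strings"
)

// composeAlternates is a hypothetical helper (not Telegraf's real code).
// It joins a list of glob patterns into a single pattern that uses the
// { } alternate construct, e.g. ["a*", "b*"] becomes "{a*,b*}".
func composeAlternates(patterns []string) string {
	switch len(patterns) {
	case 0:
		return ""
	case 1:
		// A single pattern needs no alternate wrapper.
		return patterns[0]
	}
	return "{" + strings.Join(patterns, ",") + "}"
}

func main() {
	namepass := []string{"google.*", "*yandex:*.exe:page.*"}
	fmt.Println(composeAlternates(namepass))
}
```

With the two patterns from the failing case, this produces {google.*,*yandex:*.exe:page.*}, which is exactly the pattern that fails to match in the test above.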

j3h commented

I ran into a bug that may or may not be the same, and created a minimal reproduction. Short version is that {,*}x works fine, but {*,}x does not.

I also ran into this in Bud where the following test was failing (in this case matching unexpectedly):

func TestMatchSubdir(t *testing.T) {
	is := is.New(t)
	matcher, err := glob.Compile(`{generator/**.go,bud/internal/generator/*/**.go}`)
	is.NoErr(err)
	is.True(!matcher.Match("bud/internal/generator/generator.go"))
}

I was able to fix this by manually expanding the globs into generator/**.go and bud/internal/generator/*/**.go.

Then you can compile each one individually and create a matcher:

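// Compile expands the alternates in pattern and compiles each resulting
// pattern on its own, so the library never sees a { } construct.
// Matcher and Expand are defined elsewhere in my package (not shown here).
func Compile(pattern string) (Matcher, error) {
	patterns, err := Expand(pattern)
	if err != nil {
		return nil, err
	}
	matchers := make(globs, len(patterns))
	for i, p := range patterns {
		g, err := glob.Compile(p)
		if err != nil {
			return nil, err
		}
		matchers[i] = g
	}
	return matchers, nil
}

// globs matches a path if any of its compiled globs matches.
type globs []glob.Glob

func (gs globs) Match(path string) bool {
	for _, g := range gs {
		if g.Match(path) {
			return true
		}
	}
	return false
}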
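
The Expand helper is not shown above. A minimal sketch of what such a helper might look like follows. It is not the actual Bud implementation: it assumes that only { } alternates (including nested ones) need expanding, and it deliberately ignores backslash escapes and [ ] character classes.

```go
package main

import (
	"fmt"
	"strings"
)

// Expand is a hypothetical sketch of a brace expander. It rewrites a pattern
// containing { } alternates into the list of plain patterns it stands for.
// Assumption: escapes (\{) and character classes ([{]) are not handled.
func Expand(pattern string) ([]string, error) {
	open := strings.IndexByte(pattern, '{')
	if open < 0 {
		if strings.IndexByte(pattern, '}') >= 0 {
			return nil, fmt.Errorf("unbalanced '}' in %q", pattern)
		}
		return []string{pattern}, nil
	}

	// Scan from the first '{' to its matching '}', splitting on the commas
	// that belong to this group (depth 1) and skipping nested groups.
	var alts []string
	depth, start, closeIdx := 0, open+1, -1
	for i := open; i < len(pattern) && closeIdx < 0; i++ {
		switch pattern[i] {
		case '{':
			depth++
		case '}':
			depth--
			if depth == 0 {
				alts = append(alts, pattern[start:i])
				closeIdx = i
			}
		case ',':
			if depth == 1 {
				alts = append(alts, pattern[start:i])
				start = i + 1
			}
		}
	}
	if closeIdx < 0 {
		return nil, fmt.Errorf("unbalanced '{' in %q", pattern)
	}

	// Substitute each alternative and recurse to expand any remaining groups.
	prefix, suffix := pattern[:open], pattern[closeIdx+1:]
	var out []string
	for _, alt := range alts {
		expanded, err := Expand(prefix + alt + suffix)
		if err != nil {
			return nil, err
		}
		out = append(out, expanded...)
	}
	return out, nil
}

func main() {
	patterns, err := Expand(`{generator/**.go,bud/internal/generator/*/**.go}`)
	if err != nil {
		panic(err)
	}
	for _, p := range patterns {
		fmt.Println(p)
	}
}
```

Applied to {*,}x, this sketch yields *x and x, each of which can then be compiled on its own.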