LineMatches sometimes contain newlines
ijt opened this issue · 1 comments
ijt commented
The doc comment for the LineMatch type says
// LineMatch holds the matches within a single line in a file.
but LineMatch currently does not behave as advertised: a match that spans a newline comes back as one LineMatch covering more than one line.
Here is a test that runs a query containing a newline. It expects the resulting FileMatch structure to contain two LineMatches, one per line, but instead it gets back a single LineMatch containing two lines.
func TestQueryNewlines(t *testing.T) {
	b := testIndexBuilder(t, nil,
		Document{Name: "filename", Content: []byte("line1\nline2\nbla")})
	sres := searchForTest(t, b, &query.Substring{Pattern: "ine2\nbla"})
	matches := sres.Files
	want := []FileMatch{{
		FileName: "filename",
		LineMatches: []LineMatch{
			{
				LineFragments: []LineFragmentMatch{{
					Offset:      7,
					LineOffset:  1,
					MatchLength: 4,
				}},
				Line:       []byte("line2"),
				LineStart:  6,
				LineEnd:    11,
				LineNumber: 2,
			},
			{
				LineFragments: []LineFragmentMatch{{
					Offset:      13,
					LineOffset:  0,
					MatchLength: 3,
				}},
				Line:       []byte("bla"),
				LineStart:  13,
				LineEnd:    16,
				LineNumber: 3,
			},
		}}}
	if !reflect.DeepEqual(matches, want) {
		t.Errorf("got %v, want %v", matches, want)
	}
}
Here is the output of the test:
[ ~/src/github.com/ijt/zoekt ] go test ./...
--- FAIL: TestQueryNewlines (0.00s)
index_test.go:214: got [{0 filename [] [{[108 105 110 101 50 10 98 108 97] 6 15 2 false 0 [{1 7 8}]}] [] [] }], want [{0 filename [] [{[108 105 110 101 50] 6 11 2 false 0 [{1 7 4}]} {[98 108 97] 13 16 3 false 0 [{0 13 3}]}] [] [] }]
FAIL
FAIL github.com/google/zoekt 0.067s
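The byte slices in that output are hard to read at a glance. As a small standalone aid (not part of zoekt; the helper name spannedLines is made up for this illustration), the following decodes the Line field from the "got" value and counts how many lines it covers:

```go
package main

import (
	"bytes"
	"fmt"
)

// spannedLines is a hypothetical helper: it reports how many lines a
// LineMatch.Line value covers, i.e. one more than its newline count.
func spannedLines(line []byte) int {
	return bytes.Count(line, []byte("\n")) + 1
}

func main() {
	// Line field of the single LineMatch in the "got" output.
	got := []byte{108, 105, 110, 101, 50, 10, 98, 108, 97}
	// Line field of the first LineMatch in the "want" output.
	want := []byte{108, 105, 110, 101, 50}

	// %q prints the bytes as a quoted string, making the newline visible.
	fmt.Printf("got  %q spans %d line(s)\n", got, spannedLines(got))
	fmt.Printf("want %q spans %d line(s)\n", want, spannedLines(want))
}
```

The "got" slice decodes to "line2\nbla", so the single LineMatch covers two lines, which is the behavior the test complains about.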
Fixing this would make it much easier for Sourcegraph to support multiline searches.
I'm happy to contribute a fix.
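To make the expected behavior concrete, here is a rough sketch of splitting one match range into per-line fragments. It is not the actual fix and does not use zoekt's types: lineFragment and splitMatchByLine are hypothetical names, and the offsets are plain 0-based byte positions into the content (so the "bla" line starts at byte 12 here), which may not match zoekt's own bookkeeping.

```go
package main

import (
	"bytes"
	"fmt"
)

// lineFragment is a hypothetical stand-in for one per-line piece of a match.
type lineFragment struct {
	LineNumber  int // 1-based line number
	LineStart   int // byte offset where the line starts
	LineEnd     int // byte offset just past the line, excluding '\n'
	Offset      int // byte offset of the fragment in the content
	LineOffset  int // offset of the fragment within its line
	MatchLength int // fragment length in bytes
}

// splitMatchByLine cuts the match [start, end) into one fragment per line.
// Newline bytes inside the match belong to no fragment, so a match that
// consists only of a newline yields no fragments.
func splitMatchByLine(content []byte, start, end int) []lineFragment {
	var frags []lineFragment
	lineStart, lineNumber := 0, 1
	for lineStart <= len(content) && lineStart < end {
		// Find the end of the current line.
		lineEnd := len(content)
		if i := bytes.IndexByte(content[lineStart:], '\n'); i >= 0 {
			lineEnd = lineStart + i
		}
		// Intersect the match with the current line.
		fragStart, fragEnd := start, end
		if fragStart < lineStart {
			fragStart = lineStart
		}
		if fragEnd > lineEnd {
			fragEnd = lineEnd
		}
		if fragStart < fragEnd {
			frags = append(frags, lineFragment{
				LineNumber:  lineNumber,
				LineStart:   lineStart,
				LineEnd:     lineEnd,
				Offset:      fragStart,
				LineOffset:  fragStart - lineStart,
				MatchLength: fragEnd - fragStart,
			})
		}
		lineStart = lineEnd + 1
		lineNumber++
	}
	return frags
}

func main() {
	content := []byte("line1\nline2\nbla")
	pattern := []byte("ine2\nbla")
	start := bytes.Index(content, pattern)
	for _, f := range splitMatchByLine(content, start, start+len(pattern)) {
		fmt.Printf("%+v\n", f)
	}
}
```

For the document and pattern from the test, this produces two fragments, one on line 2 and one on line 3, rather than a single fragment that contains the newline.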
ijt commented
I have a fix for this: https://github.com/google/zoekt/compare/master...ijt:newlines-one?expand=1. I know changes are meant to go through Gerrit, so I'll submit it there next.