Scan an io.Reader and get the data that matches a regex. Essentially Find() for a reader.

Language: Go. License: MIT.

Streamregex

Go's regexp package does not let you get the matched data of a regex from a stream. For a reader, it offers FindReaderIndex, which reports the position of the first match, but not the data itself.
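To make the limitation concrete, here is a minimal sketch using only the standard library. The helper name firstMatchIndex is made up for this illustration; the only regexp call involved is FindReaderIndex, which takes an io.RuneReader and returns byte offsets rather than the matched text.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// firstMatchIndex is a hypothetical helper for this example. It returns the
// [start, end) byte offsets of the first match in the stream, or nil if
// there is no match.
func firstMatchIndex(pattern, input string) []int {
	re := regexp.MustCompile(pattern)
	// strings.Reader implements io.RuneReader, which FindReaderIndex requires.
	return re.FindReaderIndex(strings.NewReader(input))
}

func main() {
	loc := firstMatchIndex(`stream\s+of`, "0123456789this is a stream    of data")
	// Only the offsets come back. The reader has already been consumed,
	// so the matched bytes cannot be recovered from it afterwards.
	fmt.Println(loc) // [20 32]
}
```

Getting the matched text this way would mean buffering the consumed bytes yourself, which is the gap Streamregex is meant to fill.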

Streamregex gives you a channel of the data matched by a regex on an io.Reader stream.

Usage

// Create a reader over some sample data
data := `0123456789this is a stream    of data with lots of trailing information`
stream := strings.NewReader(data)

// Build regex
regex := regexp.MustCompile(`stream\s+of`)

// Find matches (100 is the maxMatchLength)
matchedData := FindReader(context.Background(), regex, 100, stream)
for match := range matchedData {
    fmt.Println(match)
}

// Output: stream    of

How it works

We use a custom bufio.SplitFunc to split the reader into regex matches. Normally a SplitFunc keeps asking for more data, growing the buffer, until it finds a match. To avoid pulling the entire reader into memory, the function accepts a maxMatchLength: an upper bound on the length of any single match, which you need to know in advance.

Note that we need to allocate maxMatchLength*2 bytes of memory to successfully scan the reader for matches.
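The idea can be sketched roughly as follows. This is not the library's actual code, just one way a bounded regex SplitFunc could work. The names regexSplit, findAll, and findAllString are hypothetical, and the sketch assumes that maxMatchLength is positive, that the regex never matches the empty string, and that no match is longer than maxMatchLength.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"regexp"
	"strings"
)

// regexSplit (hypothetical) returns a SplitFunc that yields one token per match.
func regexSplit(re *regexp.Regexp, maxMatchLength int) bufio.SplitFunc {
	return func(data []byte, atEOF bool) (int, []byte, error) {
		loc := re.FindIndex(data)
		// A match is final once maxMatchLength bytes from its start are
		// buffered (or the stream has ended): nothing read later can change it.
		if loc != nil && (atEOF || loc[0]+maxMatchLength <= len(data)) {
			return loc[1], data[loc[0]:loc[1]], nil
		}
		if atEOF {
			return len(data), nil, nil // no more matches, discard the rest
		}
		// Keep only the last maxMatchLength bytes, since a match may straddle
		// the end of the buffered data, and ask for more input.
		if drop := len(data) - maxMatchLength; drop > 0 {
			return drop, nil, nil
		}
		return 0, nil, nil
	}
}

// findAll (hypothetical) collects every match found in r.
func findAll(r io.Reader, re *regexp.Regexp, maxMatchLength int) ([]string, error) {
	sc := bufio.NewScanner(r)
	// Fixed buffer: the retained tail plus room for another maxMatchLength bytes.
	sc.Buffer(make([]byte, 0, 2*maxMatchLength), 2*maxMatchLength)
	sc.Split(regexSplit(re, maxMatchLength))
	var matches []string
	for sc.Scan() {
		matches = append(matches, sc.Text())
	}
	return matches, sc.Err()
}

// findAllString (hypothetical) is a convenience wrapper for string input.
func findAllString(pattern, input string, maxMatchLength int) ([]string, error) {
	return findAll(strings.NewReader(input), regexp.MustCompile(pattern), maxMatchLength)
}

func main() {
	data := `0123456789this is a stream    of data with lots of trailing information`
	matches, err := findAllString(`stream\s+of`, data, 16)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", matches) // ["stream    of"]
}
```

With a buffer of maxMatchLength*2 bytes, a full buffer always either contains a final match or has maxMatchLength bytes that can safely be dropped, so the scanner never needs to grow it.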