amplab/succinct

Add regexMatch to SuccinctKVRDD

anuragkh opened this issue · 1 comments

SuccinctKVRDD currently supports regexSearch, which returns an RDD of keys for documents that contain matches for a regular expression. We should add a regexMatch method with the following signature:

def regexMatch(query: String): RDD[(K, RegExMatch)]

where each RegExMatch encapsulates:

  • The offset into the value for the match
  • The length of the match

SuccinctRDD already has a similar method, so this should be a simple translation to SuccinctKVRDD.
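
To make the intended semantics concrete, here is a minimal sketch of what regexMatch might return, written against plain Scala collections rather than Spark or Succinct. Everything in it is an assumption for illustration: the RegExMatch case class is a hypothetical stand-in with just the two fields listed above, values are treated as Strings with character offsets, and matching is delegated to scala.util.matching.Regex instead of Succinct's own regex engine.

```scala
import scala.util.matching.Regex

// Hypothetical stand-in for the proposed RegExMatch:
// the offset into the value where the match starts, and the match length.
final case class RegExMatch(offset: Int, length: Int)

object RegexMatchSketch {
  // Plain-collection analogue of the proposed method (assumption: String values).
  // Emits one (key, RegExMatch) pair per match, so a key appears once for
  // every match found in its value, and not at all if nothing matches.
  def regexMatch[K](records: Seq[(K, String)], query: String): Seq[(K, RegExMatch)] = {
    val pattern = new Regex(query)
    for {
      (key, value) <- records
      m <- pattern.findAllMatchIn(value).toSeq
    } yield (key, RegExMatch(m.start, m.end - m.start))
  }
}
```

For example, calling `RegexMatchSketch.regexMatch(Seq(("doc1", "foo bar foo"), ("doc2", "baz")), "fo+")` yields two pairs for `doc1` (offsets 0 and 8, each of length 3) and none for `doc2`. By contrast, regexSearch as described above would return only the key `doc1`.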

Added in v0.1.6