Can I get scan results in a non-callback manner?
SergeyPiskunov opened this issue · 3 comments
SergeyPiskunov commented
Hi! I'm curious, is there any way to get scan results either synchronously, without passing a callback, or as an "awaitable" object on an asyncio event loop?
darvid commented
Bear in mind that when db.scan returns in Python, all the callbacks for any potential matches are guaranteed to have been invoked. So you can make calls to persist the results somewhere (global state, database, future, whatever) and just pull them immediately after scanning.
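Since every callback has already run by the time the scan call returns, a thin wrapper can hide the callback and hand back a plain list. Here is a minimal sketch of that idea. `scan_to_list` and `fake_scan` are hypothetical names, not part of the library; `fake_scan` only stands in for a bound `db.scan` so the snippet runs without hyperscan installed, and it assumes the handler is called with (id, start, end, flags, context).

```python
import typing

Match = typing.Tuple[int, int, int, int]  # (id, start, end, flags)


def scan_to_list(
    scan: typing.Callable[..., typing.Any], data: bytes
) -> typing.List[Match]:
    """Run a callback-style scan and return the collected matches."""
    matches: typing.List[Match] = []

    def on_match(id, start, end, flags, context=None):
        # Closure over `matches`; implicitly returns None (falsy).
        matches.append((id, start, end, flags))

    # Assumes `scan` invokes on_match synchronously before returning.
    scan(data, match_event_handler=on_match)
    return matches


def fake_scan(data, match_event_handler, context=None):
    """Hypothetical stand-in for db.scan: reports each b'o' under id 0."""
    for i, byte in enumerate(data):
        if byte == ord('o'):
            match_event_handler(0, i, i + 1, 0, context)


print(scan_to_list(fake_scan, b'foobar'))
```

With a real database, the call would presumably look like `scan_to_list(db.scan, b'foobar')`.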
With that said, I think using context to store results is useful in this scenario, assuming you don't have a huge number of patterns or expected matches. For example:
import collections
import typing

import hyperscan

HsPattern = collections.namedtuple('HsPattern', ['pattern', 'id', 'flags'])
HsResult = collections.namedtuple('HsResult', ['id', 'start', 'end', 'flags'])

PATTERNS = (
    HsPattern(br'fo+', 0, 0),
    HsPattern(br'^foobar$', 1, hyperscan.HS_FLAG_CASELESS),
    HsPattern(
        br'BAR',
        2,
        hyperscan.HS_FLAG_CASELESS | hyperscan.HS_FLAG_SOM_LEFTMOST,
    ),
)


def on_match(
    id: int,
    start: int,
    end: int,
    flags: int,
    context: typing.Optional[typing.Any] = None,
) -> typing.Optional[bool]:
    # Store each match in the context object passed to db.scan.
    context['results'].append(HsResult(id, start, end, flags))
    # A falsy return value lets scanning continue.
    return 0


def create_database(
    patterns: typing.Tuple[HsPattern, ...],
) -> hyperscan.Database:
    db = hyperscan.Database()
    # Split the pattern tuples into parallel sequences for compile().
    expressions, ids, flags = zip(*patterns)
    db.compile(
        expressions=expressions, ids=ids, elements=len(patterns), flags=flags
    )
    return db


def main() -> None:
    db = create_database(PATTERNS)
    context = {'results': []}
    db.scan(b'foobar', match_event_handler=on_match, context=context)
    # All callbacks have run by now, so the results are complete.
    for result in context['results']:
        print(result)


if __name__ == '__main__':
    main()
$ python context_results.py
HsResult(id=0, start=0, end=2, flags=0)
HsResult(id=0, start=0, end=3, flags=0)
HsResult(id=2, start=3, end=6, flags=0)
HsResult(id=1, start=0, end=6, flags=0)
SergeyPiskunov commented
Yeah.. Seems that passing "context" will be sufficient for me. Thanks a lot!