antiboredom/videogrep

Unexpected behaviour with "fragment" option?

smithee77 opened this issue · 4 comments

Hi, first of all, many thanks for this amazing tool. I'm working with --transcribe, using VOSK. Once transcription is done, if I run, for example:
videogrep --input shell.mp4 --search-type fragment --search 'and'

I get video chunks with "and", but also ones with "sand", "wand", "random"... every word CONTAINING 'and'.
Is this the expected behaviour? I just want to get the exact "AND" chunks.

Thanks again

Sorry, I hadn't read https://lav.io/notes/videogrep-tutorial/
To match an exact word, use the pattern '^word_to_find$'

Hi - glad you like it!

Yes that actually is the expected behavior (and I should probably make this more clear in the documentation). Searching uses Python regular expressions, so you'll get back anything that contains the characters you've entered. If you want to get an exact word match, you can try something like this:

videogrep --input shell.mp4 --search-type fragment --search '^and$'

The ^ character means "beginning of string" and $ means "end of string", so wrapping a search term between ^ and $ matches only that exact word.
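To see the difference outside of videogrep, here is a minimal sketch using Python's re module directly. The word list and the matching_words helper are made up for illustration and are not part of videogrep; the sketch also assumes each transcribed word is tested as its own string, which is what makes ^ and $ line up with word boundaries.

```python
import re

# Hypothetical transcribed words, standing in for a real transcript.
words = ["and", "sand", "wand", "random", "shell"]

def matching_words(pattern, words):
    """Return the words in which re.search finds the pattern (illustrative helper)."""
    return [w for w in words if re.search(pattern, w)]

# Unanchored: matches any word that contains the characters "and".
substring_hits = matching_words(r"and", words)

# Anchored: ^ and $ pin the match to the start and end of the word.
exact_hits = matching_words(r"^and$", words)

print(substring_hits)  # ['and', 'sand', 'wand', 'random']
print(exact_hits)      # ['and']
```

The unanchored pattern behaves like a "contains" test, while the anchored one only accepts a word that is exactly "and".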

Let me know if you have any more questions!

@smithee77 glad you figured it out! And if you have suggestions for making this more clear to future confused users please let me know :)

Many thanks for your reply!! Found the answer like one second before your answer! :))
Yeah, just as a suggestion, maybe the default behaviour should be an EXACT match? That's probably what most users want from the tool...
Amazing work you do