(Enhancement?) Include pre/post silence in word detection

Question

Opened this issue 2 years ago · 2 comments

Hi, it would be cool if the tool could somehow include the silence before a word when that word is extracted (in the "segment" option). Maybe this could be implemented as an extra option.
Of course this is not a bug, just a suggestion.

As usual, great repo!

Answer 1 · 2022-11-28T22:28:24.000Z

Great idea! I'll definitely consider adding it (although I'm a bit behind on new features at the moment). There might also be a nice way to write a small extra Python script that does this...

Answer 2 · 2023-12-23T00:58:34.000Z

I also had this issue and came up with a modified version of @antiboredom's examples.

This might help you get what you want.
word2["word"] is the word you're looking for ("retirement" in the script below).

import sys
from videogrep import parse_transcript, create_supercut

# the min and max duration of silences to extract
min_duration = 0.2
max_duration = 5.0

# value to trim off the end of each clip
adjuster = 0.05

filenames = sys.argv[1:]

words_with_silences = []
for filename in filenames:
    timestamps = parse_transcript(filename)

    # this uses the words, if available
    words = []
    for sentence in timestamps:
        words += sentence['words']

    # the original example used zip(words[:-2], words[1:]), which I think skips the last entry
    # walk over each word together with its previous and next neighbour
    for word1, word2, word3 in zip(words, words[1:], words[2:]):
        if word2['word'] != "retirement":
            continue
        first_start = word1['end']
        first_end = word2['start']  # - adjuster
        first_silence = first_end - first_start

        second_start = word2['end']
        second_end = word3['start'] - adjuster
        second_silence = second_end - second_start

        if (min_duration < first_silence < max_duration) and (min_duration < second_silence < max_duration):
            print(f'The word {word2["word"]} was surrounded by {first_silence} and {second_silence} of silence.')
            words_with_silences.append({'start': first_start, 'end': second_end, 'file': filename})

create_supercut(words_with_silences, 'words_with_silences.mp4')

This yields for example:

The word retirement was surrounded by 2.009999999999309 and 1.8100000000004002 of silence.

and the cut was excellent 😄

@antiboredom how would I create the split files, as is done with the export_clips flag?
edit: nvm -> videogrep.export_individual_clips(words_with_silences, 'words_with_silences.mp4')
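
For anyone who wants to check the silence-padding logic without a video or transcript at hand, here is a minimal sketch of the same three-word window as a standalone function. The name find_padded_words and its parameters are hypothetical and not part of videogrep; it only assumes that each word is a dict with 'word', 'start' and 'end' keys, as in the script above.

```python
def find_padded_words(words, target, min_duration=0.2, max_duration=5.0, adjuster=0.05):
    """Return clips that cover `target` plus the silence on either side of it.

    Hypothetical helper, not part of videogrep. `words` is a list of dicts
    with 'word', 'start' and 'end' keys (times in seconds).
    """
    clips = []
    # look at every word together with its previous and next neighbour
    for prev, word, nxt in zip(words, words[1:], words[2:]):
        if word["word"] != target:
            continue
        clip_start = prev["end"]              # silence begins where the previous word ends
        clip_end = nxt["start"] - adjuster    # trim slightly before the next word starts
        pre_silence = word["start"] - clip_start
        post_silence = clip_end - word["end"]
        if (min_duration < pre_silence < max_duration
                and min_duration < post_silence < max_duration):
            clips.append({"start": clip_start, "end": clip_end})
    return clips


# made-up timestamps for illustration only
words = [
    {"word": "my", "start": 0.0, "end": 0.5},
    {"word": "retirement", "start": 1.5, "end": 2.0},
    {"word": "plan", "start": 3.0, "end": 3.5},
]
clips = find_padded_words(words, "retirement")
print(clips)
```

Because the window needs a neighbour on both sides, the first and last word of a transcript are never matched. To feed the result to create_supercut, each clip would also need a 'file' key, as in the script above.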