/CLIFS-Contrastive-Language-Image-Forensic-Search

CLIFS (CLIP-based Frame Selection) is a Python function that takes in a video file and a text prompt as input, and uses the CLIP (Contrastive Language-Image Pre-training) model to find the frame in the video that is most similar to the given text prompt.

Primary LanguagePython

Watchers