Search inside YouTube videos using natural language

Primary language: Jupyter Notebook. License: MIT.

Natural Language YouTube Search

Use OpenAI's CLIP neural network to search inside YouTube videos. You can try it by running the notebook on Google Colab.

How it works

  1. Download the YouTube video
  2. Extract every N-th frame
  3. Encode all frames using CLIP
  4. Encode a natural language search query using CLIP
  5. Find the images that best match the search query

For more details see the notebook.
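
Steps 2 and 5 above can be illustrated with a minimal sketch. It is not the notebook's code: the function names `sample_frame_indices` and `search` are hypothetical, and the short hand-written vectors stand in for real CLIP embeddings, which are much longer and come from the model. The sketch assumes that "best match" means highest cosine similarity between the query vector and each frame vector, a common choice for CLIP features.

```python
import math


def sample_frame_indices(total_frames, n):
    """Step 2 (sketch): indices of every n-th frame, starting at frame 0."""
    return list(range(0, total_frames, n))


def normalize(vec):
    """Scale a vector to unit length (assumes the vector is not all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def search(frame_features, query_features, top_k=3):
    """Step 5 (sketch): rank frames by cosine similarity to the query.

    Returns the positions of the top_k best-matching frames, best first.
    """
    query = normalize(query_features)
    scores = []
    for features in frame_features:
        frame = normalize(features)
        # Dot product of two unit vectors is their cosine similarity.
        scores.append(sum(a * b for a, b in zip(frame, query)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:top_k]


# Toy stand-ins for CLIP embeddings (assumption: real ones come from the model).
frame_features = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 0.0, 1.0],
]
query_features = [1.0, 0.0, 0.0]

best = search(frame_features, query_features, top_k=2)
```

Because the sampled frames keep their order, a position returned by `search` can be mapped back to a frame number in the video through the list returned by `sample_frame_indices`.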

Examples

Here are some example searches from this YouTube video of a car driving around San Francisco.

"A fire truck"

Search results for "A fire truck"

"Road works"

Search results for "Road works"

"People crossing the street"

Search results for "People crossing the street"

"The Embarcadero"

Search results for "The Embarcadero"

"Waiting at the red light"

Search results for "Waiting at the red light"

"Green bike lane"

Search results for "Green bike lane"

"A street with tram tracks"

Search results for "A street with tram tracks"

"The Transamerica Pyramid"

Search results for "The Transamerica Pyramid"

Natural language search on Unsplash

You can also try my other project, which searches 2M photos on Unsplash using natural language queries: