/AudioVideoDescriptiveAI

Dynamically describes video content with OpenAI's GPT-4o model, embedding temporal and audio context into the visual data by means of an audio classification neural network (PANNs).
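
As a rough illustration of what "embedding temporal and audio context" could look like, the sketch below pairs a frame timestamp with the audio tags active at that moment and builds the text that might accompany the frame in a request to a vision-language model. It is not taken from the repository: the `AudioEvent` structure, the function names, the `min_score` threshold, and the prompt wording are all hypothetical. The audio tags are hard-coded stand-ins for classifier output, and no PANNs or OpenAI call is made.

```python
from dataclasses import dataclass


@dataclass
class AudioEvent:
    """One audio tag with the time span it covers (hypothetical structure)."""
    label: str
    start: float  # seconds, inclusive
    end: float    # seconds, exclusive
    score: float  # classifier confidence in [0, 1]


def audio_context_for_frame(t, events, min_score=0.3):
    """Return labels of events active at time t, highest score first."""
    active = [e for e in events if e.start <= t < e.end and e.score >= min_score]
    active.sort(key=lambda e: e.score, reverse=True)
    return [e.label for e in active]


def build_frame_prompt(t, events):
    """Compose the text that would accompany one video frame (assumed wording)."""
    labels = audio_context_for_frame(t, events)
    audio = ", ".join(labels) if labels else "no confident audio tags"
    return f"Frame at {t:.1f}s. Audio detected: {audio}. Describe the scene."


# Stand-in values for what an audio tagger might report; not real output.
events = [
    AudioEvent("Speech", 0.0, 4.0, 0.92),
    AudioEvent("Music", 2.0, 10.0, 0.55),
    AudioEvent("Dog", 3.0, 5.0, 0.10),
]
print(build_frame_prompt(3.5, events))
```

In a full pipeline, the returned string would presumably be sent together with the frame image, so that the description reflects both what is seen and what is heard at that timestamp.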

Primary Language: Jupyter Notebook

This repository is not active