Iemontine/AudioVideoDescriptiveAI
Dynamically describes video content with OpenAI's GPT-4o model, embedding temporal and audio context into the visual data using an audio classification neural network (PANNs).
Jupyter Notebook
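
The description mentions embedding temporal and audio context into the visual data. The repository's actual notebook code is not shown here, so the following is only a minimal sketch of one way that step could look: it builds the text annotation (a timestamp plus the currently active audio labels) that might be overlaid on a sampled frame before the frame is sent to a vision-capable model. The `AudioTag` structure, the `annotate_frame` and `format_timestamp` helpers, the `min_score` threshold, and the sample labels are all hypothetical and are not taken from the repository, from PANNs, or from the OpenAI API.

```python
from dataclasses import dataclass


@dataclass
class AudioTag:
    """One audio classification result (hypothetical structure)."""
    label: str
    start: float  # segment start, in seconds
    end: float    # segment end, in seconds
    score: float  # classifier confidence in [0, 1]


def format_timestamp(seconds: float) -> str:
    """Render a time in seconds as MM:SS.s, rounded to tenths of a second."""
    total_tenths = round(seconds * 10)
    minutes, tenths = divmod(total_tenths, 600)
    return f"{minutes:02d}:{tenths / 10:04.1f}"


def annotate_frame(t: float, tags: list[AudioTag], min_score: float = 0.3) -> str:
    """Build the overlay text for the frame sampled at time t (seconds)."""
    # Keep only tags whose time span covers t and that are confident enough.
    active = [
        tag for tag in tags
        if tag.start <= t < tag.end and tag.score >= min_score
    ]
    # List the most confident label first.
    active.sort(key=lambda tag: tag.score, reverse=True)
    labels = ", ".join(tag.label for tag in active) or "none"
    return f"[t={format_timestamp(t)}] audio: {labels}"


# Made-up classifier output, for illustration only.
tags = [
    AudioTag("Speech", 0.0, 4.0, 0.91),
    AudioTag("Music", 2.0, 10.0, 0.55),
    AudioTag("Dog", 3.0, 5.0, 0.12),
]

for t in (1.0, 3.5, 65.0):
    print(annotate_frame(t, tags))
```

With the sample data, the frame at 3.5 seconds is annotated with both "Speech" and "Music", while "Dog" is dropped because its score falls below the assumed threshold. Drawing the string onto the image and sending the frame to a model are left out of the sketch.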