Simple! I will break this into four steps:
- We use video summarization to extract a short summary that is descriptive of the video.
- We then extract keyframes using histogram analysis.
- We then generate an image caption for each keyframe.
- We then remove the stop words to get the final keywords that describe the video and can be used for indexing purposes!
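The keyframe-extraction step is model-free; a minimal sketch of histogram-based keyframe selection, assuming a simple per-channel histogram and an L1 distance threshold (the bin count and threshold here are illustrative, and the repo's actual implementation may differ):

```python
import numpy as np

def color_histogram(frame, bins=8):
    """Per-channel intensity histogram, normalized to sum to 1."""
    hist = np.concatenate([
        np.histogram(frame[..., c], bins=bins, range=(0, 256))[0]
        for c in range(frame.shape[-1])
    ]).astype(float)
    return hist / hist.sum()

def keyframe_indices(frames, threshold=0.4):
    """Keep frames whose histogram differs enough from the last kept keyframe."""
    keep = [0]
    last = color_histogram(frames[0])
    for i, frame in enumerate(frames[1:], start=1):
        h = color_histogram(frame)
        # L1 distance between normalized histograms lies in [0, 2]
        if np.abs(h - last).sum() > threshold:
            keep.append(i)
            last = h
    return keep
```

Comparing each frame against the last kept keyframe (rather than the immediately preceding frame) avoids selecting many near-duplicate frames during slow transitions.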
- Video summarization using DSNet: Zhu, Wencheng, et al. "DSNet: A flexible detect-to-summarize network for video summarization." IEEE Transactions on Image Processing 30 (2020): 948-962.
- Image captioning using ExpansionNetV2: Hu, Jia Cheng, Roberto Cavicchioli, and Alessandro Capotondi. "ExpansionNet v2: Block static expansion in fast end to end training for image captioning." arXiv preprint arXiv:2208.06551 (2022). (Can also be swapped for ClipCap: Mokady, Ron, Amir Hertz, and Amit H. Bermano. "ClipCap: CLIP prefix for image captioning." arXiv preprint arXiv:2111.09734 (2021).)
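The final stop-word filtering step can be sketched in a few lines; the stop-word set below is a tiny illustrative sample (the project presumably uses a fuller list, e.g. NLTK's), and the punctuation handling is an assumption:

```python
# Illustrative subset; a real list (e.g. NLTK's English stop words) is larger.
STOP_WORDS = {"a", "an", "the", "is", "of", "in", "on", "and", "with", "to"}

def caption_keywords(captions):
    """Lowercase, tokenize on whitespace, drop stop words; keep first-seen order."""
    seen, keywords = set(), []
    for caption in captions:
        for word in caption.lower().split():
            token = word.strip(".,!?")  # strip trailing punctuation
            if token and token not in STOP_WORDS and token not in seen:
                seen.add(token)
                keywords.append(token)
    return keywords
```

For example, `caption_keywords(["A dog runs in the park."])` yields `["dog", "runs", "park"]`.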
- `\nodeserver` contains all the backend code.
- `\nodeserver\pythonscripts\DSNet` contains the DSNet model.
- `\nodeserver\pythonscripts\ExpansionNet` contains the ExpansionNetV2 model.
- `\nodeserver\pythonscripts\imagecaption` contains the ClipCap model.

To switch between ExpansionNetV2 and ClipCap for image captioning, modify this line.
- Download `rf_model.pth` from here and place it in `nodeserver\pythonscripts\ExpansionNet`.
- Download `model_weights.pt` from here and place it in `nodeserver\pythonscripts\imagecaption\model`.
- `npm install` to install the required Node modules
- `pip install -r requirements.txt` to install the required Python modules
- `npm start` to start the Electron app!
The keywords for the video will be written to `nodeserver\pythonscripts\DSNet\outputs\captions.txt`.
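A downstream indexer can then read that file; a small sketch, assuming one keyword per line (the actual file layout may differ):

```python
from pathlib import Path

def load_keywords(path=r"nodeserver\pythonscripts\DSNet\outputs\captions.txt"):
    """Read keywords from the output file, one per line, skipping blank lines."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]
```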
For detailed evaluations, please refer to `comparisons.ipynb`. Evaluations include:
- Runtime in seconds
- BLEU score
- ROUGE-1 precision
- ROUGE-1 recall
- ROUGE-L precision
- ROUGE-L recall
- Search engine recall score
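For reference, ROUGE-1 precision and recall reduce to clipped unigram overlap between a candidate and a reference text; a minimal sketch (the notebook may well use a library implementation instead):

```python
def rouge1(candidate, reference):
    """Return (precision, recall) of unigram overlap, clipped by reference counts."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    ref_counts = {}
    for w in ref:
        ref_counts[w] = ref_counts.get(w, 0) + 1
    overlap = 0
    for w in cand:
        if ref_counts.get(w, 0) > 0:  # clip: each reference token matches once
            ref_counts[w] -= 1
            overlap += 1
    return overlap / len(cand), overlap / len(ref)
```

For example, `rouge1("a dog runs", "a dog sleeps")` gives precision and recall of 2/3 each, since two of the three unigrams overlap.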