vidcaption_aavan: A Python repository from AISpeechlab @ NTU

Video Captioning

Program to generate captions for the keyframes of a video, given a video file as input.

Installation

Usage steps for Ubuntu 16.04:

Downloads

Download/clone this repository
Download the file "model_checkpint.pth.tar" from https://drive.google.com/file/d/1OMnMuMuxEtKVmws2zNAlB3nhCVnwUTS4/view?usp=sharing
Place the file "model_checkpint.pth.tar" in the repo directory
model checkpoint source: https://github.com/sgrvinod/a-PyTorch-Tutorial-to-Image-Captioning

Ubuntu python setup

sudo apt-get update
sudo apt-get install -y build-essential tk-dev libncurses5-dev libncursesw5-dev libreadline6-dev libdb5.3-dev libgdbm-dev libsqlite3-dev libssl-dev libbz2-dev libexpat1-dev liblzma-dev zlib1g-dev libffi-dev tar wget vim
cd /opt
sudo wget https://www.python.org/ftp/python/3.8.5/Python-3.8.5.tgz
sudo tar xzf Python-3.8.5.tgz
cd Python-3.8.5
sudo ./configure --enable-optimizations
sudo make -j 4
sudo make altinstall
cd /opt
sudo rm -f Python-3.8.5.tgz

Ubuntu setup

Change directory to vidcaption_aavan
python3.8 -m pip install virtualenv
virtualenv -p python3.8 venv-vidcap
source venv-vidcap/bin/activate

Installing requirements

pip install -r requirements.txt

pip install pandas
pip install torch
pip install opencv-python
pip install click
pip install torchvision
pip install matplotlib
pip install scikit-image
pip install PyTube (for video download)
pip install Flask (for API)
pip install pycocotools (for evaluation)
pip install pycocoevalcap (for evaluation)
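After installing, a quick check that every package can be found in the active virtualenv may save a failed run later. The sketch below is not part of the repository; the import names (for example `cv2` for opencv-python and `skimage` for scikit-image) are the usual ones for these packages but are worth confirming locally.

```python
import importlib.util

# Import names (not pip names) for the packages listed above.
# Assumption: these are the names the repository's scripts import.
REQUIRED_MODULES = [
    "pandas", "torch", "cv2", "click", "torchvision",
    "matplotlib", "skimage", "pytube", "flask",
    "pycocotools", "pycocoevalcap",
]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the modules that cannot be found in the active environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("Missing:", ", ".join(missing) if missing else "none")
```

Run it with the virtualenv activated; any name it prints still needs a `pip install`.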

Video Download

Videos can be downloaded using the PyTube-based video downloading script. Videos downloaded this way are automatically saved as mp4 to the "video_uploads" folder.

Run from terminal using $ python download_video.py <youtube_URL> <name_to_save_as_without_extension>
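Example: python download_video.py "https://www.youtube.com/watch?v=DocxmW2bOdc&t=80s" singapore_dorm_cases
(Quote the URL: an unquoted & makes the shell run the command in the background and drop the rest of the line.)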
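YouTube URLs often contain `&` and `?`, which the shell treats specially. When the command line is generated from a script, `shlex.quote` from the standard library handles the quoting. The helper name `build_download_command` is hypothetical; only the `download_video.py` arguments come from the usage line above.

```python
import shlex


def build_download_command(url, save_name):
    """Return a shell-safe command line for download_video.py."""
    # shlex.quote wraps an argument in single quotes when it contains
    # characters such as '&' or '?' that the shell would interpret.
    parts = ["python", "download_video.py", shlex.quote(url), shlex.quote(save_name)]
    return " ".join(parts)


if __name__ == "__main__":
    print(build_download_command(
        "https://www.youtube.com/watch?v=DocxmW2bOdc&t=80s",
        "singapore_dorm_cases",
    ))
```

The printed line can be pasted into a terminal as-is.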

Video Captioning

Video keyframe extraction supports most video formats including: mp4, ts, MOV, avi, y4m, mkv, flv, wmv.
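To filter files before captioning, a simple extension check against the formats named above can help. This is only a sketch: the set below mirrors the list in this README, and the actual script may accept other formats as well.

```python
from pathlib import Path

# Extensions named in this README; the real script may accept more.
LISTED_EXTENSIONS = {".mp4", ".ts", ".mov", ".avi", ".y4m", ".mkv", ".flv", ".wmv"}


def has_listed_extension(filename):
    """Return True if the file extension is one of the listed formats."""
    # Compare case-insensitively so that "clip.MOV" and "clip.mov" both match.
    return Path(filename).suffix.lower() in LISTED_EXTENSIONS
```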

Running without API:

Activate venv: $ source venv-vidcap/bin/activate
Put the videos to caption in the "video_uploads" folder
Run from terminal using $ python caption_video.py <videofile_name>
Example: python caption_video.py elsa.mp4
To keep the video frames, run from terminal using $ python caption_video.py <videofile_name> keepframes
Example: python caption_video.py elsa.mp4 keepframes
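To caption several videos in one go, the command above can be repeated for each file in "video_uploads". The sketch below only relies on the documented command-line form (`caption_video.py <videofile_name> [keepframes]`); the function names are hypothetical, and `run_all` is meant to be called from the repository directory.

```python
import subprocess
import sys
from pathlib import Path


def caption_commands(upload_dir="video_uploads", keep_frames=False):
    """Build one caption_video.py command per file in upload_dir."""
    commands = []
    for path in sorted(Path(upload_dir).iterdir()):
        if not path.is_file():
            continue  # skip sub-directories such as extracted frame folders
        cmd = [sys.executable, "caption_video.py", path.name]
        if keep_frames:
            cmd.append("keepframes")
        commands.append(cmd)
    return commands


def run_all(upload_dir="video_uploads", keep_frames=False):
    """Run the commands one after another, stopping at the first failure."""
    for cmd in caption_commands(upload_dir, keep_frames):
        subprocess.run(cmd, check=True)
```

Passing a list to `subprocess.run` avoids shell quoting issues with unusual file names.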

Running with API:

Activate venv: $ source venv-vidcap/bin/activate
Run from terminal using $ python api_start.py
Go to http://127.0.0.1:5001/captionvideo in a browser
Browse disk for video file
Click upload
Uploaded videos will be saved to the video_uploads directory

Evaluation

If you are not evaluating, the eval directory can be removed.

To caption and evaluate against a custom dataset in COCO format:

pip install pycocoevalcap
Move the model checkpoint file and wordmap into the "eval" directory (default: model_checkpint.pth.tar and wordmap.json)
Put the folder with the images to caption into the "eval" directory
Change directory to "eval"
Ensure the caption file in COCO format (for the images to be captioned) is in the "eval" directory, e.g. "DATASET_coco_captions.json"
Run from terminal using $ python caption_and_eval.py <directory_name>
Example: python caption_and_eval.py DATASET
To keep output file of captioning, run from terminal using $ python caption_and_eval.py <directory_name> keepoutput
Example: python caption_and_eval.py DATASET keepoutput
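For reference, a COCO-style caption file is a JSON object with an `images` list and an `annotations` list, where each annotation links a caption to an image by `image_id`. The sketch below builds such a structure from a plain dictionary. It is an illustration under stated assumptions: `build_coco_captions` is a hypothetical helper, and whether caption_and_eval.py expects additional keys (such as `info` or `licenses`) is not confirmed here.

```python
import json


def build_coco_captions(captions_by_file):
    """Map {file_name: [caption, ...]} to a COCO-captions-style dict."""
    images, annotations = [], []
    for image_id, (file_name, captions) in enumerate(sorted(captions_by_file.items())):
        images.append({"id": image_id, "file_name": file_name})
        for caption in captions:
            annotations.append({
                "id": len(annotations),   # unique annotation id
                "image_id": image_id,     # links the caption to its image
                "caption": caption,
            })
    return {"images": images, "annotations": annotations}


if __name__ == "__main__":
    data = build_coco_captions({"img1.jpg": ["a dog on a beach"]})
    # File name follows the <directory_name>_coco_captions.json pattern above.
    with open("DATASET_coco_captions.json", "w") as f:
        json.dump(data, f, indent=2)
```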

To evaluate against video captions in COCO format, using the video captioning output (the output of caption_video.py):

Move the output file, e.g. covid.mp4-OUTPUT.json, to the "eval" directory
Change directory to "eval"
Ensure the caption file in COCO format is in the "eval" directory, e.g. "covid.mp4-coco_captions.json"
Run from terminal using $ python eval_video_output.py <videofile_name>
Example: python eval_video_output.py covid.mp4
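Before running the evaluation, it can be useful to confirm that both files are in place. The sketch below assumes the naming pattern shown in the examples above (`<videofile_name>-OUTPUT.json` and `<videofile_name>-coco_captions.json`); the helper name is hypothetical.

```python
from pathlib import Path


def missing_eval_files(video_name, eval_dir="eval"):
    """Return the expected evaluation files that are not in eval_dir."""
    expected = [
        f"{video_name}-OUTPUT.json",         # output of caption_video.py
        f"{video_name}-coco_captions.json",  # reference captions in COCO format
    ]
    return [name for name in expected if not (Path(eval_dir) / name).is_file()]
```

For example, `missing_eval_files("covid.mp4")` returns an empty list when both files are present in "eval".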
