This repository contains the source code for video source identification using an ensemble of CNNs. The VISION dataset is required to run the experiment without any changes to the code.
- Download the videos from VISION, keeping the original folder structure.
The dataset folder contains the code to extract frames and prepare the train and test datasets.
FFMPEG is used to extract the frames. Before running, install FFMPEG and set the "ffmpeg_path" variable inside the "dataset/frames_extraction/iframe.py" file to the absolute path of the ffmpeg executable.
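As a rough illustration of what I-frame extraction with FFMPEG can look like, here is a minimal sketch. It is not the code in iframe.py: the function names, the output file pattern, and the default ffmpeg location are assumptions made for this example. It relies on ffmpeg's `select` filter with `pict_type`, plus `-vsync vfr` so that dropped frames are not duplicated (recent ffmpeg releases offer `-fps_mode vfr` as the replacement for that option).

```python
import subprocess
from pathlib import Path

# Assumption: plays the role of the "ffmpeg_path" variable in iframe.py.
ffmpeg_path = "/usr/bin/ffmpeg"


def build_iframe_command(video_path, output_dir, ffmpeg=ffmpeg_path):
    """Build an ffmpeg command that writes only the I-frames of one video."""
    # Hypothetical naming scheme: <video name>_00001.png, <video name>_00002.png, ...
    output_pattern = str(Path(output_dir) / (Path(video_path).stem + "_%05d.png"))
    return [
        ffmpeg,
        "-i", str(video_path),
        # Keep only frames whose picture type is I (intra-coded).
        "-vf", "select='eq(pict_type,I)'",
        # Do not duplicate frames to fill the gaps left by the dropped ones.
        "-vsync", "vfr",
        output_pattern,
    ]


def extract_iframes(video_path, output_dir):
    """Run ffmpeg for one video; raises CalledProcessError if ffmpeg fails."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_iframe_command(video_path, output_dir), check=True)
```

Building the command as a list and passing it to `subprocess.run` without a shell avoids quoting problems with paths that contain spaces.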
- To extract the I-frames from each video and save them locally, run dataset/frames_extraction/extract_video_frames.py
python3 dataset/frames_extraction/extract_video_frames.py --vision_dataset_path="absolute/path/to/VISION/dataset" --frames_output_path="absolute/path/to/save/extracted/i-frames"
- Run dataset/create_train_test_dataset.py to split extracted frames into test and training datasets
python3 dataset/create_train_test_dataset.py --frames_input_dataset="absolute/path/to/directory/with/extracted/frames" --test_train_dataset_folder="absolute/path/to/save/train/test/dataset"
- Run dataset/patch_extractor.py to extract patches (quadrants) from the frames in the train and test datasets. It must be run twice: once for the train dataset and once for the test dataset.
python3 dataset/patch_extractor.py --test_train_dataset_folder="absolute/path/to/directory/with/train/test/dataset" --quadrants_output_folder="absolute/path/to/save/extracted/patches" --test_train="test"
python3 dataset/patch_extractor.py --test_train_dataset_folder="absolute/path/to/directory/with/train/test/dataset" --quadrants_output_folder="absolute/path/to/save/extracted/patches" --test_train="train"
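To make the idea of per-quadrant patches concrete, the sketch below computes four crop boxes for a frame of a given size. This is an assumption about the geometry, not a description of patch_extractor.py: the script may use fixed-size patches or a different quadrant order. The boxes use the (left, upper, right, lower) convention accepted by Pillow's `Image.crop`.

```python
def quadrant_boxes(width, height):
    """Return crop boxes (left, upper, right, lower) for four equal quadrants.

    Assumed order: quadrant_1 top-left, quadrant_2 top-right,
    quadrant_3 bottom-left, quadrant_4 bottom-right.
    For odd dimensions the last row or column is dropped so that
    all four quadrants have the same size.
    """
    half_w, half_h = width // 2, height // 2
    return {
        "quadrant_1": (0, 0, half_w, half_h),
        "quadrant_2": (half_w, 0, 2 * half_w, half_h),
        "quadrant_3": (0, half_h, half_w, 2 * half_h),
        "quadrant_4": (half_w, half_h, 2 * half_w, 2 * half_h),
    }
```

With Pillow installed, a frame could then be cropped with `image.crop(quadrant_boxes(*image.size)["quadrant_1"])`.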
The CNN_base_learners folder contains the code to train and save the CNN models for the four quadrants.
- Run CNN_base_learners/train_CNN_branch.py to create, train, and save the models. It needs to be run 4 times, once for each quadrant.
Possible values for "--quadrant" parameter: quadrant_1, quadrant_2, quadrant_3, quadrant_4
Recommendation: set the model path to a directory that is easy to find. The saved model files are required for the next section of steps.
python3 CNN_base_learners/train_CNN_branch.py --ds_path="absolute/path/to/directory/with/folder/containing/all/quadrants/data" --model_path="absolute/path/to/save/trained/models" --tensor_flow_path="absolute/path/to/save/tensors" --quadrant="quadrant_number"
- File CNN_base_learners/prepare_dataset_for_network.py contains the code to get the filenames for the train and test datasets
- File CNN_base_learners/CNN_branch_data_generator.py contains the code to generate input for the CNN during training, using the datasets generated by CNN_base_learners/prepare_dataset_for_network.py
- File CNN_base_learners/cnn_network.py contains the CNN architecture and training code
The code saves the best model during training inside the folder --model_path/quadrant_X, where X is the quadrant number: 1, 2, 3, or 4.
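Since training has to be repeated for each quadrant, the four runs can be scripted. The sketch below builds one command per quadrant from the flags documented above and looks up the most recently written ".h5" file in each --model_path/quadrant_X folder. The helper names are hypothetical, and treating the newest file by modification time as the best model is an assumption; check the folder contents if in doubt.

```python
from pathlib import Path

QUADRANTS = ["quadrant_1", "quadrant_2", "quadrant_3", "quadrant_4"]


def build_training_commands(ds_path, model_path, tensor_flow_path):
    """Return one train_CNN_branch.py command (as an argument list) per quadrant."""
    return [
        [
            "python3", "CNN_base_learners/train_CNN_branch.py",
            f"--ds_path={ds_path}",
            f"--model_path={model_path}",
            f"--tensor_flow_path={tensor_flow_path}",
            f"--quadrant={quadrant}",
        ]
        for quadrant in QUADRANTS
    ]


def latest_model_file(model_path, quadrant):
    """Return the most recently modified .h5 file in model_path/quadrant, or None."""
    candidates = sorted(
        (Path(model_path) / quadrant).glob("*.h5"),
        key=lambda path: path.stat().st_mtime,
    )
    return candidates[-1] if candidates else None
```

Each command can be passed to `subprocess.run(command, check=True)` from the repository root, so that a failed run stops the loop instead of going unnoticed.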
The ensemble_CNN folder contains the code to run the CNN ensemble. It also saves a csv file containing the predictions to allow performance evaluation.
- Run ensemble_CNN/run_ensemble_network.py to create the ensemble and get the final predictions.
For the "--path_to_directory=" parameter, use the absolute path of this GitHub repository on your computer. It should end with "cnn_ensemble_vsi".
For each model path parameter, the argument is the absolute path to the ".h5" file containing the best model (the last saved model).
python3 ensemble_CNN/run_ensemble_network.py --quadrant_1_model_path="absolute/path/to/model_file/for/quadrant/1" --quadrant_2_model_path="absolute/path/to/model_file/for/quadrant/2" --quadrant_3_model_path="absolute/path/to/model_file/for/quadrant/3" --quadrant_4_model_path="absolute/path/to/model_file/for/quadrant/4" --ds_path_quadrant_1="absolute/path/to/folder/containing/quadrant/1/patches" --ds_path_quadrant_2="absolute/path/to/folder/containing/quadrant/2/patches" --ds_path_quadrant_3="absolute/path/to/folder/containing/quadrant/3/patches" --ds_path_quadrant_4="absolute/path/to/folder/containing/quadrant/4/patches" --path_to_directory="absolute/path/to/directory/containing/this/github/repository"
- The final predictions for further evaluation will be saved in the directory "--path_to_directory/statistics" under the file name "ensemble_predictions.csv"
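This README does not state how the four quadrant models are combined. One common option is soft voting, where the per-class probabilities of the base learners are averaged and the class with the highest mean wins. The sketch below illustrates that idea only; it is an assumption and may differ from what run_ensemble_network.py does.

```python
def ensemble_predict(quadrant_probabilities):
    """Average per-class probabilities from several base learners (soft voting).

    quadrant_probabilities: list of probability vectors, one per quadrant
    model, all of the same length (one entry per class).
    Returns (predicted_class_index, averaged_probabilities).
    """
    n_models = len(quadrant_probabilities)
    n_classes = len(quadrant_probabilities[0])
    averaged = [
        sum(probs[c] for probs in quadrant_probabilities) / n_models
        for c in range(n_classes)
    ]
    # Pick the class with the highest averaged probability.
    predicted = max(range(n_classes), key=averaged.__getitem__)
    return predicted, averaged
```

For example, if three of four quadrant models favour class 1 and one favours class 0, the averaged vector still peaks at class 1 unless the dissenting model is far more confident than the others.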