Cusco Building Image Dataset (CuscoBID)

This project aims to recognize historic buildings from images taken in the city of Cusco, Peru. Building recognition from images is a challenging task because pictures can be taken from different angles and under different illumination conditions. An additional challenge is to differentiate buildings with a similar architectural design.

We compare a Bag-of-Words baseline (using SIFT and SURF for feature extraction) with the proposed CNN-based method (a transfer learning approach that uses the models VGG16, VGG19, Inception-V3 and Xception for feature extraction).

Contributions are welcome. If you have visited Cusco, you can send us your photos to help us enlarge the dataset. Send them by email to 120885@unsaac.edu.pe

Dataset
Requirements
Data Preparation
Bag-of-Words
Transfer Learning
Prediction
Publication
Citation

Dataset

First Version. This version consists of 2000 images of 14 different historic buildings in the city of Cusco. The table below lists each class label, the name of the corresponding historic building, and the number of images.

Class label	Building name	Number of images
01	Casa del Inca Garcilaso	108
02	Catedral del Cusco	159
03	La Compañia de Jesus	176
04	Coricancha	147
05	Cristo Blanco	146
06	Templo de la Merced	142
07	Mural de la Historia Inca	137
08	Paccha de Pumaqchupan	114
09	Pileta de San Blas	139
10	Inca Pachacutec	129
11	Sacsayhuaman	190
12	Iglesia de San Francisco	135
13	Iglesia de San Pedro	146
14	Iglesia de Santo Domingo	132

Second Version. This version consists of 4560 images of 14 different historic buildings in the city of Cusco. The table below lists each class label, the name of the corresponding historic building, and the number of images.

Class label	Building name	Number of images
01	Casa del Inca Garcilaso	200
02	Catedral del Cusco	600
03	La Compañia de Jesus	600
04	Coricancha	570
05	Cristo Blanco	200
06	Templo de la Merced	260
07	Mural de la Historia Inca	200
08	Paccha de Pumaqchupan	200
09	Pileta de San Blas	320
10	Inca Pachacutec	300
11	Sacsayhuaman	350
12	Iglesia de San Francisco	250
13	Iglesia de San Pedro	280
14	Iglesia de Santo Domingo	230

Requirements

For Transfer Learning
- Python 2.7
- TensorFlow 1.0.0
- Keras 2.0.2
- Matplotlib 2.0.0
For Bag-of-Words
- OpenCV 2.4.11
- NumPy 1.12.0
- SciPy 0.18.1
- scikit-learn 0.18.1

Data Preparation

Format

Rename the images to the format class_imagenumber.jpg. On Ubuntu you can run the following commands:

# Enter the folder catedral and run; output: 01_0001.jpg, 01_0002.jpg, ...

ls *.jpg | awk 'BEGIN{ class=1; photo=1; }{ printf "mv \"%s\" %02d_%04d.jpg\n", $0, class, photo++ }' | bash

# Enter the folder coricancha and run; output: 02_0001.jpg, 02_0002.jpg, ...

ls *.jpg | awk 'BEGIN{ class=2; photo=1; }{ printf "mv \"%s\" %02d_%04d.jpg\n", $0, class, photo++ }' | bash

# Enter the folder garcilaso and run; output: 03_0001.jpg, 03_0002.jpg, ...

ls *.jpg | awk 'BEGIN{ class=3; photo=1; }{ printf "mv \"%s\" %02d_%04d.jpg\n", $0, class, photo++ }' | bash

...
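
The same renaming can be sketched in Python for systems without awk and bash. This is a minimal sketch, not part of the repository: the function name rename_class_images is hypothetical, it assumes Python 3, and it assumes the folder holds images that have not been renamed yet (an existing file that already carries a target name could be overwritten). Unlike the shell command, it also picks up files with an upper-case .JPG extension.

```python
import os


def rename_class_images(folder, class_id):
    """Rename every .jpg in `folder` to <class>_<index>.jpg, e.g. 01_0001.jpg.

    Hypothetical helper: assumes the folder contains images that have not
    been renamed yet, so no target name collides with an existing file.
    """
    names = sorted(n for n in os.listdir(folder) if n.lower().endswith(".jpg"))
    new_names = []
    for index, name in enumerate(names, start=1):
        # Two-digit class label, four-digit running image number.
        new_name = "%02d_%04d.jpg" % (class_id, index)
        os.rename(os.path.join(folder, name), os.path.join(folder, new_name))
        new_names.append(new_name)
    return new_names


# Example usage (paths are placeholders):
# rename_class_images("catedral", 1)
# rename_class_images("coricancha", 2)
```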
Join Data

Copy the images from all the class folders into a single new folder that will hold the whole dataset. We recommend naming it "dataset_cus".
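
A minimal sketch of this copy step, assuming Python 3 and that the images were already renamed to class_imagenumber.jpg so that names from different folders do not clash. The function name join_dataset is hypothetical and not part of the repository.

```python
import os
import shutil


def join_dataset(class_folders, output_folder="dataset_cus"):
    """Copy every .jpg from the class folders into one output folder.

    Hypothetical helper: raises instead of overwriting when two folders
    contain a file with the same name.
    """
    os.makedirs(output_folder, exist_ok=True)
    copied = 0
    for folder in class_folders:
        for name in sorted(os.listdir(folder)):
            if not name.lower().endswith(".jpg"):
                continue
            target = os.path.join(output_folder, name)
            if os.path.exists(target):
                raise ValueError("duplicate image name: %s" % name)
            shutil.copy2(os.path.join(folder, name), target)
            copied += 1
    return copied


# Example usage (folder names are placeholders):
# join_dataset(["catedral", "coricancha", "garcilaso"], "dataset_cus")
```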
Split Train and Test

The script creates nsplits folders, each holding a random split of the dataset into train and test sets. Run the script:

python split_dataset.py ~/Path-to-original-dataset/ nsplits perc_train ~/Path-to-output-dataset/

Where:
- ~/Path-to-original-dataset/ : Directory of input images (dataset_cus)
- nsplits : Number of split folders to create
- perc_train : Percentage of train samples
- ~/Path-to-output-dataset/ : Output path
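
To illustrate what such a split does, here is a minimal sketch. It is not the code of split_dataset.py: the folder layout split_k/train and split_k/test, the seed parameter, and the interpretation of perc_train as a fraction between 0 and 1 are all assumptions, and the split is not stratified by class.

```python
import os
import random
import shutil


def split_dataset(input_dir, nsplits, perc_train, output_dir, seed=0):
    """Create `nsplits` independent random train/test splits of a flat image folder.

    Hypothetical sketch: `perc_train` is taken as a fraction (e.g. 0.7) and the
    output layout <output_dir>/split_<k>/{train,test} is an assumption.
    """
    images = sorted(n for n in os.listdir(input_dir) if n.lower().endswith(".jpg"))
    rng = random.Random(seed)
    splits = []
    for k in range(1, nsplits + 1):
        shuffled = images[:]
        rng.shuffle(shuffled)
        n_train = int(round(len(shuffled) * perc_train))
        parts = {"train": shuffled[:n_train], "test": shuffled[n_train:]}
        for part, names in parts.items():
            part_dir = os.path.join(output_dir, "split_%d" % k, part)
            os.makedirs(part_dir)
            for name in names:
                shutil.copy2(os.path.join(input_dir, name), os.path.join(part_dir, name))
        splits.append((parts["train"], parts["test"]))
    return splits
```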

Bag-of-Words

Build Codebook

We used SURF for feature extraction. If you want to use another algorithm (for example, SIFT), change the line 'desc_method = cv2.SURF()' to 'desc_method = cv2.SIFT()' in the script bovw_utils.py. Then, to create the codebook, run:

python codebook.py ~/Path-to-train-dataset/ codebook_size codebook_method ~/Path-to-output-dataset/

Where:
- ~/Path-to-train-dataset/ : Directory of input images (train)
- codebook_size : Size of the dictionary
- codebook_method : Codebook method ('random', 'kmeans', 'st_kmeans', 'fast_st_kmeans'). We recommend fast_st_kmeans because it is faster
- ~/Path-to-output-dataset/ : Output codebook file (*.npy), referred to as codebook_filename in the next steps
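
The idea behind a codebook can be sketched with NumPy alone. This is an illustrative sketch under stated assumptions, not the code of codebook.py: it covers only the 'random' method (sample descriptors as visual words) and a plain Lloyd's k-means, not st_kmeans or fast_st_kmeans, and the function name build_codebook and the parameters n_iter and seed are hypothetical. It assumes the local descriptors (for example SURF vectors) are already stacked in one array.

```python
import numpy as np


def build_codebook(descriptors, codebook_size, method="kmeans", n_iter=10, seed=0):
    """Return a (codebook_size, dim) array of visual words.

    Hypothetical sketch: 'random' samples descriptors as words, 'kmeans'
    refines that sample with a few iterations of Lloyd's algorithm.
    """
    descriptors = np.asarray(descriptors, dtype=np.float64)
    rng = np.random.RandomState(seed)
    start = rng.choice(len(descriptors), codebook_size, replace=False)
    codebook = descriptors[start].copy()
    if method == "random":
        return codebook
    for _ in range(n_iter):
        # Squared Euclidean distance from every descriptor to every word.
        # This builds an (n, k, dim) array, so it only suits small inputs.
        dists = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        nearest = dists.argmin(axis=1)
        for k in range(codebook_size):
            members = descriptors[nearest == k]
            if len(members) > 0:  # keep the old word if its cluster is empty
                codebook[k] = members.mean(axis=0)
    return codebook


# Example usage (file name is a placeholder):
# np.save("codebook.npy", build_codebook(all_descriptors, 1000))
```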
Build Bag-of-visual-Words

This script needs to be run twice to extract the bags of visual words: once for the train data and once for the test data.

python bovw.py ~/Path-to-train-dataset-train/ codebook_filename output_bovw_filename_train output_labels_filename_train

Where:
- ~/Path-to-train-dataset-train/ : Directory of input images (train)
- codebook_filename : Codebook file (*.npy)
- output_bovw_filename_train : Output file (*.npy) with visual words
- output_labels_filename_train : Output file (*.npy) with the class labels of the visual-word vectors

python bovw.py ~/Path-to-train-dataset-test/ codebook_filename output_bovw_filename_test output_labels_filename_test

Where:
- ~/Path-to-train-dataset-test/ : Directory of input images (test)
- codebook_filename : Codebook file (*.npy)
- output_bovw_filename_test : Output file (*.npy) with visual words
- output_labels_filename_test : Output file (*.npy) with the class labels of the visual-word vectors
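
The encoding step can be sketched as follows: each local descriptor of an image is assigned to its nearest codebook word, and the image is represented by the normalized histogram of word counts. This is a minimal NumPy sketch, not the code of bovw.py. The function names are hypothetical, and reading the class label from the class_imagenumber.jpg file name is an assumption based on the naming format described in Data Preparation.

```python
import os

import numpy as np


def bovw_histogram(descriptors, codebook):
    """Encode one image as a normalized histogram over the codebook words."""
    descriptors = np.asarray(descriptors, dtype=np.float64)
    codebook = np.asarray(codebook, dtype=np.float64)
    # Squared Euclidean distance from each descriptor to each visual word.
    dists = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = dists.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(np.float64)
    total = hist.sum()
    return hist / total if total > 0 else hist


def label_from_filename(path):
    """Read the class label from a name such as 01_0001.jpg (assumed format)."""
    return int(os.path.basename(path).split("_")[0])
```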
Classification

We use four classification methods. Support Vector Machine, Random Forest and k-Nearest Neighbors are implemented in the script classify_train_test.py.

python classify_train_test.py dataset_train_filename labels_train_filename dataset_test_filename labels_test_filename method output_filename

Where:
- dataset_train_filename : Train dataset file (*.npy), same as output_bovw_filename_train
- labels_train_filename : Train labels file (*.npy), same as output_labels_filename_train
- dataset_test_filename : Test dataset file (*.npy), same as output_bovw_filename_test
- labels_test_filename : Test labels file (*.npy), same as output_labels_filename_test
- method : Classifier (svm, linear_svm, rf, knn), where svm is an SVM with an RBF kernel and linear_svm is an SVM with a linear kernel
- output_filename : Output file (*.npy) with the predictions
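
As a rough illustration of how the method names could map to scikit-learn classifiers, here is a minimal sketch. It is not the code of classify_train_test.py: the function names and every hyperparameter (number of trees, number of neighbors, default SVM settings) are assumptions, and it targets a current scikit-learn release rather than version 0.18.1.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC


def make_classifier(method):
    """Map a method name to a classifier; hyperparameters are assumptions."""
    if method == "svm":
        return SVC(kernel="rbf")
    if method == "linear_svm":
        return SVC(kernel="linear")
    if method == "rf":
        return RandomForestClassifier(n_estimators=100, random_state=0)
    if method == "knn":
        return KNeighborsClassifier(n_neighbors=1)
    raise ValueError("unknown method: %s" % method)


def classify(train_x, train_y, test_x, method):
    """Fit the chosen classifier on the train vectors and predict the test labels."""
    clf = make_classifier(method)
    clf.fit(train_x, train_y)
    return clf.predict(test_x)


# Example usage (file names are placeholders):
# pred = classify(np.load("bovw_train.npy"), np.load("labels_train.npy"),
#                 np.load("bovw_test.npy"), "svm")
# np.save("predictions.npy", pred)
```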
The neural network is run with the script cnn_test_tinc3.py (you can modify parameters such as the number of neurons and the number of layers).

python cnn_test_tinc3.py file_path_train file_path_train_cls file_path_test file_path_test_cls file_path_save_model

Where:
- file_path_train : Train dataset file (*.npy), same as output_bovw_filename_train
- file_path_train_cls : Train labels file (*.npy), same as output_labels_filename_train
- file_path_test : Test dataset file (*.npy), same as output_bovw_filename_test
- file_path_test_cls : Test labels file (*.npy), same as output_labels_filename_test
- file_path_save_model : Output file (*.ckpt) where the trained model is saved

Transfer Learning

Compute Transfer Values

We use several pre-trained convolutional neural network models. The architectures were provided by the Keras framework (VGG16, VGG19, Xception) and by Magnus Erik Hvass Pedersen (Inception-V3).

VGG16, VGG19 and Xception.

Pre-trained weights can be automatically loaded upon instantiation. Weights are automatically downloaded if necessary, and cached locally in ~/.keras/models/.

Inception-V3

First, download the pre-trained model from http://download.tensorflow.org/models/image/imagenet/inception-2015-12-05.tgz. Second, extract the archive inside the 'CNN-Transfer Learning' folder. Third, set the model's path in the 'inception.py' file, for example: data_dir = "/home/jeanfranco/Documents/deep-learning-models_proy/inception-2015-12-05/".
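
The download and extraction can also be scripted. The following is a minimal sketch assuming Python 3 and network access. The function names and the target folder name are hypothetical; only the URL comes from the instructions above.

```python
import os
import tarfile
import urllib.request

MODEL_URL = "http://download.tensorflow.org/models/image/imagenet/inception-2015-12-05.tgz"


def download_model(url, archive_path):
    """Download the archive unless a file already exists at `archive_path`."""
    if not os.path.exists(archive_path):
        urllib.request.urlretrieve(url, archive_path)
    return archive_path


def extract_model(archive_path, target_dir):
    """Extract a .tgz archive into `target_dir` and list what was extracted."""
    os.makedirs(target_dir, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(target_dir)
    return sorted(os.listdir(target_dir))


# Example usage (the target folder name is an assumption):
# archive = download_model(MODEL_URL, "inception-2015-12-05.tgz")
# extract_model(archive, os.path.join("CNN-Transfer Learning", "inception-2015-12-05"))
```

After extraction, data_dir in 'inception.py' still has to point at the folder you chose.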

You need to run the following script twice: once for the train data and once for the test data.

To train.

python compute_transfer_values.py ~/Path-to-train-dataset-train/ dataset_type model_type data_augmentation output_data_train output_cls_train

Where:

- ~/Path-to-train-dataset-train/ : Directory of input images (train)
- dataset_type : Dataset type ('train', 'test'); use 'train' here
- model_type : Model type ('vgg16', 'vgg19', 'resnet', 'xception', 'inception')
- data_augmentation : Whether to apply data augmentation ('si' for yes, 'no' for no)
- output_data_train : Output transfer values (.npy)
- output_cls_train : Output classes (.npy)

To test.

python compute_transfer_values.py ~/Path-to-train-dataset-test/ dataset_type model_type data_augmentation output_data_test output_cls_Test

Where:

- ~/Path-to-train-dataset-test/ : Directory of input images (test)
- dataset_type : Dataset type ('train', 'test'); use 'test' here
- model_type : Model type ('vgg16', 'vgg19', 'resnet', 'xception', 'inception')
- data_augmentation : Use 'no' for the test data
- output_data_test : Output transfer values (.npy)
- output_cls_Test : Output classes (.npy)
Classification

We use four classification methods. Support Vector Machine, Random Forest and k-Nearest Neighbors are implemented in the script classify_train_test.py.

python classify_train_test.py output_data_train output_cls_train output_data_test output_cls_Test method output_filename

Where:

- output_data_train : Train dataset file (*.npy)
- output_cls_train : Train labels file (*.npy)
- output_data_test : Test dataset file (*.npy)
- output_cls_Test : Test labels file (*.npy)
- method : Classifier (svm, linear_svm, rf, knn), where svm is an SVM with an RBF kernel and linear_svm is an SVM with a linear kernel
- output_filename : Output file (*.npy) with the predictions
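
Once the predictions are saved, they can be compared with the test labels. The sketch below is not part of the repository. It assumes that the prediction file holds one integer class label per test image, in the same order as the labels file. The function name evaluate is hypothetical.

```python
import numpy as np


def evaluate(labels_filename, predictions_filename):
    """Return overall accuracy and per-class accuracy from two .npy label files."""
    y_true = np.load(labels_filename)
    y_pred = np.load(predictions_filename)
    if y_true.shape != y_pred.shape:
        raise ValueError("labels and predictions must have the same shape")
    accuracy = float(np.mean(y_true == y_pred))
    per_class = {}
    for label in np.unique(y_true):
        mask = y_true == label
        # Fraction of images of this class that were predicted correctly.
        per_class[int(label)] = float(np.mean(y_pred[mask] == label))
    return accuracy, per_class


# Example usage (file names are placeholders):
# accuracy, per_class = evaluate("output_cls_Test.npy", "output_filename.npy")
```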

The neural network is run with the script cnn_test_tinc3.py (you can modify parameters such as the number of neurons and the number of layers).

python cnn_test_tinc3.py output_data_train output_cls_train output_data_test output_cls_Test file_path_save_model

Where:

- output_data_train : Train dataset file (*.npy)
- output_cls_train : Train labels file (*.npy)
- output_data_test : Test dataset file (*.npy)
- output_cls_Test : Test labels file (*.npy)
- file_path_save_model : Output file (*.ckpt) where the trained model is saved

Prediction

We use the newly trained model to make predictions. That is, we identify the class of a new image of a historic building in the city of Cusco.

The notebook 'Prediction_models.ipynb' has three cells. The first cell predicts with the neural network; in its 'main' method, the file name in 'saver = tf.train.import_meta_graph('model.ckpt.meta')' depends on the name of the previously generated model file. The second cell shows a visualization of data augmentation. The third cell predicts with the Support Vector Machine. We recommend using the last cell for prediction because it gave the best results at this stage.

Publication

You can read our paper, published in the IEEE Xplore Digital Library.

Title: Towards accurate building recognition using convolutional neural networks
Authors:
- Jeanfranco D. Farfan-Escobedo. Escuela Profesional de Ingeniería Informática y de Sistemas, Universidad Nacional de San Antonio Abad del Cusco, Peru
- Lauro Enciso-Rodas. Escuela Profesional de Ingeniería Informática y de Sistemas, Universidad Nacional de San Antonio Abad del Cusco, Peru
- John E. Vargas-Muñoz. Institute of Computing - University of Campinas, Campinas, Brazil

Citation

Finally, if you use the dataset or the code, please cite this work.

@inproceedings{farfan2017towards,
  title={Towards accurate building recognition using convolutional neural networks},
  author={Farfan-Escobedo, Jeanfranco D and Enciso-Rodas, Lauro and Vargas-Mu{\~n}oz, John E},
  booktitle={Electronics, Electrical Engineering and Computing (INTERCON), 2017 IEEE XXIV International Conference on},
  pages={1--4},
  year={2017},
  organization={IEEE}
}
