Indoor-Scene-Recognition

The dataset contains more than 15,000 labeled images across 67 categories. I selected only the following 10 categories: airport_inside, auditorium, bakery, bathroom, bookstore, casino, church_inside, grocerystore, operating_room, warehouse. The objective is to create a model that can classify images into these 10 categories.

Approach:

  • Created training and validation datasets with the required preprocessing and visualized the training dataset.

  • Created a data augmentation layer using appropriate augmentation techniques and visualized the augmented data.

  • Loaded the data into cache to avoid a data bottleneck during training. Also shuffled the data before the start of each epoch.

  • First tried transfer learning using the InceptionV3 architecture with ImageNet pretrained weights. I removed the default softmax output layer of InceptionV3, kept the weights of the first 249 layers frozen, and trained layer 249 through the last layer on the training dataset. I then added a Flatten layer, Dropout layers, 2 hidden dense layers, and an output dense layer with 10 neurons and softmax activation. I use label encoding instead of one-hot encoding to reduce memory use, so the loss function is sparse_categorical_crossentropy and the metric is sparse_categorical_accuracy.

    • Best Epoch: 109 (model-00109-0.08159-0.99170-0.67942-0.85068.h5)
    • Training: loss: 0.08 - sparse_categorical_accuracy: 0.99
    • Validation: val_loss: 0.68 - val_sparse_categorical_accuracy: 0.85

  • Used almost the same architecture as the previous one, but with Xception as the CNN backbone instead of InceptionV3. Removed the softmax output layer of Xception and fine-tuned Xception from layer 114 to the last layer, then added the same custom layers as before.

    • Best Epoch: 88 (model-00088-0.07750-0.99321-0.49403-0.89140.h5)
    • Training: loss: 0.0775 - sparse_categorical_accuracy: 0.993
    • Validation: val_loss: 0.494 - val_sparse_categorical_accuracy: 0.891
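
The approach above can be sketched roughly as follows for the InceptionV3 variant. This is a minimal sketch, not the notebook's actual code: the function names `prepare` and `build_finetune_model`, the dropout rates, the dense layer widths, the Adam learning rate, the 299x299 input size, and the shuffle buffer size are all assumptions chosen for illustration. Only the frozen-layer boundary (249), the layer types in the head, the 10-way softmax output, and the sparse loss and metric come from the description.

```python
import tensorflow as tf
from tensorflow.keras import layers

NUM_CLASSES = 10  # the ten selected scene categories


def prepare(ds, shuffle_buffer=1000):
    """Cache the dataset, then reshuffle it at the start of every epoch.

    The buffer size is an assumed value, not taken from the notebook.
    """
    return (
        ds.cache()
        .shuffle(shuffle_buffer, reshuffle_each_iteration=True)
        .prefetch(tf.data.AUTOTUNE)
    )


def build_finetune_model(weights="imagenet", input_shape=(299, 299, 3),
                         freeze_until=249):
    """InceptionV3 without its softmax top, plus a custom dense head."""
    base = tf.keras.applications.InceptionV3(
        include_top=False, weights=weights, input_shape=input_shape
    )
    # Keep the first `freeze_until` layers fixed; fine-tune the rest.
    for layer in base.layers[:freeze_until]:
        layer.trainable = False
    for layer in base.layers[freeze_until:]:
        layer.trainable = True

    # Custom head: Flatten, Dropout, two hidden dense layers, softmax output.
    # Dropout rates and layer widths below are assumptions.
    x = layers.Flatten()(base.output)
    x = layers.Dropout(0.5)(x)
    x = layers.Dense(512, activation="relu")(x)
    x = layers.Dropout(0.5)(x)
    x = layers.Dense(128, activation="relu")(x)
    outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)

    model = tf.keras.Model(inputs=base.input, outputs=outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(1e-4),  # assumed optimizer settings
        # Integer (label-encoded) targets, so the sparse variants are used.
        loss="sparse_categorical_crossentropy",
        metrics=["sparse_categorical_accuracy"],
    )
    return model
```

Under the same assumptions, the Xception variant would swap in `tf.keras.applications.Xception` and pass `freeze_until=114`.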

Inference

Created an inference function that loads both models with their best weights, averages the predicted probabilities for all 10 classes across the 2 models, and uses that average probability for the final classification.
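
The averaging step can be illustrated with a small NumPy sketch. The function name `ensemble_predict` is hypothetical, and the class order is an assumption (alphabetical, matching the category list at the top); the inputs stand in for the softmax outputs that each model's `predict` call would return for the same batch of images.

```python
import numpy as np

# Assumed class order: alphabetical, as in the category list above.
CLASS_NAMES = [
    "airport_inside", "auditorium", "bakery", "bathroom", "bookstore",
    "casino", "church_inside", "grocerystore", "operating_room", "warehouse",
]


def ensemble_predict(prob_a, prob_b, class_names=CLASS_NAMES):
    """Average two (batch, n_classes) probability arrays and pick the top class."""
    prob_a = np.asarray(prob_a, dtype=float)
    prob_b = np.asarray(prob_b, dtype=float)
    if prob_a.shape != prob_b.shape:
        raise ValueError("both models must predict over the same classes")
    avg = (prob_a + prob_b) / 2.0   # per-class mean of the two models
    idx = avg.argmax(axis=1)        # index of the highest averaged probability
    return [class_names[i] for i in idx], avg
```

Averaging probabilities rather than taking a majority vote lets a confident prediction from one model outweigh a marginal one from the other.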

Final Model files:

Model1: https://drive.google.com/file/d/18KDv8_qoJDI0osEiuqaOlYZbQe1oKptn/view?usp=sharing (390 MB)

Model2: https://drive.google.com/file/d/18mtLaKrAA2Dthas2k6hnkjjJXYPPbRMV/view?usp=sharing (737 MB)