
Image classification on stanford-car dataset with high accuracy


Stanford-cars classification

In this repository, I build a car classifier using the Stanford Cars dataset, which contains 196 classes (covering make and model). The repository also contains checkpoints of 13 models trained on the Stanford Cars dataset with high accuracy. You can use them as pretrained weights for transfer learning to other datasets.
An ensemble of some models in this repository achieves an accuracy of 0.9462, higher than the 2018 state of the art on Stanford Cars (0.945) and close to the 2019 state of the art (0.947).

Environments

  • Ubuntu 16.04 LTS
  • Cuda 10.0, cuDNN v7.5.0
  • Python 3.5, Keras 2.2.4, Tensorflow 1.13.1, Efficientnet
  • Quick install dependencies:
    $ pip install --upgrade -r requirement.txt

Datasets

https://ai.stanford.edu/~jkrause/cars/car_dataset.html
3D Object Representations for Fine-Grained Categorization
Jonathan Krause, Michael Stark, Jia Deng, Li Fei-Fei
4th IEEE Workshop on 3D Representation and Recognition, at ICCV 2013 (3dRR-13). Sydney, Australia. Dec. 8, 2013.

196 classes
Trainset: 8144 images
Testset: 8041 images
Some images in training set:

Distribution of training set:

Min: 24 images/class, max: 68 images/class, mean: 41 images/class, so the dataset is fairly balanced.

Download the dataset quickly from the command line:
$ bash quick_download.sh
Prepare the 5-fold cross-validation splits:
$ python prepare.py
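
The exact logic of prepare.py is not described here, so the following is only a minimal sketch of one common way to build 5 folds: a stratified split, where the images of each class are shuffled and dealt across the folds so that every fold sees every class. The function name `stratified_folds`, the seed, and the toy labels are assumptions made for illustration.

```python
import numpy as np

def stratified_folds(labels, n_folds=5, seed=0):
    """Return a fold index per sample, spreading each class across all folds."""
    labels = np.asarray(labels)
    rng = np.random.RandomState(seed)
    fold_of = np.empty(len(labels), dtype=int)
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        rng.shuffle(idx)
        # Deal the shuffled samples of this class round-robin over the folds.
        fold_of[idx] = np.arange(len(idx)) % n_folds
    return fold_of

# Toy labels: 3 classes with 10 samples each (the real dataset has 196 classes).
labels = np.repeat([0, 1, 2], 10)
fold_of = stratified_folds(labels)

# Hold out fold 0 for validation and train on the other four folds.
val_idx = np.flatnonzero(fold_of == 0)
train_idx = np.flatnonzero(fold_of != 0)
```

Repeating the last two lines for folds 0 to 4 gives the five train/validation splits.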

Training

The models are trained by transfer learning, starting from weights pre-trained on the ImageNet dataset. All layers are fine-tuned, and the last fully connected layer is replaced entirely. Useful tricks I used for training:

  • Cyclical Learning Rate [paper] [repo]
  • Heavy augmentation: random crops, horizontal flip, rotate, shear, AddToHueAndSaturation, AddMultiply, GaussianBlur, ContrastNormalization, sharpen, emboss
  • Random eraser [paper]
  • Mixup [paper]
  • Cross-validation 5 folds
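
Two of the tricks above, the cyclical learning rate and mixup, can be sketched in a few lines of NumPy. This is not the code used in train.py: the triangular schedule follows the formula from the cyclical learning rate paper, the mixup function uses a single mixing coefficient per batch, and the function names and all default values (`base_lr`, `max_lr`, `step_size`, `alpha`) are placeholders chosen for illustration.

```python
import numpy as np

def triangular_lr(iteration, base_lr=1e-4, max_lr=1e-3, step_size=2000):
    """Triangular cyclical learning rate: base_lr -> max_lr -> base_lr."""
    cycle = np.floor(1 + iteration / (2 * step_size))
    x = np.abs(iteration / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)

def mixup_batch(x, y, alpha=0.2, rng=None):
    """Blend each sample (and its one-hot label) with a randomly chosen partner."""
    if rng is None:
        rng = np.random.RandomState(0)
    lam = rng.beta(alpha, alpha)        # one mixing coefficient for the batch
    perm = rng.permutation(len(x))      # partner index for every sample
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y + (1.0 - lam) * y[perm]
    return x_mix, y_mix

# Toy batch: 4 random "images" of shape 8x8x3 with one-hot labels over 4 classes.
rng = np.random.RandomState(42)
x = rng.rand(4, 8, 8, 3)
y = np.eye(4)
x_mix, y_mix = mixup_batch(x, y, alpha=0.2, rng=rng)
```

With these placeholder values the learning rate starts at `base_lr`, peaks at `max_lr` after `step_size` iterations, and returns to `base_lr` after `2 * step_size` iterations. The mixed labels are soft, so they need a loss that accepts probability targets, such as categorical cross-entropy.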

$ python train.py --network network --gpu gpu_id --epochs number_of_epochs --multiprocessing False/True
You can choose any network from this list:

  • VGG16, VGG19
  • ResNet50, ResNet101, ResNet152, ResNet50V2, ResNet101V2, ResNet152V2, ResNeXt50, ResNeXt101
  • InceptionV3, InceptionResNetV2, Xception
  • MobileNet, MobileNetV2
  • DenseNet121, DenseNet169, DenseNet201
  • NASNetMobile, NASNetLarge
  • EfficientNetB0, EfficientNetB1, EfficientNetB2, EfficientNetB3, EfficientNetB4, EfficientNetB5, EfficientNetB6, EfficientNetB7

For example, to train MobileNetV2 for 200 epochs:
$ python train.py --network MobileNetV2 --gpu 0 --epochs 200 --multiprocessing False

I chose the parameters (input size, batch_size) that work best on my hardware (1x1080 Ti 12GB, RAM 32GB, CPU 12 Core); you can modify config.py to suit yours.

I saved the training logs of the 13 models on each fold in the folder logs.

Checkpoint

Download the checkpoints of the 13 models from the link, then put them into the folder checkpoints to evaluate a model, generate a submission, or run the demo on an image.

Evaluate models:

To improve the result, I applied 12 crops for validation and test prediction. The accuracy of a single model is that of the ensemble over its 12 crops and 5 folds. For example, with a network input shape of 224x224x3:

To evaluate a network, run:
$ python evaluate.py --network network --gpu gpu_id --multi_crops True/False
For example:
$ python evaluate.py --network MobileNetV2 --gpu 0 --multi_crops True
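
How evaluate.py combines the 12 crops and 5 folds is not spelled out here. A minimal sketch, assuming the combination is a plain average of the softmax outputs, could look like this. The function name `average_predictions`, the array layout, and the random toy inputs are all assumptions.

```python
import numpy as np

def average_predictions(probs):
    """Average class probabilities over folds and crops, then pick the top class.

    probs has shape (n_folds, n_crops, n_images, n_classes).
    """
    probs = np.asarray(probs)
    mean_probs = probs.mean(axis=(0, 1))   # -> (n_images, n_classes)
    return mean_probs.argmax(axis=1), mean_probs

# Toy input: 5 folds x 12 crops x 4 images x 196 classes of random "softmax" rows.
rng = np.random.RandomState(0)
raw = rng.rand(5, 12, 4, 196)
probs = raw / raw.sum(axis=-1, keepdims=True)

pred, mean_probs = average_predictions(probs)
```

Averaging probabilities rather than voting on hard labels keeps the confidence of each crop in play, which is one reason it is a common choice for test-time augmentation.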

To generate submission for each model, run:
$ python predict.py --network network --gpu gpu_id

The output is network.txt in the folder submission, and the raw output network.npy in the folder data.

You can submit your result to the Stanford Cars evaluation server.

Accuracy and size of 13 models:

Ensemble multi-models

The final result of 0.9462 is an ensemble of some models with suitable weights: result = sum(weight x model) / sum(weight).
$ python ensemble.py

I only tried a few combinations; you can try other weights and other models to get an accuracy higher than 0.9462.
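
The formula result = sum(weight x model) / sum(weight) can be sketched as follows. This is not the content of ensemble.py: the function name `weighted_ensemble`, the two toy probability arrays, and the weights are made up to show the arithmetic.

```python
import numpy as np

def weighted_ensemble(model_probs, weights):
    """Weighted average of per-model class probabilities.

    model_probs: list of arrays, each of shape (n_images, n_classes)
    weights:     one non-negative weight per model
    """
    weights = np.asarray(weights, dtype=float)
    stacked = np.stack(model_probs)        # (n_models, n_images, n_classes)
    # result = sum(weight x model) / sum(weight)
    return np.tensordot(weights, stacked, axes=1) / weights.sum()

# Toy example: two models, two images, three classes.
model_a = np.array([[0.6, 0.3, 0.1],
                    [0.2, 0.5, 0.3]])
model_b = np.array([[0.2, 0.7, 0.1],
                    [0.1, 0.2, 0.7]])

result = weighted_ensemble([model_a, model_b], weights=[1, 3])
labels = result.argmax(axis=1)
```

Here model_b counts three times as much as model_a, so both images end up with the class model_b prefers. In practice the weights would be tuned on validation predictions rather than on the test set.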

Demo on image

$ python demo.py --network network --gpu gpu_id --image_path path --imshow True/False
For example:
$ python demo.py --network ResNeXt101 --gpu 0 --image_path images/samples/02381.jpg --imshow True

Please read the file docs/Solution_for_StanfordCars_classification.pdf for more information.