This project recognizes car make and model using the Stanford Cars Dataset, which contains 16,185 images. The dataset labels each image with make, model, and year (e.g. 2012 Tesla Model S), giving 196 classes. This project targets make and model only, which reduces the label set to 164 classes.
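To illustrate how dropping the year merges classes, here is a minimal sketch. It assumes the year appears as a four-digit token at the start or end of the class name; the helper `make_model_label` and the sample names are hypothetical and are not taken from the project code.

```python
import re

# Matches a four-digit year at the start or the end of a class name
# (assumed layout; adjust if the label strings differ).
_YEAR = re.compile(r"^\d{4}\s+|\s+\d{4}$")


def make_model_label(class_name: str) -> str:
    """Drop the year so that different model years of one car share a label."""
    return _YEAR.sub("", class_name.strip())


# Hypothetical class names, for illustration only.
sample_classes = [
    "2012 Tesla Model S",
    "2007 Audi S4",
    "2012 Audi S4",
]

# Three year-specific classes collapse into two make-and-model classes.
merged = sorted({make_model_label(name) for name in sample_classes})
print(merged)
```

Applied to the full label list, the same kind of mapping would take the 196 year-specific classes down to the 164 make-and-model classes mentioned above.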
Approach
Transfer learning is used to build the model because the dataset is fairly small. It also lets us reach good accuracy with less training time. VGGNet is used as the base network because of its comparative performance against other networks (Ref HERE).
The VGG CNN was first introduced by Simonyan and Zisserman in their 2014 paper, Very Deep Convolutional Networks for Large-Scale Image Recognition. The network architecture comprises 13 convolutional and 3 fully connected (FC) layers, as shown below.
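The layer layout can also be written down as a small, framework-free sketch. The convolution configuration below is the standard 16-layer VGG variant; the 224x224 input size (hence the 512 * 7 * 7 feature size) and the 4096-unit hidden layers in `fc_sizes` are assumptions taken from the original VGG design, not necessarily the head used in this project.

```python
# Configuration of the 16-layer VGG variant: integers are the output channels
# of a 3x3 convolution, "M" marks a 2x2 max-pooling step.
VGG16_CONV_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
                  512, 512, 512, "M", 512, 512, 512, "M"]

NUM_CLASSES = 164  # make-and-model classes in this project


def fc_sizes(num_classes, feature_dim=512 * 7 * 7, hidden=4096):
    """(inputs, outputs) of the three FC layers; the head layout is an assumption."""
    return [(feature_dim, hidden), (hidden, hidden), (hidden, num_classes)]


def count_parameters(conv_cfg, fc_layers, in_channels=3):
    """Count weights and biases of the convolution and FC layers."""
    total = 0
    channels = in_channels
    for item in conv_cfg:
        if item == "M":
            continue  # pooling has no trainable parameters
        total += channels * item * 3 * 3 + item  # 3x3 kernels plus biases
        channels = item
    for n_in, n_out in fc_layers:
        total += n_in * n_out + n_out
    return total


conv_layers = [c for c in VGG16_CONV_CFG if c != "M"]
fc_layers = fc_sizes(NUM_CLASSES)
print(len(conv_layers), "conv layers,", len(fc_layers), "FC layers")
print(count_parameters(VGG16_CONV_CFG, fc_layers), "parameters")
```

In a typical transfer learning setup, the convolutional stack is initialized from pretrained weights and only the final layers are replaced to match the new number of classes; whether this project freezes or fine-tunes the earlier layers is not stated here.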
Result
The main effort in transfer learning goes into fine-tuning the network to get a better result (higher accuracy with lower loss). After a number of experiments, the final model was selected at epoch 85, trained with a learning rate of 5e-5 for epochs 1 to 50 and 1e-5 for epochs 51 to 85. Stopping at this point lowers the chance of overfitting once rank-1 accuracy has reached ~85%.
The result obtained on the test set is 85.74% rank-1 accuracy and 96.54% rank-5 accuracy.
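The two-stage learning rate described above can be expressed as a simple step function. This is only a sketch of the schedule itself; the function name `learning_rate` is hypothetical, and how the value is fed into the optimizer depends on the training framework.

```python
def learning_rate(epoch: int) -> float:
    """Step schedule from the text: 5e-5 for epochs 1-50, 1e-5 for epochs 51-85."""
    if not 1 <= epoch <= 85:
        raise ValueError("epoch must be between 1 and 85")
    return 5e-5 if epoch <= 50 else 1e-5


# Print the rate at the boundaries of each stage.
for epoch in (1, 50, 51, 85):
    print(epoch, learning_rate(epoch))
```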
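For reference, rank-k accuracy counts a prediction as correct when the true class is among the k highest-scoring classes. The sketch below shows one way to compute it in plain Python; the function `rank_k_accuracy` and the toy scores are made up for illustration and are not the project's evaluation code or data.

```python
def rank_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Toy scores for 3 samples over 4 classes (made-up numbers).
scores = [
    [0.1, 0.6, 0.2, 0.1],  # true class 1 has the highest score
    [0.5, 0.1, 0.3, 0.1],  # true class 2 has the second-highest score
    [0.4, 0.3, 0.2, 0.1],  # true class 3 has the lowest score
]
labels = [1, 2, 3]

print("rank-1:", rank_k_accuracy(scores, labels, 1))
print("rank-2:", rank_k_accuracy(scores, labels, 2))
```

The reported figures use k = 1 and k = 5 over the 164 classes of the test set.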
Usage
Refer to the details described in the repository below: