arXiv: http://arxiv.org/abs/1805.08974

A lot of work in computer vision today relies on the general understanding that if a model is pre-trained on a large dataset like ImageNet, it will perform better on vision tasks such as image classification, object detection, and segmentation.

This work from Kornblith et al. (2018) focuses on verifying and evaluating whether transfer learning from ImageNet actually works well across different datasets.

Pre-trained Model Usability

Pre-trained models can be used in the following two settings:

  • Fixed-feature extractor: the model is initialized with parameters from a model trained on the ImageNet dataset. Most layers of the model are frozen, and only the final few layers are trained on the new dataset.
  • Fine-tuned: similar to the previous setting, the model parameters are initialized with values trained on ImageNet. Instead of freezing layers, all parameters are updated on the new dataset.

Another way to initialize the model parameters is to choose the values randomly and then train the model from scratch on the new dataset.
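
These three settings are easy to see in code. Below is a minimal PyTorch sketch, assuming ResNet-50 from torchvision as the backbone (the paper evaluates many architectures) and a placeholder `num_classes` for the new dataset:

```python
import torch.nn as nn
from torchvision import models

num_classes = 100  # placeholder: number of classes in the new dataset

# Setting 1: fixed-feature extractor. Freeze the pretrained layers and
# train only a fresh final classification layer.
extractor = models.resnet50(pretrained=True)
for param in extractor.parameters():
    param.requires_grad = False
extractor.fc = nn.Linear(extractor.fc.in_features, num_classes)  # new trainable head

# Setting 2: fine-tuning. Start from ImageNet weights and update everything;
# all parameters keep requires_grad=True, so the optimizer touches them all.
finetuned = models.resnet50(pretrained=True)
finetuned.fc = nn.Linear(finetuned.fc.in_features, num_classes)

# Setting 3: random initialization. Same architecture, no pretrained weights.
scratch = models.resnet50(pretrained=False)
scratch.fc = nn.Linear(scratch.fc.in_features, num_classes)
```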

Datasets and Model Architectures Used

The following datasets were used to test transfer learning:

  • Food-101
  • CIFAR-10
  • CIFAR-100
  • Birdsnap
  • SUN397
  • Stanford Cars
  • FGVC Aircraft
  • PASCAL VOC 2007 (classification)
  • Describable Textures (DTD)
  • Oxford-IIIT Pets
  • Caltech-101
  • Oxford 102 Flowers

These datasets vary in the number of samples per category, object types, image sizes, etc.
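
As a concrete example, here is a minimal sketch of loading one of these datasets (CIFAR-10, which ships with torchvision) with standard ImageNet preprocessing; the resize/crop sizes and normalization statistics are the common ImageNet defaults, not values taken from the paper:

```python
from torchvision import datasets, transforms

# Standard ImageNet preprocessing, so inputs match the distribution
# the pretrained backbone was trained on.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# CIFAR-10 is one of the twelve transfer datasets listed above.
train_set = datasets.CIFAR10(root="./data", train=True,
                             download=True, transform=preprocess)
```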

The network architectures used are:

  • Inception
  • BN-Inception
  • ResNet (50, 101, 152)
  • DenseNet (121, 169, 201)
  • MobileNet v1, v2
  • NASNet-A mobile
  • NASNet-A Large

Task Settings

For each network and dataset, these three settings were tested:

  • Learn a logistic regression model on the penultimate-layer features of an ImageNet pre-trained model (see the sketch after this list).
  • Initialize the weights from an ImageNet pre-trained model and fine-tune all layers.
  • Train the model on the new dataset from a random initialization of the weights.
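
A minimal sketch of the first setting, assuming a pre-trained ResNet-50 backbone and pre-built `train_loader` / `test_loader` DataLoaders for the target dataset (these names and the regularization constant `C` are placeholders; the paper tunes the L2 penalty per dataset):

```python
import numpy as np
import torch
import torch.nn as nn
from torchvision import models
from sklearn.linear_model import LogisticRegression

# Pre-trained ResNet-50 with the classification head replaced by an
# identity, so a forward pass returns the pooled penultimate features.
backbone = models.resnet50(pretrained=True)
backbone.fc = nn.Identity()
backbone.eval()

@torch.no_grad()
def extract_features(loader):
    feats, labels = [], []
    for images, targets in loader:
        feats.append(backbone(images).numpy())
        labels.append(targets.numpy())
    return np.concatenate(feats), np.concatenate(labels)

# train_loader / test_loader are assumed DataLoaders over the new dataset.
X_train, y_train = extract_features(train_loader)
X_test, y_test = extract_features(test_loader)

# Logistic regression on the frozen features; C is a placeholder.
clf = LogisticRegression(C=1.0, max_iter=1000)
clf.fit(X_train, y_train)
print("transfer accuracy:", clf.score(X_test, y_test))
```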

Conclusion

Final conclusions from Kornblith et al.:

  • Models that perform better on ImageNet produce better penultimate-layer features, which in turn lead to better classifiers when those features are used for transfer.
  • In the fixed-feature extractor setting, models trained on ImageNet with regularization techniques transfer relatively poorly to new datasets.
  • For large datasets, models trained from random initialization reach performance on the new dataset that still correlates with the architecture's ImageNet performance, even though the pre-trained weights are not used (a sketch of this kind of correlation analysis follows this list).
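
To illustrate the kind of analysis behind these conclusions, here is a small sketch that correlates ImageNet accuracy with transfer accuracy after a log-odds (logit) transformation, similar in spirit to the paper's analysis. The accuracy numbers below are made-up placeholders, not results from the paper:

```python
import numpy as np
from scipy.stats import pearsonr

def logit(p):
    # Log-odds transform of accuracy, which spreads out values near 1.0.
    p = np.asarray(p)
    return np.log(p / (1.0 - p))

# Hypothetical (ImageNet top-1, transfer) accuracies for a few models.
imagenet_acc = [0.712, 0.760, 0.770, 0.803]
transfer_acc = [0.858, 0.881, 0.890, 0.910]

r, p_value = pearsonr(logit(imagenet_acc), logit(transfer_acc))
print(f"correlation between ImageNet and transfer accuracy: r={r:.2f}")
```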
