How to do transfer learning for MNIST dataset?

I have been trying to use transfer learning on the MNIST dataset with VGG/Inception, but both of these networks expect input images of at least 224x224x3. How can I rescale the 28x28x1 MNIST images to 224x224x3 to do transfer learning?

Unwashed answered 17/12, 2017 at 6:16 Comment(0)

A common way to do what you're asking is to simply resize the images to the resolution required by the network's input layer. Because you've tagged your question with keras, note that Keras has a preprocessing module that lets you load images and optionally specify the target size to scale them to. If you look at the actual source of the method: https://github.com/keras-team/keras/blob/master/keras/preprocessing/image.py#L321, it internally uses Pillow's interpolation methods to rescale the image to the desired resolution.
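As a minimal sketch of that loading step, assuming your digits have been saved to disk as image files (the filename below is hypothetical):

```python
from keras.preprocessing.image import load_img, img_to_array

# Load a single digit image from disk and resize it to 224x224 while loading.
img = load_img('digit_0.png', grayscale=False, target_size=(224, 224))
x = img_to_array(img)  # shape: (224, 224, 3)
```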

In addition, because the MNIST digits are originally grayscale, you will need to replicate the single-channel image across three channels so that it artificially becomes RGB. In other words, the red, green and blue channels all contain the same copy of the grayscale digit. The load_img method has an additional flag called grayscale; set it to False to load the image as an RGB image.
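If you are working with the in-memory MNIST arrays from keras.datasets rather than image files on disk (an assumption on my part, since your loading code isn't shown), the channel replication can be done directly with NumPy:

```python
import numpy as np
from keras.datasets import mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()  # x_train: (60000, 28, 28)

# Add a channel axis and replicate the grayscale channel three times,
# so each image becomes (28, 28, 3) "RGB" with identical channels.
x_train_rgb = np.repeat(x_train[..., np.newaxis], 3, axis=-1)
print(x_train_rgb.shape)  # (60000, 28, 28, 3)
```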

Once you have loaded these images, converted them to RGB, and rescaled them, you can go ahead and perform transfer learning with VGG19. In fact, it has been done before. Consult this link: https://www.analyticsvidhya.com/blog/2017/06/transfer-learning-the-art-of-fine-tuning-a-pre-trained-model/ and look at Section 6: Use the pre-trained model for identifying digits.

I'd like to give you fair warning that upsampling a 28 x 28 image to 224 x 224 will introduce severe interpolation artifacts. You would be performing transfer learning on image data that contains noise from the upsampling, but that's what was done in the blog post I linked earlier. I would recommend changing the interpolation to something like bilinear or bicubic; the default is nearest neighbour, which is terrible for upsampling images.
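If you are resizing the in-memory arrays rather than files on disk, here is a minimal sketch using Pillow's bicubic interpolation (my choice of library here; OpenCV or scipy would work just as well):

```python
import numpy as np
from PIL import Image

def upscale_to_224(img_28x28):
    """Upscale a single 28x28 uint8 grayscale digit to 224x224x3 using
    bicubic interpolation, then replicate the channel to fake RGB."""
    pil_img = Image.fromarray(img_28x28)                  # 28x28 grayscale
    pil_img = pil_img.resize((224, 224), Image.BICUBIC)   # bicubic upsampling
    arr = np.asarray(pil_img)                             # (224, 224)
    return np.repeat(arr[..., np.newaxis], 3, axis=-1)    # (224, 224, 3)
```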

YMMV, so try resizing the images to the size expected by the input layer, replicate them across three channels to make them RGB, and see what happens.

Tribe answered 17/12, 2017 at 9:3 Comment(1)
Thanks a lot for the pointers. I tried something like this: pastebin.com/Gmcb97y8 and I got TypeError: 'Tensor' object does not support item assignment - Unwashed

This greatly depends on the model you wish to use. In the case of VGGNet, you have to rescale the input to the expected target size, because the VGG network contains fully connected (FC) layers whose shape matches the feature-map dimensions after a fixed number of downsampling steps. Note that convolutional layers can take any image size due to parameter sharing.
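To illustrate that constraint, here is a minimal sketch with Keras's VGG16 application (an assumption about which implementation you're using): with the FC top included, the input must be exactly 224x224x3, while dropping the top lets the convolutional base accept other sizes (Keras still enforces a small minimum).

```python
from keras.applications.vgg16 import VGG16

# With the fully connected top, the input shape is fixed at 224x224x3.
full_model = VGG16(weights='imagenet', include_top=True)

# Without the top, the convolutional base accepts other spatial sizes
# (subject to a minimum of roughly 32x32 in Keras).
conv_base = VGG16(weights='imagenet', include_top=False,
                  input_shape=(56, 56, 3))
```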

However, modern CNNs are following the trend of switching to all-convolutional architectures, which sidesteps the fixed-input-size problem in transfer learning. If you choose this path, take one of the latest Inception models. In that case, the model should, out of the box, be able to accept even small images such as 28x28x1.
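For completeness, a sketch of loading such a model as a convolutional feature extractor in Keras (an assumption about the framework). Note that Keras's stock InceptionV3 with ImageNet weights still enforces a minimum input size of 75x75 and three channels, so in practice some upscaling and channel replication may still be needed:

```python
from keras.applications.inception_v3 import InceptionV3

# Convolutional base only (no FC top), so the spatial size is flexible,
# subject to the minimum that Keras enforces for this architecture.
base = InceptionV3(weights='imagenet', include_top=False,
                   input_shape=(75, 75, 3))
base.summary()
```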

Osterhus answered 17/12, 2017 at 9:13 Comment(0)
