How to preprocess training set for VGG16 fine tuning in Keras?

I have fine-tuned the Keras VGG16 model, but I'm unsure about the preprocessing during the training phase.

I create a train generator as follows:

train_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
        train_folder,
        target_size=(IMAGE_SIZE, IMAGE_SIZE),
        batch_size=train_batchsize,
        class_mode="categorical"
    )

Is the rescale enough, or do I have to apply other preprocessing functions?

When I use the network to classify an image I use this code:

from keras.models import load_model
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input
import numpy as np

img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
preds = model.predict(x)

I think this is the correct preprocessing, and that I should apply the same preprocessing during training.

Thanks for your help.

Delwyn answered 29/1, 2019 at 18:34 Comment(0)

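ImageDataGenerator has a preprocessing_function argument, which lets you pass the same preprocess_input function that you use at inference time. For VGG16 this function does not scale pixels to [0, 1]; it converts RGB to BGR and subtracts the ImageNet channel means. It takes the place of the rescale step, so drop rescale=1./255 rather than combining the two: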

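from keras.applications.vgg16 import preprocess_input
from keras.preprocessing.image import ImageDataGenerator
# Use the same preprocessing for training as for inference.
train_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)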
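
To see why the two pipelines in the question do not match, here is a minimal NumPy sketch that compares rescale=1./255 with VGG16-style 'caffe' preprocessing on the same pixels. The helpers rescale_only and caffe_style are hypothetical stand-ins written for illustration, not Keras functions; channels-last RGB input is assumed, and the channel means are the ImageNet BGR values that 'caffe' mode subtracts.

```python
import numpy as np

# ImageNet per-channel means in BGR order (the values "caffe" mode subtracts).
IMAGENET_MEAN_BGR = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def rescale_only(batch_rgb):
    """Hypothetical stand-in for ImageDataGenerator(rescale=1./255)."""
    return batch_rgb.astype(np.float32) / 255.0

def caffe_style(batch_rgb):
    """Hypothetical stand-in for VGG16's preprocess_input (channels-last)."""
    bgr = batch_rgb[..., ::-1].astype(np.float32)  # RGB -> BGR, as a copy
    return bgr - IMAGENET_MEAN_BGR                 # zero-center, no scaling

# A tiny batch: one all-black and one all-white image.
black = np.zeros((4, 4, 3), dtype=np.float32)
white = np.full((4, 4, 3), 255.0, dtype=np.float32)
batch = np.stack([black, white])

a = rescale_only(batch)
b = caffe_style(batch)
print("rescale only:", a.min(), a.max())
print("caffe style :", b.min(), b.max())
```

A network fine-tuned on the first range and then fed the second at prediction time sees inputs on a very different scale and channel order, which is why the same function should be used in both places.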

Most of the pretrained models in keras_applications share this preprocessing function but call it with different modes; the VGG16 wrapper uses the default 'caffe' mode. The docstring describes what each mode does:

def preprocess_input(x, data_format=None, mode='caffe', **kwargs):
    """Preprocesses a tensor or Numpy array encoding a batch of images.
    # Arguments
        x: Input Numpy or symbolic tensor, 3D or 4D.
            The preprocessed data is written over the input data
            if the data types are compatible. To avoid this
            behaviour, `numpy.copy(x)` can be used.
        data_format: Data format of the image tensor/array.
        mode: One of "caffe", "tf" or "torch".
            - caffe: will convert the images from RGB to BGR,
                then will zero-center each color channel with
                respect to the ImageNet dataset,
                without scaling.
            - tf: will scale pixels between -1 and 1,
                sample-wise.
            - torch: will scale pixels between 0 and 1 and then
                will normalize each channel with respect to the
                ImageNet dataset.
    # Returns
        Preprocessed tensor or Numpy array.
    """
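
As a rough illustration of the three modes described in the docstring, the following NumPy sketch reimplements them for channels-last RGB input. preprocess_sketch is a hypothetical name, and this is a simplified approximation rather than the library code (it ignores data_format and symbolic tensors).

```python
import numpy as np

def preprocess_sketch(x, mode="caffe"):
    """Approximate the documented modes for channels-last RGB input."""
    x = np.array(x, dtype=np.float32)  # np.array copies, so the input is untouched
    if mode == "tf":
        # Scale pixels from [0, 255] to [-1, 1].
        return x / 127.5 - 1.0
    if mode == "torch":
        # Scale to [0, 1], then normalize with ImageNet RGB mean and std.
        mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
        std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
        return (x / 255.0 - mean) / std
    if mode == "caffe":
        # RGB -> BGR, then subtract the ImageNet BGR means; no scaling.
        mean_bgr = np.array([103.939, 116.779, 123.68], dtype=np.float32)
        return x[..., ::-1] - mean_bgr
    raise ValueError(f"unknown mode: {mode!r}")

white = np.full((1, 2, 2, 3), 255.0)
black = np.zeros((1, 2, 2, 3))
for mode in ("caffe", "tf", "torch"):
    out = preprocess_sketch(white, mode)
    print(mode, out[0, 0, 0])
```

Since the VGG16 weights were trained with the 'caffe' convention, that is the mode to match when fine-tuning VGG16.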
Diorama answered 29/1, 2019 at 18:44 Comment(3)
Thank you for the answer. So the functions img_to_array and expand_dims are unnecessary when preprocessing the training set? - Delwyn
The ImageDataGenerator uses those functions under the hood to turn the images into NumPy arrays and organize them into batches, so you indeed don't need to call them yourself during training. - Diorama
So for tf.keras.applications.vgg16.VGG16, which preprocessing mode should we use: 'caffe' or 'tf'? - Louise
