Keras VGG16 preprocess_input modes

I'm using the Keras VGG16 model.

I've seen that there is a preprocess_input method meant to be used in conjunction with the VGG16 model. It appears to call the preprocess_input function in imagenet_utils.py, which (depending on the case) calls the _preprocess_numpy_input function in the same file.

The preprocess_input has a mode argument which expects "caffe", "tf", or "torch". If I'm using the model in Keras with TensorFlow backend, should I absolutely use mode="tf"?

If yes, is this because the VGG16 model loaded by Keras was trained on images that underwent the same preprocessing (i.e. rescaling the input image from the range [0, 255] to the range [-1, 1])?

Also, should the input images for testing mode also undergo this preprocessing? I'm confident the answer to the last question is yes, but I would like some reassurance.

I would expect Francois Chollet to have done it correctly, but looking at https://github.com/fchollet/deep-learning-models/blob/master/vgg16.py either he is or I am wrong about using mode="tf".

Updated info

@FalconUA directed me to the VGG page at Oxford, which has a Models section with links for the 16-layer model. The explanation for the preprocess_input mode argument (tf scaling to the range -1 to 1, caffe subtracting mean values) can be found by following the "information page" link for the 16-layer model in that section. Its Description section says:

"In the paper, the model is denoted as the configuration D trained with scale jittering. The input images should be zero-centered by mean pixel (rather than mean image) subtraction. Namely, the following BGR values should be subtracted: [103.939, 116.779, 123.68]."

Dedie answered 31/10, 2018 at 22:45 Comment(0)

The mode here is not about the backend, but rather about the framework the model was trained in and ported from. The Keras page for VGG16 states:

These weights are ported from the ones released by VGG at Oxford

So the VGG16 and VGG19 models were trained in Caffe and ported to TensorFlow, hence mode == 'caffe' here (inputs stay in the range 0 to 255, are converted from RGB to BGR, and then have the mean [103.939, 116.779, 123.68] subtracted).
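As a rough illustration of what 'caffe' mode amounts to, here is a minimal NumPy sketch. It assumes channels-last RGB input in the range 0 to 255. The helper name caffe_style_preprocess is made up for this example and is not part of Keras; the mean values are the BGR means quoted in the question.

```python
import numpy as np

# Per-channel ImageNet means in BGR order, as quoted from the VGG model description.
VGG_MEAN_BGR = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def caffe_style_preprocess(rgb_images):
    """Hypothetical helper: expects RGB values in [0, 255], channels last."""
    x = np.asarray(rgb_images, dtype=np.float32)
    x = x[..., ::-1]          # reorder channels RGB -> BGR
    return x - VGG_MEAN_BGR   # zero-center each channel; no rescaling

# A 1x1 "image" whose single pixel is pure red in RGB.
red = np.array([[[255.0, 0.0, 0.0]]])
caffe_out = caffe_style_preprocess(red)
print(caffe_out)  # channels are now B, G, R: roughly [-103.939, -116.779, 131.32]
```

Note that the values are not squeezed into [-1, 1]; they keep roughly the original 0 to 255 spread, shifted to be centered near zero.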

Newer networks, like MobileNet and ShuffleNet, were trained in TensorFlow, so the mode is 'tf' for them and the inputs are scaled to the range from -1 to 1, centered on zero.
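For comparison, 'tf' mode can be sketched in NumPy as a plain linear rescaling. This again assumes inputs in the range 0 to 255. The helper name tf_style_preprocess is hypothetical and only used for illustration.

```python
import numpy as np

def tf_style_preprocess(images):
    """Hypothetical helper: maps values in [0, 255] linearly onto [-1, 1]."""
    x = np.asarray(images, dtype=np.float32)
    return x / 127.5 - 1.0   # 0 -> -1, 127.5 -> 0, 255 -> 1

# One pixel holding the minimum, midpoint, and maximum input values.
pixels = np.array([[[0.0, 127.5, 255.0]]])
tf_out = tf_style_preprocess(pixels)
print(tf_out)  # [[[-1.  0.  1.]]]
```

Unlike the 'caffe' scheme, there is no channel reordering and no dataset-specific mean here.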

Gonadotropin answered 1/11, 2018 at 1:51 Comment(3)
Thanks! That's the info I was looking for. I guess I should have been more diligent in digging through the links. So, is it always the case that models trained in TensorFlow have inputs zero-centered in the range from -1 to 1? If yes, do you know why? Similarly, is it always the case that models trained in Caffe take inputs in the range 0 to 255 with the mean subtracted? I'm trying to figure out whether this just happens to be how the models Keras uses were trained, or whether it is suggested/recommended practice for the respective frameworks. - Dedie
@Dedie Well, it is not always the case; it is just a historical thing. Caffe was the most popular framework in 2012-2016, so almost everything was trained in Caffe back then, and since BatchNormalization was not a thing at that time, it is common to see input values of roughly -127.5 to 127.5 (VGG is indeed a very old network). - Gonadotropin
@Dedie The newer architectures use BatchNorm and similar techniques, and it was found to be generally good to keep the values around -1 to 1. So newer networks like Xception and MobileNet use that kind of normalization, and they were trained in the most popular frameworks, which are TensorFlow and PyTorch. - Gonadotropin

In my experience training VGG16 in Keras, the inputs should be in the range 0 to 255 with the mean [103.939, 116.779, 123.68] subtracted. I've tried transfer learning (freezing the bottom layers and stacking a classifier on top) with inputs scaled to the range -1 to 1, and the results were much worse than with 0..255 minus [103.939, 116.779, 123.68].
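One plausible way to see why the mismatch hurts is to compare what the two schemes feed into the frozen layers. The sketch below uses a synthetic random image rather than real data, and the variable names are illustrative only. It suggests that the per-channel spread of the two inputs differs by a constant factor of 127.5, on top of the channel order being reversed.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic RGB "image" with values in [0, 255]; a stand-in for real data.
img = rng.uniform(0, 255, size=(224, 224, 3)).astype(np.float32)

mean_bgr = np.array([103.939, 116.779, 123.68], dtype=np.float32)
caffe_in = img[..., ::-1] - mean_bgr   # BGR order, mean-subtracted, not rescaled
tf_in = img / 127.5 - 1.0              # RGB order, scaled to [-1, 1]

print("caffe-style range:", caffe_in.min(), caffe_in.max())
print("tf-style range:   ", tf_in.min(), tf_in.max())

# Per-channel standard deviation; flip the caffe channels back to RGB order
# so each channel is compared with its counterpart.
ratio = caffe_in.std(axis=(0, 1))[::-1] / tf_in.std(axis=(0, 1))
print("spread ratio per channel:", ratio)  # close to 127.5 for every channel
```

Frozen convolutional filters tuned to inputs with the larger spread would presumably produce much weaker activations on [-1, 1] inputs, which is consistent with the degraded results described above.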

Dire answered 1/11, 2018 at 1:55 Comment(0)

Trying to use VGG16 myself again lately, I had trouble getting decent results by just importing preprocess_input from vgg16 like this:

from keras.applications.vgg16 import VGG16, preprocess_input

Doing so, preprocess_input defaults to 'caffe' mode, but on a closer look at the Keras VGG16 code, I noticed that the weights file name

'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels.h5'

refers to TensorFlow twice. I think the preprocessing mode should be 'tf'.

processed_img = preprocess_input(img, mode='tf')
Erin answered 1/2, 2019 at 16:34 Comment(0)
