from keras.layers import Convolution2D, BatchNormalization
from keras import backend as K

def conv2d_bn(x, nb_filter, nb_row, nb_col,
              border_mode='same', subsample=(1, 1),
              name=None):
    '''Utility function to apply conv + BN.'''
    conv_name = name + '_conv' if name else None
    bn_name = name + '_bn' if name else None
    # channel axis: 1 for Theano dim ordering, 3 for TensorFlow
    bn_axis = 1 if K.image_dim_ordering() == 'th' else 3
    x = Convolution2D(nb_filter, nb_row, nb_col,
                      subsample=subsample,
                      activation='relu',
                      border_mode=border_mode,
                      name=conv_name)(x)
    x = BatchNormalization(axis=bn_axis, name=bn_name)(x)
    return x
When I use the official inception_v3 model in Keras, I find that they apply BatchNormalization after the 'relu' nonlinearity, as in the code above.
But in the Batch Normalization paper, the authors say:
"we add the BN transform immediately before the nonlinearity, by normalizing x = Wu + b."
Then I looked at the TensorFlow implementation of Inception, which adds BN immediately before the nonlinearity, exactly as the paper describes. For more details, see inception ops.py.
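If I read it correctly, the Slim-style layers used by the TensorFlow model compose it roughly like this (a minimal sketch assuming TF 1.x with tf.contrib.slim available; conv_bn_relu is just my own illustrative name):

import tensorflow as tf
slim = tf.contrib.slim

def conv_bn_relu(inputs, num_outputs, kernel_size):
    # slim.conv2d applies normalizer_fn between the convolution and
    # activation_fn, i.e. conv -> batch_norm -> relu
    return slim.conv2d(inputs, num_outputs, kernel_size,
                       normalizer_fn=slim.batch_norm,
                       activation_fn=tf.nn.relu)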
I'm confused. Why does the Keras code use the style above rather than the following?
from keras.layers import Activation

def conv2d_bn(x, nb_filter, nb_row, nb_col,
              border_mode='same', subsample=(1, 1),
              name=None):
    '''Utility function to apply conv + BN, with BN before the nonlinearity.'''
    conv_name = name + '_conv' if name else None
    bn_name = name + '_bn' if name else None
    bn_axis = 1 if K.image_dim_ordering() == 'th' else 3
    x = Convolution2D(nb_filter, nb_row, nb_col,
                      subsample=subsample,
                      border_mode=border_mode,
                      name=conv_name)(x)
    x = BatchNormalization(axis=bn_axis, name=bn_name)(x)
    x = Activation('relu')(x)
    return x
In the Dense case:
x = Dense(1024, name='fc')(x)
x = BatchNormalization(name='fc_bn')(x)  # default axis=-1 is the feature axis of the Dense output
x = Activation('relu')(x)
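For what it's worth, this is how I would wire that BN-before-activation variant into a small model (a minimal sketch with made-up layer sizes and names, just to show that the only change is dropping the layer-level activation and adding Activation('relu') after BatchNormalization):

from keras.layers import Input, Dense, BatchNormalization, Activation
from keras.models import Model

inputs = Input(shape=(2048,))          # e.g. pooled features before the classifier
x = Dense(1024, name='fc')(inputs)     # no activation here
x = BatchNormalization(name='fc_bn')(x)
x = Activation('relu', name='fc_relu')(x)
predictions = Dense(10, activation='softmax', name='predictions')(x)
model = Model(input=inputs, output=predictions)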