Quantization aware training in TensorFlow version 2 and BatchNorm folding

Asked 27/3, 2020 at 10:13 Answered 24/8, 2020 at 11:15

python tensorflow tensorflow2.0 batch-normalization quantization-aware-training

I'm wondering what the current available options are for simulating BatchNorm folding during quantization aware training in Tensorflow 2. Tensorflow 1 has the tf.contrib.quantize.create_training_graph function which inserts FakeQuantization layers into the graph and takes care of simulating batch normalization folding (according to this white paper).

Tensorflow 2 has a tutorial on how to use quantization in their recently adopted tf.keras API, but they don't mention anything about batch normalization. I tried the following simple example with a BatchNorm layer:

import tensorflow_model_optimization as tfmo

model = tf.keras.Sequential([
      l.Conv2D(32, 5, padding='same', activation='relu', input_shape=input_shape),
      l.MaxPooling2D((2, 2), (2, 2), padding='same'),
      l.Conv2D(64, 5, padding='same', activation='relu'),
      l.BatchNormalization(),    # BN!
      l.MaxPooling2D((2, 2), (2, 2), padding='same'),
      l.Flatten(),
      l.Dense(1024, activation='relu'),
      l.Dropout(0.4),
      l.Dense(num_classes),
      l.Softmax(),
])
model = tfmo.quantization.keras.quantize_model(model)

It however gives the following exception:

RuntimeError: Layer batch_normalization:<class 'tensorflow.python.keras.layers.normalization.BatchNormalization'> is not supported. You can quantize this layer by passing a `tfmot.quantization.keras.QuantizeConfig` instance to the `quantize_annotate_layer` API.

which indicates that TF does not know what to do with it.

I also saw this related topic where they apply tf.contrib.quantize.create_training_graph on a keras constructed model. They however don't use BatchNorm layers, so I'm not sure this will work.

So what are the options for using this BatchNorm folding feature in TF2? Can this be done from the keras API, or should I switch back to the TensorFlow 1 API and define a graph the old way?

Batrachian answered 27/3, 2020 at 10:13 Comment(0)

If you add BatchNormalization before activation, you would not have issues with Quantization. Note: Quantization is supported in BatchNormalization only if it the layer is exactly after Conv2D layer. https://www.tensorflow.org/model_optimization/guide/quantization/training

# Change
l.Conv2D(64, 5, padding='same', activation='relu'),
l.BatchNormalization(),    # BN!
# with this
l.Conv2D(64, 5, padding='same'),
l.BatchNormalization(),
l.Activation('relu'),

#Other way of declaring the same
o = (Conv2D(512, (3, 3), padding='valid' , data_format=IMAGE_ORDERING))(o)
o = (BatchNormalization())(o)
o = Activation('relu')(o)

Chromate answered 24/4, 2020 at 22:51 Comment(0)

You should apply the quantization annotation as in the instruction. I think you can call the BatchNorm now like this:

class DefaultBNQuantizeConfig(tfmot.quantization.keras.QuantizeConfig):

def get_weights_and_quantizers(self, layer):
    return []

def get_activations_and_quantizers(self, layer):
    return []

def set_quantize_weights(self, layer, quantize_weights):
    pass
def set_quantize_activations(self, layer, quantize_activations):
    pass
def get_output_quantizers(self, layer):
    return [tfmot.quantization.keras.quantizers.MovingAverageQuantizer(
    num_bits=8, per_axis=False, symmetric=False, narrow_range=False)]

def get_config(self):
    return {}

If you still want to quantize for the layer, change the return of the get_weights_and_quantizers to return [(layer.weights[i], LastValueQuantizer(num_bits=8, symmetric=True, narrow_range=False, per_axis=False)) for i in range(2)]. Then set back the quantizers to gamma,beta,... according to the indices of the return list above at set_quantize_weights. However, I am not encouraging this way as it surely harm the accuracy as BN should act as an activation quantization

The result you have would be like this (RESNET50):

Tectonics answered 24/8, 2020 at 11:15 Comment(0)

Recommended topics

Hot tags