Training quantized models in TensorFlow
I would like to train a quantized network, i.e. use quantized weights during the forward pass to calculate the loss, and then update the underlying full-precision floating-point weights during the backward pass.

Note that in my case "fake quantization" is sufficient. That means the weights can still be stored as 32-bit floating-point values, as long as they represent a low-bitwidth quantized value.
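To make this concrete, here is a rough sketch of the forward/backward behaviour I'm after, written as a straight-through estimator using tf.stop_gradient. The helper name quantize_ste and the choice of a uniform quantizer over the tensor's current range are my own, not an official API:

```python
import tensorflow as tf

def quantize_ste(w, num_bits=8):
    """Fake-quantize w in the forward pass, but let gradients flow
    straight through to the underlying float weights."""
    levels = 2 ** num_bits - 1
    w_min = tf.reduce_min(w)
    w_max = tf.reduce_max(w)
    # Uniform quantization to 2**num_bits levels over w's current range
    # (ignores the degenerate w_max == w_min case).
    scale = (w_max - w_min) / levels
    w_q = tf.round((w - w_min) / scale) * scale + w_min
    # Straight-through estimator: the forward pass sees w_q, while the
    # (w_q - w) term is excluded from the gradient, so the backward
    # pass updates the full-precision w directly.
    return w + tf.stop_gradient(w_q - w)
```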

In a blog post, Pete Warden states:

[...] we do have support for “fake quantization” operators. If you include these in your graphs at the points where quantization is expected to occur (for example after convolutions), then in the forward pass the float values will be rounded to the specified number of levels (typically 256) to simulate the effects of quantization.

The mentioned operators can be found in the TensorFlow API.
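As far as I can tell, basic usage looks something like this (the min, max, and num_bits values here are just placeholders I picked):

```python
import tensorflow as tf

x = tf.random_normal([4, 10])
# Round x to 2**8 = 256 levels spanning [min, max]; values outside the
# range are clipped. The op's gradient is the identity inside the range,
# so training still updates the float tensor underneath.
x_q = tf.fake_quant_with_min_max_args(x, min=-1.0, max=1.0, num_bits=8)
```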

Can anybody show me how to use these functions? If I call them after, e.g., a conv layer in my model definition, why would that quantize the layer's weights rather than its outputs (activations)?
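For illustration, here is the distinction I mean; as I understand it, where the op is inserted determines what gets quantized (the layer shapes and min/max ranges below are arbitrary):

```python
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 28, 28, 1])
w = tf.get_variable('w', [3, 3, 1, 32])

# Wrapping the *weights* fake-quantizes them before the convolution;
# the gradient still reaches the float variable w.
w_q = tf.fake_quant_with_min_max_args(w, min=-1.0, max=1.0, num_bits=8)
y = tf.nn.conv2d(x, w_q, strides=[1, 1, 1, 1], padding='SAME')

# Placing the op *after* the conv quantizes the activations instead,
# which is what "after convolutions" in the quote seems to describe.
y_q = tf.fake_quant_with_min_max_args(y, min=0.0, max=6.0, num_bits=8)
```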

asked 6/11, 2017 at 2:32
