These are the data types of the output tensor of the function tf.quantization.quantize(); they correspond to the function's argument T.
Shown below is the underlying formula, which converts/quantizes a tensor from one data type (e.g. float32) to another (tf.qint8, tf.quint8, tf.qint32, tf.qint16, tf.quint16):
out[i] = (in[i] - min_range) * range(T) / (max_range - min_range)
if T == qint8: out[i] -= (range(T) + 1) / 2.0
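For example, here is a minimal sketch of quantizing a float32 tensor to tf.qint8; the tensor values and the min/max range are purely illustrative:

    import tensorflow as tf

    # A float32 tensor to quantize; values and range chosen for illustration
    x = tf.constant([-1.0, 0.0, 1.0, 2.0], dtype=tf.float32)

    # Quantize to 8-bit signed integers; returns (output, output_min, output_max)
    q_vals, q_min, q_max = tf.quantization.quantize(
        x, min_range=-1.0, max_range=2.0, T=tf.qint8)

    print(q_vals.dtype)  # tf.qint8 -- each element occupies one byte
    print(q_vals)        # the quantized integer values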
The quantized tensor can then be passed to functions such as tf.nn.quantized_conv2d, whose inputs are quantized tensors as explained above.
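A rough sketch of that flow (the shapes and ranges here are assumptions for illustration; note that tf.nn.quantized_conv2d also needs the min/max ranges of both its input and its filter):

    import tensorflow as tf

    # Float input image and filter; shapes and value ranges are illustrative only
    inp = tf.random.uniform([1, 4, 4, 1], 0.0, 1.0)
    flt = tf.random.uniform([2, 2, 1, 1], 0.0, 1.0)

    # Quantize both to quint8 before the quantized convolution
    q_inp, in_min, in_max = tf.quantization.quantize(inp, 0.0, 1.0, tf.quint8)
    q_flt, f_min, f_max = tf.quantization.quantize(flt, 0.0, 1.0, tf.quint8)

    # Convolve in the quantized domain; the result is qint32 plus its range
    out, out_min, out_max = tf.nn.quantized_conv2d(
        q_inp, q_flt,
        min_input=in_min, max_input=in_max,
        min_filter=f_min, max_filter=f_max,
        strides=[1, 1, 1, 1], padding="SAME")

    print(out.dtype)  # tf.qint32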
TL;DR: to answer your question, yes, they are actually stored as 8 bits (for qint8) in memory.
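You can check the per-element storage size through the dtype's size attribute (in bytes); both the quantized type and the plain integer type occupy a single byte per element:

    import tensorflow as tf

    # Size of one element in bytes: qint8 and int8 both take one byte
    print(tf.qint8.size)  # 1
    print(tf.int8.size)   # 1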
You can find more information about this topic in the links below:
https://www.tensorflow.org/api_docs/python/tf/quantization/quantize
https://www.tensorflow.org/api_docs/python/tf/nn/quantized_conv2d
https://www.tensorflow.org/lite/performance/post_training_quantization
Comment: tf.qint8 vs tf.int8. Why is another data type needed? When should one use one over the other? – Acie