How to convert int32 numpy array into int16 numpy array?

Asked 2/4, 2019 at 11:46 Answered 8/12, 2022 at 11:55

I want to convert a numpy array from int32 type to int16 type.

I have an int32 array called array_int32 and I am converting that to int16.

import numpy as np
array_int32 = np.array([31784960, 69074944, 165871616])
array_int16 = array_int32.astype(np.int16)

After conversion, array_int16 turns into an array of zeros. I don't know what mistake I am making. Could anyone help me with this?

Fabria answered 2/4, 2019 at 11:46 Comment(2)

Your first array is int64. What values were you hoping to find in your second array given that a 16-bit integer can only manage up to 65,535 and all your entries exceed that? – Lenz 2/4, 2019 at 12:5

@MarkSetchell I am trying to convert audio between different bit depths. – Fabria 2/4, 2019 at 12:16

You could discard the bottom 16 bits:

n=(array_int32>>16).astype(np.int16)

which will give you this:

array([ 485, 1054, 2531], dtype=int16)
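
The shift above can be wrapped in a small helper. This is only a sketch: the function name `int32_to_int16_by_shift` is hypothetical, the input is forced to int32 explicitly (an assumption, since the default integer type depends on the platform), and a negative sample is added to show that signed values are handled too.

```python
import numpy as np

def int32_to_int16_by_shift(samples):
    """Keep the top 16 bits of each 32-bit sample (hypothetical helper)."""
    samples = np.asarray(samples, dtype=np.int32)
    # Shifting a signed int32 right by 16 divides by 65536 (rounding down),
    # so every possible int32 value ends up inside the int16 range.
    return (samples >> 16).astype(np.int16)

array_int32 = np.array([31784960, 69074944, 165871616, -31784960], dtype=np.int32)
array_int16 = int32_to_int16_by_shift(array_int32)
print(array_int16.tolist())  # [485, 1054, 2531, -485]
```

Because the full int32 range maps onto the full int16 range, relative loudness is preserved, which is usually what is wanted when reducing audio bit depth.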

Lenz answered 2/4, 2019 at 13:17 Comment(0)

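The numbers in your array_int32 are too large to be represented with 16 bits (a signed 16-bit integer can only represent a maximum value of 2^15-1=32767). By default, numpy does not raise an error in this case; the cast keeps only the low 16 bits of each value. Each of your numbers happens to be an exact multiple of 2^16=65536, so its low 16 bits are all zero and the result is an array of zeros.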
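
A quick check suggests that the zeros come from wrap-around rather than from any special handling. The sketch below assumes an explicit `dtype=np.int32` for the input, and the value 70000 is an extra sample chosen only for illustration.

```python
import numpy as np

array_int32 = np.array([31784960, 69074944, 165871616], dtype=np.int32)

# Every value here is an exact multiple of 2**16, so its low 16 bits are zero.
low_16_bits = array_int32 & 0xFFFF
print(low_16_bits.tolist())  # [0, 0, 0]

# The default cast (casting='unsafe') keeps only those low bits.
print(array_int32.astype(np.int16).tolist())  # [0, 0, 0]

# A value with non-zero low bits wraps around instead of becoming zero:
# 70000 - 65536 = 4464.
wrapped = np.array([70000], dtype=np.int32).astype(np.int16)
print(wrapped.tolist())  # [4464]
```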

This behavior can be modified by changing the optional casting argument of astype. The documentation states:

Starting in NumPy 1.9, astype method now returns an error if the string dtype to cast to is not long enough in ‘safe’ casting mode to hold the max value of integer/float array that is being casted. Previously the casting was allowed even if the result was truncated.

So, adding the requirement casting='safe' results in a TypeError, because the maximum value of the original 32-bit (or 64-bit) type is too large for the 16-bit target type, e.g.

import numpy as np
array_int32 = np.array([31784960, 69074944, 165871616])
array_int16 = array_int32.astype(np.int16, casting='safe')

results in

TypeError: Cannot cast array from dtype('int64') to dtype('int16') according to the rule 'safe'
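
If raising an exception is not desired, the same question can be asked up front. The following is a minimal sketch, assuming an int32 input; it separates the type-level rule that `casting='safe'` applies from a check on the actual values, using `np.can_cast` and `np.iinfo`.

```python
import numpy as np

array_int32 = np.array([31784960, 69074944, 165871616], dtype=np.int32)
info = np.iinfo(np.int16)

# Type-level check: int32 -> int16 is never a 'safe' cast, whatever the values.
type_is_safe = np.can_cast(np.int32, np.int16, casting="safe")

# Value-level check: do the actual numbers fit into [-32768, 32767]?
values_fit = bool(array_int32.min() >= info.min and array_int32.max() <= info.max)

print(type_is_safe, values_fit)  # False False
```

Note that 'safe' casting rejects the conversion based on the types alone, so it would also fail for an int32 array whose values all fit into 16 bits; the value-level check is the one that tells you whether data would actually be lost.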

Attenuant answered 2/4, 2019 at 12:5 Comment(2)

However, on my computer the array is shown as int32. Anyway, isn't there any way to do this conversion? – Fabria 2/4, 2019 at 12:13

As your numbers cannot be represented using 16 bits, there is no meaningful way to do this conversion. – Attenuant 3/4, 2019 at 14:33

As has been pointed out by the user jdamp, the numbers are too large to be represented as 16-bit integer values. I don't know the context of your question, but it may be useful to know that a simple rescaling of the numbers can be done.

import math
import numpy as np

def scale_to(x, x_min, x_max, t_min, t_max):
    """
    Scales x to lie between t_min and t_max
    Links:
         https://stats.stackexchange.com/questions/281162/scale-a-number-between-a-range
         https://stats.stackexchange.com/questions/178626/how-to-normalize-data-between-1-and-1
    """
    r = x_max - x_min
    r_t = t_max - t_min
    # Guard against division by zero when all input values are identical
    assert not math.isclose(0, r, abs_tol=np.finfo(float).eps)
    x_s = r_t * (x - x_min) / r + t_min
    return x_s

A conversion of these rather large values into a 16-bit format would then look like this:

array_float = np.array([31784960.12, 69074944.12, 165871616.34])
scaled_array = scale_to(array_float, np.min(array_float), np.max(array_float), -32768, 32767)
array_int16 = scaled_array.astype(np.int16)

The values -32768 and 32767 are the smallest and largest values that can be represented by a signed 16-bit integer. After scaling, they correspond to the min and max values of your input array, and all other values are scaled in between. Only then, as a final step, is the type cast done (astype truncates the fractional part). So, the resulting output for the values above will look like this:

array_int16

array([-32768, -14542, 32767], dtype=int16)

Please note that I changed the input to floating point values just to show that this also works with floats.

The numbers can be scaled back to nearly their original value if we remember the min and max values of the original array.

def scale_inv(x_s, x_min, x_max, t_min, t_max):
    """
    Inverse scaling
    Links:
        https://stats.stackexchange.com/questions/281162/scale-a-number-between-a-range
        https://stats.stackexchange.com/questions/178626/how-to-normalize-data-between-1-and-1
    """
    r = x_max - x_min
    r_t = t_max - t_min
    # Guard against division by zero when the target range is empty
    assert not math.isclose(0, r_t, abs_tol=np.finfo(float).eps)
    x = (x_s - t_min) * r / r_t + x_min
    return x

inv = scale_inv(array_int16.astype(float), np.min(array_float), np.max(array_float), -32768.0, 32767.0)

The last line gives us back the original values, up to the error introduced by quantizing to 16 bits:

array([3.17849601e+07, 6.90759252e+07, 1.65871616e+08])

The original values were: 31784960.12, 69074944.12, 165871616.34 (as seen above in the code)
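
To get a feeling for how large that error can be, the round trip can be checked against the size of one int16 step. The sketch below inlines the same scaling formulas so that it runs on its own; the names `T_MIN`, `T_MAX`, `step` and `max_error` are introduced here only for illustration.

```python
import numpy as np

T_MIN, T_MAX = -32768.0, 32767.0

array_float = np.array([31784960.12, 69074944.12, 165871616.34])
x_min, x_max = array_float.min(), array_float.max()

# Forward scaling followed by the truncating cast to int16.
scaled = (T_MAX - T_MIN) * (array_float - x_min) / (x_max - x_min) + T_MIN
array_int16 = scaled.astype(np.int16)

# Inverse scaling back to the original range.
restored = (array_int16.astype(float) - T_MIN) * (x_max - x_min) / (T_MAX - T_MIN) + x_min

# One int16 step corresponds to this many units of the original data.
step = (x_max - x_min) / (T_MAX - T_MIN)
max_error = np.abs(restored - array_float).max()
print(step, max_error)
```

For this input one step is roughly 2046 units, and the truncating cast keeps the round-trip error below one step. Rounding before the cast (for example with `np.rint`) instead of truncating would reduce the worst case to about half a step.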

This may be useful, for example, in audio file conversions, depending on your context. (If a simple rescaling with type casting is not what you are looking for, then you may need to look at resampling.)

Keep in mind, though, that some loss of information is always involved and unavoidable, for the reason given by jdamp. As an analogy: you usually cannot squeeze the contents of a full large box into a smaller box.

P.S.: For scaling, see in particular this link on Stack Exchange: min-max-scaler. Another link is given in the comments of the code.

Rattlebrain answered 8/12, 2022 at 11:55 Comment(0)
