Python conversion of PIL image to numpy array very slow
I am evaluating a TensorFlow model on OpenCV video frames. I need to convert each incoming PIL image into a reshaped NumPy array so that I can run inference on it. But I see that the conversion of the PIL image to a NumPy array takes around 900+ milliseconds on my laptop (16 GiB memory, 2.6 GHz Intel Core i7). I need to get this down to a few milliseconds so that I can process multiple frames per second from my camera.

Can anyone suggest how to make the below method run faster?

def load_image_into_numpy_array(pil_image):
    (im_width, im_height) = pil_image.size
    data = pil_image.getdata()   # sequence of per-pixel (R, G, B) tuples

    data_array = np.array(data)  # this call dominates the runtime

    return data_array.reshape((im_height, im_width, 3)).astype(np.uint8)

On further instrumentation I realized that np.array(data) is taking the bulk of the time, close to 900+ milliseconds. So the conversion of the image data to a NumPy array is the real culprit.

Forecast answered 22/9, 2018 at 4:18 Comment(4)
At which step are you getting a PIL image?Stickseed
I'm getting the PIL image earlier and passing it to this function.Forecast
How big is the image?Snug
(720, 1280, 3)Forecast

You can just let numpy handle the conversion instead of reshaping yourself.

def pil_image_to_numpy_array(pil_image):
    return np.asarray(pil_image)  

You are converting the image into (height, width, channel) format. That is the default conversion numpy.asarray performs on a PIL image, so explicit reshaping should not be necessary.
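A minimal sketch (using a hypothetical blank 720p frame rather than a real camera capture) that contrasts the two conversion paths and checks they produce the same array:

```python
import time

import numpy as np
from PIL import Image

# Hypothetical stand-in for a 1280x720 camera frame.
img = Image.new("RGB", (1280, 720))

t0 = time.perf_counter()
# Slow path: getdata() materializes a Python sequence of pixel tuples,
# which np.array then has to walk element by element.
slow = np.array(img.getdata()).reshape((720, 1280, 3)).astype(np.uint8)
t1 = time.perf_counter()
# Fast path: asarray reads the image's pixel buffer directly.
fast = np.asarray(img)
t2 = time.perf_counter()

print(f"getdata path: {(t1 - t0) * 1000:.1f} ms")
print(f"asarray path: {(t2 - t1) * 1000:.1f} ms")

# Same contents, same (height, width, channels) layout.
assert np.array_equal(slow, fast)
```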

Stickseed answered 22/9, 2018 at 4:38 Comment(6)
Thanks for the solution. The reshaping is needed so that I can get the image in the right shape for inference... Check the load_image_into_numpy_array method in github.com/tensorflow/models/blob/master/research/…Forecast
I am not sure why the explicit conversion is necessary; you are converting the image into (height, width, channel) format. When you use the default NumPy method to convert a PIL image into a NumPy array, you already get it in (height, width, channel). Sorry if I am missing something here.Stickseed
Oh okay... thanks for the information. I was simply following what the TensorFlow folks are doing... Btw np.asarray(pil_image) is super fast... it takes only 1 ms...Forecast
Interestingly, when I print pil_image.size I get (1280, 720), but when I print np.asarray(pil_image).shape I get (720, 1280, 3)... I wonder if I got the height and width reversed, which could lead to incorrect resultsForecast
PIL uses (width, height) order pillow.readthedocs.io/en/3.1.x/reference/Image.html#attributes, whereas NumPy uses (height, width), as mentioned here #43273348.Stickseed
About 50x faster than whatever I was doing before!Nutwood
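The axis-order difference discussed in the comments can be seen directly (again with a hypothetical in-memory image):

```python
import numpy as np
from PIL import Image

img = Image.new("RGB", (1280, 720))  # PIL constructor takes (width, height)

print(img.size)               # (1280, 720): PIL reports (width, height)
print(np.asarray(img).shape)  # (720, 1280, 3): NumPy uses (height, width, channels)
```

Note that on a NumPy array, `.size` is the total element count; `.shape` is what shows the axis order.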

Thank you very much!! It works very fast!

import numpy as np
import tensorflow as tf
from io import BytesIO
from PIL import Image

def load_image_into_numpy_array(path):
    """Load an image from file into a numpy array.

    Puts image into numpy array to feed into tensorflow graph.
    Note that by convention we put it into a numpy array with shape
    (height, width, channels), where channels=3 for RGB.

    Args:
      path: a file path (this can be local or on colossus)

    Returns:
      uint8 numpy array with shape (img_height, img_width, 3)
    """
    img_data = tf.io.gfile.GFile(path, 'rb').read()
    image = Image.open(BytesIO(img_data))

    return np.array(image)

An image with shape (3684, 4912, 3) takes 0.3~0.4 sec.

Jaxartes answered 7/9, 2021 at 6:51 Comment(0)
