How to pass base64 encoded image to Tensorflow prediction?

Asked 22/12, 2017 at 14:30 Answered 10/1, 2019 at 9:30

I have a google-cloud-ml model that I can run prediction by passing a 3 dimensional array of float32...

{ 'instances' [ { 'input' : '[ [ [ 0.0 ], [ 0.5 ], [ 0.8 ] ] ... ] ]' } ] }

However this is not an efficient format to transmit images, so I'd like to pass base64 encoded png or jpeg. This document talks about doing that, but what is not clear is what the entire json object looks like. Does the { 'b64' : 'x0welkja...' } go in place of the '[ [ [ 0.0 ], [ 0.5 ], [ 0.8 ] ] ... ] ]', leaving the enclosing 'instances' and 'input' the same? Or some other structure? Or does the tensorflow model have to be trained on base64?

Palmieri answered 22/12, 2017 at 14:30 Comment(0)

The TensorFlow model does not have to be trained on base64 data. Leave your training graph as is. However, when exporting the model, you'll need to export a model that can accept PNG or jpeg (or possibly raw, if it's small) data. Then, when you export the model, you'll need to be sure to use a name for the output that ends in _bytes. This signals to CloudML Engine that you will be sending base64 encoded data. Putting it all together would like something like this:

from tensorflow.contrib.saved_model.python.saved_model import utils

# Shape of [None] means we can have a batch of images.
image = tf.placeholder(shape = [None], dtype = tf.string)
# Decode the image.
decoded = tf.image.decode_jpeg(image, channels=3)
# Do the rest of the processing.
scores = build_model(decoded)

# The input name needs to have "_bytes" suffix.
inputs = { 'image_bytes': image }
outputs = { 'scores': scores }
utils.simple_save(session, export_dir, inputs, outputs)

The request you send will look something like this:

{
    "instances": [{
        "b64": "x0welkja..."
    }]
}

Cornhusk answered 22/12, 2017 at 15:56 Comment(4)

Using a Keras model and hosting on GML. I can decode bytes and pass to the model for prediction in python easily, but when using GML for inference, how to 'interject' to do decoding? Do you add a layer to the model at the beginning? Do you use a lambda function? I'm confused by your code sending the result of 'build_model' (presumably a function calling tf's SavedModelBuilder) passing that to 'scores' output value? I found a similar question here. – Palmieri 26/12, 2017 at 21:24

@Cornhusk I'm following the similar method for my model but I'm unable to figure out modifying the input tensor. Here is the question. Any ideas? – Doubtless 6/7, 2018 at 6:23

What if the model has two images as input?? for example : {'feature_1' : "iVBORw0KGgoAAAAN...", 'feature_2' : "iVBORw0KGgoAAAAN....."} – Cholon 23/8, 2018 at 9:8

@AvinashRai the request would be {"instances": [{'feature_1' : "iVBORw0KGgoAAAAN...", "feature_2" : "iVBORw0KGgoAAAAN....."}]} – Cornhusk 23/8, 2018 at 14:17

If you just want an efficient way to send images to a model (and not necessarily base-64 encode it), I would suggest uploading your images(s) to Google Cloud Storage and then having your model read off GCS. This way, you are not limited by image size and you can take advantage of multi-part, multithreaded, resumable uploads etc. that the GCS API provides.

TensorFlow's tf.read_file will directly off GCS. Here's an example of a serving input_fn that will do this. Your request to CMLE would send it an image URL (gs://bucket/some/path/to/image.jpg)

def read_and_preprocess(filename, augment=False):
    # decode the image file starting from the filename
    # end up with pixel values that are in the -1, 1 range
    image_contents = tf.read_file(filename)
    image = tf.image.decode_jpeg(image_contents, channels=NUM_CHANNELS)
    image = tf.image.convert_image_dtype(image, dtype=tf.float32) # 0-1
    image = tf.expand_dims(image, 0) # resize_bilinear needs batches
    image = tf.image.resize_bilinear(image, [HEIGHT, WIDTH], align_corners=False)
    #image = tf.image.per_image_whitening(image)  # useful if mean not important
    image = tf.subtract(image, 0.5)
    image = tf.multiply(image, 2.0) # -1 to 1
    return image

def serving_input_fn():
    inputs = {'imageurl': tf.placeholder(tf.string, shape=())}
    filename = tf.squeeze(inputs['imageurl']) # make it a scalar
    image = read_and_preprocess(filename)
    # make the outer dimension unknown (and not 1)
    image = tf.placeholder_with_default(image, shape=[None, HEIGHT, WIDTH, NUM_CHANNELS])

    features = {'image' : image}
    return tf.estimator.export.ServingInputReceiver(features, inputs)

Your training code will train off actual images, just as in rhaertel80's suggestion above. See https://github.com/GoogleCloudPlatform/training-data-analyst/blob/master/courses/machine_learning/deepdive/08_image/flowersmodel/trainer/task.py#L27 for what the training/evaluation input functions would look like.

Dormer answered 23/12, 2017 at 18:47 Comment(1)

I'm following the similar method for my model but I'm unable to figure out modifying the input tensor. Here is the question. Any ideas? – Doubtless 8/7, 2018 at 12:12

I was trying to use @Lak's answer (thanks Lak) to get online predictions for multiple instances in one json file, but kept getting the following error (I had two instances in my test json, hence the shape [2]):

input filename tensor must be scalar but had shape [2]

The problem is that ML engine apparently batches all the instances together and passes them to the serving inpur receiver function, but @Lak's sample code assumes the input is a single instance (it indeed works fine if you have a single instance in your json). I altered the code so that it can process a batch of inputs. I hope it will help someone:

def read_and_preprocess(filename):
    image_contents = tf.read_file(filename)
    image = tf.image.decode_image(image_contents, channels=NUM_CHANNELS)
    image = tf.image.convert_image_dtype(image, dtype=tf.float32) # 0-1
    return image

def serving_input_fn():
    inputs = {'imageurl': tf.placeholder(tf.string, shape=(None))}
    filename = inputs['imageurl']
    image = tf.map_fn(read_and_preprocess, filename, dtype=tf.float32)
    # make the outer dimension unknown (and not 1)
    image = tf.placeholder_with_default(image, shape=[None, HEIGHT, WIDTH, NUM_CHANNELS])

    features = {'image': image}
    return tf.estimator.export.ServingInputReceiver(features, inputs)

The key changes are that 1) you don't squeeze the input tensor (that would cause trouble in the special case when your json contains only one instance) and, 2) use tf.map_fn to apply the read_and_preprocess function to a batch of input image urls.

Heterodyne answered 10/1, 2019 at 9:30 Comment(1)

Cool. I would suggest that you use tf.contrib.data.map_and_batch which will let you avoid overwhelming the service -- batch the inputs you get into 20 or so, and if you are serving on a machine with multiple cores, you can specify the number of parallel batches. – Dormer 11/1, 2019 at 22:58

Recommended topics

Hot tags