How do I get a TensorFlow/Keras model that takes images as input to serve predictions on Cloud ML Engine?
There are multiple existing questions that try to address how to handle image data when serving predictions for TensorFlow/Keras models on Cloud ML Engine.

Unfortunately, some of the answers are out-of-date and none of them comprehensively addresses the problem. The purpose of this post is to provide a comprehensive, up-to-date answer for future reference.

Rickyrico answered 19/7, 2018 at 22:33 Comment(0)

This answer is going to focus on Estimators, which are high-level APIs for writing TensorFlow code and currently the recommended way. In addition, Keras uses Estimators to export models for serving.

This answer is going to be divided into two parts:

  1. How to write the input_fn.
  2. Client code for sending requests once the model is deployed.

How to Write the input_fn

The exact details of your input_fn will depend on your unique requirements. For instance, you may do image decoding and resizing client side, you might use JPG vs. PNG, you may expect a specific size of image, you may have additional inputs besides images, etc. We will focus on a fairly general approach that accepts various image formats at a variety of sizes. Thus, the following generic code should be fairly easy to adapt to any of the more specific scenarios.

import tensorflow as tf

HEIGHT = 199
WIDTH = 199
CHANNELS = 1

def serving_input_receiver_fn():

  def decode_and_resize(image_str_tensor):
    """Decodes a JPEG string, resizes it, and returns a uint8 tensor."""
    image = tf.image.decode_jpeg(image_str_tensor, channels=CHANNELS)
    image = tf.expand_dims(image, 0)
    image = tf.image.resize_bilinear(
        image, [HEIGHT, WIDTH], align_corners=False)
    image = tf.squeeze(image, axis=[0])
    image = tf.cast(image, dtype=tf.uint8)
    return image

  # Optional; currently necessary for batch prediction.
  key_input = tf.placeholder(tf.string, shape=[None])
  key_output = tf.identity(key_input)

  input_ph = tf.placeholder(tf.string, shape=[None], name='image_binary')
  images_tensor = tf.map_fn(
      decode_and_resize, input_ph, back_prop=False, dtype=tf.uint8)
  images_tensor = tf.image.convert_image_dtype(images_tensor, dtype=tf.float32)

  return tf.estimator.export.ServingInputReceiver(
      {'images': images_tensor},
      {'bytes': input_ph})

If you've saved your Keras model and would like to convert it to a SavedModel, use the following:

import tensorflow as tf

KERAS_MODEL_PATH = '/path/to/model'
MODEL_DIR = '/path/to/store/checkpoints'
EXPORT_PATH = '/path/to/store/savedmodel'

# If you are invoking this from your training code, use `keras_model=model` instead.
estimator = tf.keras.estimator.model_to_estimator(
    keras_model_path=KERAS_MODEL_PATH,
    model_dir=MODEL_DIR)
estimator.export_savedmodel(
    EXPORT_PATH,
    serving_input_receiver_fn=serving_input_receiver_fn)

Sending Requests (Client Code)

The body of the requests sent to the service will look like the following:

{
  "instances": [
    {"bytes": {"b64": "<base64 encoded image>"}},  # image 1
    {"bytes": {"b64": "<base64 encoded image>"}}   # image 2 ...        
  ]
}
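As a concrete illustration (not part of the original answer; the image bytes below are stand-ins rather than real JPEG data), a client could build this request body with the standard library alone:

```python
import base64
import json

def build_request_body(image_bytes_list):
    """Builds the request body shown above from raw image bytes.

    Each image becomes {"bytes": {"b64": "<base64 string>"}}. Note that
    base64.b64encode returns bytes in Python 3, so we decode to str so
    the result is JSON-serializable.
    """
    instances = [
        {"bytes": {"b64": base64.b64encode(img).decode("utf-8")}}
        for img in image_bytes_list
    ]
    return json.dumps({"instances": instances})

# Stand-in bytes; a real client would read JPEG files instead.
body = build_request_body([b"\xff\xd8fake-jpeg-1", b"\xff\xd8fake-jpeg-2"])
parsed = json.loads(body)
```

The `{"b64": ...}` wrapper is what tells the service to base64-decode the value before feeding it to the `bytes` input of the model.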

You can test your model / requests out locally before deploying to speed up the debugging process. For this, we'll use gcloud ml-engine local predict. However, before we do that, please note that gcloud's data format is a slight transformation of the request body shown above: gcloud treats each line of the input file as an instance/image and then constructs the JSON request from those lines. So instead of the above request, we will instead have:

{"bytes": {"b64": "<base64 encoded image>"}}
{"bytes": {"b64": "<base64 encoded image>"}}

gcloud will transform this file into the request above. Here is some example Python code that can produce a file suitable for use with gcloud:

import base64
import sys

for filename in sys.argv[1:]:
  with open(filename, 'rb') as f:
    img_data = f.read()
    # b64encode returns bytes in Python 3; decode to str before printing.
    print('{"bytes": {"b64": "%s"}}' % (base64.b64encode(img_data).decode('utf-8'),))

(Let's call this file to_instances.py)

To test the model with predictions:

python to_instances.py img1.jpg img2.jpg > instances.json
gcloud ml-engine local predict --model-dir /path/to/model --json-instances=instances.json
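Before invoking gcloud, it can save a debugging round trip to verify that every line of instances.json is a standalone JSON object with the expected shape and valid base64. A small helper sketch (not part of the original answer):

```python
import base64
import json

def validate_instances(lines):
    """Checks each line is {"bytes": {"b64": "<base64>"}} and decodable.

    Raises (KeyError, ValueError, binascii.Error) on the first bad line;
    returns True if every line is well-formed.
    """
    for line in lines:
        obj = json.loads(line)
        b64 = obj["bytes"]["b64"]
        base64.b64decode(b64, validate=True)  # raises on malformed base64
    return True

# Sample line in the same format to_instances.py produces:
sample = ['{"bytes": {"b64": "%s"}}' % base64.b64encode(b"img-data").decode("utf-8")]
ok = validate_instances(sample)
```

In practice you would pass `open('instances.json')` instead of the sample list.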

After we've finished debugging, we can deploy the model to the cloud using gcloud ml-engine models create and gcloud ml-engine versions create as described in the documentation.

At this point, you can use your desired client to send requests to your model on the service. Note that this will require an authentication token. We'll examine a few examples in various languages. In each case, we'll assume your model is called my_model.

gcloud

This is pretty close to the same as local predict:

python to_instances.py img1.jpg img2.jpg > instances.json
gcloud ml-engine predict --model my_model --json-instances=instances.json    

curl

We'll need a script like to_instances.py to convert images; let's call it to_payload.py:

import base64
import json
import sys

instances = []
for filename in sys.argv[1:]:
  with open(filename, 'rb') as f:
    img_data = f.read()
    # Each instance must match the request body format shown earlier.
    instances.append({"bytes": {"b64": base64.b64encode(img_data).decode('utf-8')}})
print(json.dumps({"instances": instances}))

python to_payload.py img1.jpg img2.jpg > payload.json

curl -m 180 -X POST -v -k -H "Content-Type: application/json" \
    -d @payload.json \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    https://ml.googleapis.com/v1/projects/${YOUR_PROJECT}/models/my_model:predict

Python

import base64
import googleapiclient.discovery

PROJECT = "my_project"
MODEL = "my_model"

img_data = ...  # your client will have its own way to get image data.

# Create the ML Engine service object.
# To authenticate, set the environment variable
# GOOGLE_APPLICATION_CREDENTIALS=<path_to_service_account_file>
service = googleapiclient.discovery.build('ml', 'v1')
name = 'projects/{}/models/{}'.format(PROJECT, MODEL)

response = service.projects().predict(
    name=name,
    body={'instances': [{'bytes': {'b64': base64.b64encode(img_data).decode('utf-8')}}]}
).execute()

if 'error' in response:
    raise RuntimeError(response['error'])

print(response['predictions'])

Javascript/Java/C#

Sending requests in Javascript/Java/C# is covered elsewhere (Javascript, Java, C#, respectively) and those examples should be straightforward to adapt.

Rickyrico answered 19/7, 2018 at 22:33 Comment(5)
You are currently not handling the instance keys. Also the variable image_str_tensor seems to do nothing.Carnarvon
@pierrom It looks like this has been addressed.Rickyrico
Hi, thanks for your answer. Is there a way to use tensorflow.keras.applications model specific preprocess_input functions within serving_input_receiver_fn? Also does there happen to be a way to support both lists of np arrays and b64 at the same time?Sufferance
I apologize, but I'm not familiar with tensorflow.keras.applications. I recommend starting a new SO question. You should be able to send np arrays and b64 at the same time, but you may want to open another question for that, as well.Rickyrico
@Rickyrico I managed to get the model running in GCloud with this tutorial, but all my predictions are quite similar and always point to the same class (if I run my model before converting it to an estimator it gives proper predictions). Any idea what the issue might be? I checked the b64 images by converting them back and they are fine. I am using a model based on resnet50 in case that adds some information. Thank youSmitty

The answer by @rhaertel above is the best treatment of this subject I've seen. For anyone working on deploying TensorFlow image-based models on Google Cloud ML, I'd recommend also having a look at the following repo:

https://github.com/mhwilder/tf-keras-gcloud-deployment.

I spent a while trying to get all of this working for several use cases and did my best to document the whole process in this repo. The repo covers the following topics:

  1. Training a fully convolutional tf.keras model locally (mostly just to have a model for testing the next parts)
  2. Example code for exporting models that work with the Cloud ML Engine
  3. Three model versions that accept different JSON input types (1. An image converted to a simple list string, 2. An image converted to a base64 encoded string, and 3. A URL that points to an image in a Google Storage bucket)
  4. Instructions and references for general Google Cloud Platform setup
  5. Code for preparing the input JSON files for the 3 different input types
  6. Google Cloud ML model and version creation instructions from the console
  7. Examples using the Google Cloud SDK to call predict on the models
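To make item 3 above concrete, the three instance shapes might look roughly like the following sketch (not taken from the repo; the input tensor names `image`, `bytes`, and `image_url`, and the bucket path, are assumptions that depend on how each model version's serving_input_receiver_fn is written):

```python
import base64
import json

# 1. Image flattened into a plain list of pixel values (tiny toy "image" here).
list_instance = {"image": [[0, 0, 0], [255, 255, 255]]}

# 2. Image sent as a base64-encoded string (the format used throughout this thread).
b64_instance = {"bytes": {"b64": base64.b64encode(b"fake-jpeg-bytes").decode("utf-8")}}

# 3. A URL pointing at an image in a Google Storage bucket (hypothetical path).
url_instance = {"image_url": "gs://my-bucket/images/img1.jpg"}

# Any of the three goes into the same "instances" envelope:
request_body = json.dumps({"instances": [b64_instance]})
```

Each shape requires a matching serving input function on the model side, which is why the repo ships three model versions rather than one.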
Resistant answered 24/9, 2018 at 20:6 Comment(3)
Thanks, your repo saved me a lot of headache. One question: when I define the estimator with tf.keras.estimator.model_to_estimator, tensorflow stores all the files under a subdir /keras of model_dir (/models/tf in your example). Consequently, I get an error message when calling export_savedmodel saying that no files were found, and I need to move the files manually to get it to work. Did you encounter the same issue?Brookins
@Brookins I didn't encounter this issue when I was experimenting with this. I'm not sure why it is different for you. Glad you were able to make it work though and that the repo helped you.Resistant
I documented the issue here, together with my workaround: #54616208 Perhaps I'm missing something obvious here.Brookins
