Deploying Keras model to Google Cloud ML for serving predictions

I need to understand how to deploy models on Google Cloud ML. My first task is to deploy a very simple text classifier on the service. I do so in the following steps (which could perhaps be shortened to fewer; if so, feel free to let me know):

  1. Define the model using Keras and export to YAML
  2. Load up YAML and export as a Tensorflow SavedModel
  3. Upload model to Google Cloud Storage
  4. Deploy model from storage to Google Cloud ML
  5. Set the uploaded model version as the default on the models website.
  6. Run model with a sample input

I've finally made steps 1-5 work, but now I get the strange error shown below when running the model. Can anyone help? Details on the steps are below; hopefully they can also help others who are stuck on one of the previous steps. My model works fine locally.

I've seen Deploying Keras Models via Google Cloud ML and Export a basic Tensorflow model to Google Cloud ML, but they seem to be stuck on other steps of the process.

Error

Prediction failed: Exception during model execution: AbortionError(code=StatusCode.INVALID_ARGUMENT, details="In[0] is not a matrix
         [[Node: MatMul = MatMul[T=DT_FLOAT, _output_shapes=[[-1,64]], transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/cpu:0"](Mean, softmax_W/read)]]")

Step 1

from keras.layers import Input, Embedding, GlobalAveragePooling1D, Dense
from keras.models import Model
# maxlen, nb_tokens and nb_classes are assumed to be defined elsewhere
model_input = Input(shape=(maxlen,), dtype='int32')
embed = Embedding(input_dim=nb_tokens,
                  output_dim=256,
                  mask_zero=False,
                  input_length=maxlen,
                  name='embedding')
x = embed(model_input)
x = GlobalAveragePooling1D()(x)
outputs = [Dense(nb_classes, activation='softmax', name='softmax')(x)]
model = Model(input=[model_input], output=outputs, name="fasttext")
# export the architecture to YAML (file name is arbitrary)
with open('fasttext.yaml', 'w') as f:
    f.write(model.to_yaml())

Step 2

from __future__ import print_function

import os

import tensorflow as tf
from keras import backend as K
from keras.models import model_from_yaml
from optparse import OptionParser

EXPORT_VERSION = 1 # for us to keep track of different model versions (integer)

def export_model(model_def, model_weights, export_path):

    with tf.Session() as sess:
        init_op = tf.global_variables_initializer()
        sess.run(init_op)

        K.set_learning_phase(0)  # all new operations will be in test mode from now on

        yaml_file = open(model_def, 'r')
        yaml_string = yaml_file.read()
        yaml_file.close()

        model = model_from_yaml(yaml_string)

        # force initialization
        model.compile(loss='categorical_crossentropy',
                      optimizer='adam') 
        Wsave = model.get_weights()
        model.set_weights(Wsave)

        # weights are not loaded as I'm just testing, not really deploying
        # model.load_weights(model_weights)   

        print(model.input)
        print(model.output)

        pred_node_names = output_node_names = 'Softmax:0'
        num_output = 1

        export_path_base = export_path
        export_path = os.path.join(
            tf.compat.as_bytes(export_path_base),
            tf.compat.as_bytes('initial'))
        builder = tf.saved_model.builder.SavedModelBuilder(export_path)

        # Build the signature_def_map.
        x = model.input
        y = model.output

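        # Take the top 5 softmax scores and map their indices to string class
        # labels for the classification signature below.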
        values, indices = tf.nn.top_k(y, 5)
        table = tf.contrib.lookup.index_to_string_table_from_tensor(tf.constant([str(i) for i in range(5)]))
        prediction_classes = table.lookup(tf.to_int64(indices))

        classification_inputs = tf.saved_model.utils.build_tensor_info(model.input)
        classification_outputs_classes = tf.saved_model.utils.build_tensor_info(prediction_classes)
        classification_outputs_scores = tf.saved_model.utils.build_tensor_info(values)
        classification_signature = (
            tf.saved_model.signature_def_utils.build_signature_def(
                inputs={tf.saved_model.signature_constants.CLASSIFY_INPUTS: classification_inputs},
                outputs={
                    tf.saved_model.signature_constants.CLASSIFY_OUTPUT_CLASSES: classification_outputs_classes,
                    tf.saved_model.signature_constants.CLASSIFY_OUTPUT_SCORES: classification_outputs_scores
                },
                method_name=tf.saved_model.signature_constants.CLASSIFY_METHOD_NAME))

        tensor_info_x = tf.saved_model.utils.build_tensor_info(x)
        tensor_info_y = tf.saved_model.utils.build_tensor_info(y)

        prediction_signature = (tf.saved_model.signature_def_utils.build_signature_def(
            inputs={'images': tensor_info_x},
            outputs={'scores': tensor_info_y},
            method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME))

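        # Ensure the lookup table above is initialized when the SavedModel is
        # loaded for serving.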
        legacy_init_op = tf.group(tf.tables_initializer(), name='legacy_init_op')
        builder.add_meta_graph_and_variables(
            sess, [tf.saved_model.tag_constants.SERVING],
            signature_def_map={'predict_images': prediction_signature,
               tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: classification_signature,},
            legacy_init_op=legacy_init_op)

        builder.save()
        print('Done exporting!')

        raise SystemExit

if __name__ == '__main__':
    usage = "usage: %prog [options] arg"
    parser = OptionParser(usage)
    (options, args) = parser.parse_args()

    if len(args) < 3:   
        raise ValueError("Too few arguments!")

    model_def = args[0]
    model_weights = args[1]
    export_path = args[2]
    export_model(model_def, model_weights, export_path)
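
The script above would then be invoked with three positional arguments, along the lines of python export_model.py fasttext.yaml fasttext_weights.h5 fasttext_cloud (the script and file names here are placeholders; only the argument order matters).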

Step 3

gsutil cp -r fasttext_cloud/ gs://quiet-notch-xyz.appspot.com
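
You can verify the upload with gsutil ls -r gs://quiet-notch-xyz.appspot.com/fasttext_cloud/; the exported directory should contain (under the 'initial' subdirectory created in Step 2) a saved_model.pb file and a variables/ subdirectory.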

Step 4

from __future__ import print_function

from oauth2client.client import GoogleCredentials
from googleapiclient import discovery
from googleapiclient import errors
import time

projectID = 'projects/{}'.format('quiet-notch-xyz')
modelName = 'fasttext'
modelID = '{}/models/{}'.format(projectID, modelName)
versionName = 'Initial'
versionDescription = 'Initial release.'
trainedModelLocation = 'gs://quiet-notch-xyz.appspot.com/fasttext/'

credentials = GoogleCredentials.get_application_default()
ml = discovery.build('ml', 'v1', credentials=credentials)

# Create a dictionary with the fields from the request body.
requestDict = {'name': modelName, 'description': 'Online predictions.'}

# Create a request to call projects.models.create.
request = ml.projects().models().create(parent=projectID, body=requestDict)

# Make the call.
try:
    response = request.execute()
except errors.HttpError as err: 
    # Something went wrong, print out some information.
    print('There was an error creating the model.' +
        ' Check the details:')
    print(err._get_reason())

    # Clear the response for next time.
    response = None
    raise


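# Give the model-creation call a moment to propagate before creating a version.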
time.sleep(10)

requestDict = {'name': versionName,
               'description': versionDescription,
               'deploymentUri': trainedModelLocation}

# Create a request to call projects.models.versions.create
request = ml.projects().models().versions().create(parent=modelID,
              body=requestDict)

# Make the call.
try:
    print("Creating model setup..", end=' ')
    response = request.execute()

    # Get the operation name.
    operationID = response['name']
    print('Done.')

except errors.HttpError as err:
    # Something went wrong, print out some information.
    print('There was an error creating the version.' +
          ' Check the details:')
    print(err._get_reason())
    raise

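# Poll the long-running operation until the version has finished deploying.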
done = False
request = ml.projects().operations().get(name=operationID)
print("Adding model from storage..", end=' ')

while (not done):
    response = None

    # Wait 10 seconds before polling again.
    time.sleep(10)

    # Make the next call.
    try:
        response = request.execute()

        # Check for finish.
        done = True  # placeholder; a real poll would use response.get('done', False)

    except errors.HttpError as err:
        # Something went wrong, print out some information.
        print('There was an error getting the operation.' +
              'Check the details:')
        print(err._get_reason())
        done = True
        raise

print("Done.")

Step 5

Use website.
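
This step can also be scripted with the same ml v1 discovery client used in Step 4, via the versions.setDefault method. A minimal, untested sketch, reusing the project, model, and version names from Step 4:

from oauth2client.client import GoogleCredentials
from googleapiclient import discovery

credentials = GoogleCredentials.get_application_default()
ml = discovery.build('ml', 'v1', credentials=credentials)

# Fully qualified resource name of the version created in Step 4.
versionID = 'projects/quiet-notch-xyz/models/fasttext/versions/Initial'

# setDefault takes an empty request body.
request = ml.projects().models().versions().setDefault(name=versionID, body={})
response = request.execute()
print(response)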

Step 6

import googleapiclient.discovery

def predict_json(instances, project='quiet-notch-xyz', model='fasttext', version=None):
    """Send json data to a deployed model for prediction.

    Args:
        project (str): project where the Cloud ML Engine Model is deployed.
        model (str): model name.
        instances ([Mapping[str: Any]]): Keys should be the names of Tensors
            your deployed model expects as inputs. Values should be datatypes
            convertible to Tensors, or (potentially nested) lists of datatypes
            convertible to tensors.
        version: str, version of the model to target.
    Returns:
        Mapping[str: any]: dictionary of prediction results defined by the
            model.
    """
    # Create the ML Engine service object.
    # To authenticate set the environment variable
    # GOOGLE_APPLICATION_CREDENTIALS=<path_to_service_account_file>
    service = googleapiclient.discovery.build('ml', 'v1')
    name = 'projects/{}/models/{}'.format(project, model)

    if version is not None:
        name += '/versions/{}'.format(version)

    response = service.projects().predict(
        name=name,
        body={'instances': instances}
    ).execute()

    if 'error' in response:
        raise RuntimeError(response['error'])

    return response['predictions']

Then run the function with a test input: predict_json({'inputs': [[18, 87, 13, 589, 0]]})

Liva asked 22/8, 2017 at 3:47. Comments (6):
There might be more than one issue, but let's start here: CloudML Engine currently only supports using a single signature (the default signature). Looking at your code, I think prediction_signature is more likely to lead to success, but you haven't made that the default signature. Can you try that? Since deploying to the cloud can take some time, I recommend using the following to test locally: gcloud ml-engine local predict – Streamy
Okay, that seems reasonable. I must admit that I didn't quite understand the builder.add_meta_graph_and_variables() function. How do I change it to be the default signature? – Liva
This is going to be messy, so I'll add it as an answer, which I'll continuously update until we've solved the problem(s). By the way, I've simplified the saved-model process with a simple_save function that's incubating in contrib and should be available in TF 1.4: github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/… – Streamy
The simple_save function looks great! Sadly, I cannot update my TF version (and I think my current version is too old for the things in contrib), but I'll look forward to using it once TF 1.4 is out and I've updated! – Liva
Hmm.. I'm trying to run gcloud ml-engine local predict --model-dir=fasttext_cloud/ --json-instances=debug_instance.json, but it cannot load Tensorflow. This is strange, as Tensorflow works fine for everything else, including the example mentioned: python -c 'import tensorflow'. I'll open a new issue for this. – Liva
This is the new issue I've opened: #45810054 – Liva
Answer

There is now a sample demonstrating the use of Keras on CloudML engine, including prediction. You can find the sample here:

https://github.com/GoogleCloudPlatform/cloudml-samples/tree/master/census/keras

I would suggest comparing your code to that code.

Some additional suggestions that will still be relevant:

CloudML Engine currently only supports using a single signature (the default signature). Looking at your code, I think prediction_signature is more likely to lead to success, but you haven't made that the default signature. I suggest the following:

builder.add_meta_graph_and_variables(
    sess, [tf.saved_model.tag_constants.SERVING],
    signature_def_map={
        tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: prediction_signature,
    },
    legacy_init_op=legacy_init_op)

If you are deploying to the service, then you would invoke prediction like so:

predict_json({'images':[[18, 87, 13, 589, 0]]})

If you are testing locally using gcloud ml-engine local predict --json-instances, the input data is slightly different (it matches that of the batch prediction service). Each newline-separated line looks like this (showing a file with two lines):

{"images": [[18, 87, 13, 589, 0]]}
{"images": [[21, 85, 13, 100, 1]]}

I don't actually know enough about the shape of model.x to ensure the data being sent is correct for your model.

By way of explanation, it may be insightful to consider the difference between the Classification and Prediction methods in SavedModel. One difference is that tensorflow_serving is based on gRPC, which is strongly typed, so Classification provides a strongly-typed signature that most classifiers can use; you can then reuse the same client with any classifier.

That's not overly useful when using JSON since JSON isn't strongly typed.

One other difference is that, when using tensorflow_serving, Prediction accepts column-based inputs (a map from feature name to every value for that feature in the whole batch), whereas Classification accepts row-based inputs (each input instance/example is a row).

CloudML abstracts that away a bit and always requires row-based inputs (a list of instances). Even though we only officially support Prediction, Classification should work as well.
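
To make the row/column distinction concrete, here is a schematic sketch of the same two-instance batch in both layouts (the exact nesting depends on your model's input shape):

# Row-based (CloudML Engine): one entry per instance.
row_based = {'instances': [
    {'images': [18, 87, 13, 589, 0]},
    {'images': [21, 85, 13, 100, 1]},
]}

# Column-based (tensorflow_serving's Predict): one entry per feature,
# holding that feature's values for the whole batch.
column_based = {'images': [
    [18, 87, 13, 589, 0],
    [21, 85, 13, 100, 1],
]}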

Streamy answered 22/8, 2017 at 4:59. Comments (6):
Thanks! I'm trying that right now. Can you explain at a high level what the difference is between the prediction signature and the classification signature? I got it from a Tensorflow tutorial on MNIST, if I recall correctly. – Liva
Added an explanation. – Streamy
Great explanation, thanks. I just tried running the model on the cloud (cannot run locally) after having exported it with this new signature as default. I get the same error. Any ideas? – Liva
Added a link to a Keras sample. In addition, can you provide details about the shape of model.x? – Streamy
The input is sentences of length 30 (i.e. each observation is an array of 30 integers). The output is a softmax over 64 possible classes. – Liva
In the example in my original post I shortened it to length 5 by just removing 25 padding integers. – Liva
