Save a model for TensorFlow Serving with an API endpoint mapped to a certain method using SignatureDefs?

I recently went through this tutorial. I have the trained model from the tutorial and I want to serve it with docker so I can send an arbitrary string of characters to it and get the prediction back from the model.

I also went through this tutorial to understand how to serve with Docker, but I didn't understand how the model was saved so that it can accept input parameters. For example:

    curl -d '{"instances": [1.0, 2.0, 5.0]}' \
        -X POST http://localhost:8501/v1/models/half_plus_two:predict

How does the half_plus_two model know what to do with the instances param?

In the text generation tutorial, there is a method called generate_text that handles generating predictions.

    def generate_text(model, start_string):
        # Evaluation step (generating text using the learned model)

        # Number of characters to generate
        num_generate = 1000

        # Converting our start string to numbers (vectorizing) 
        input_eval = [char2idx[s] for s in start_string]
        input_eval = tf.expand_dims(input_eval, 0)

        # Empty string to store our results
        text_generated = []

        # Low temperatures result in more predictable text.
        # Higher temperatures result in more surprising text.
        # Experiment to find the best setting.
        temperature = 1.0

        # Here batch size == 1
        model.reset_states()
        for i in range(num_generate):
            predictions = model(input_eval)
            # remove the batch dimension
            predictions = tf.squeeze(predictions, 0)

            # using a multinomial distribution to predict the character returned by the model
            predictions = predictions / temperature
            predicted_id = tf.multinomial(predictions, num_samples=1)[-1,0].numpy()

            # We pass the predicted character as the next input to the model
            # along with the previous hidden state
            input_eval = tf.expand_dims([predicted_id], 0)

            text_generated.append(idx2char[predicted_id])

        return (start_string + ''.join(text_generated)) 

How can I serve the trained model from the text generation tutorial and have input parameters to the model API mapped to unique methods such as generate_text? For example:

    curl -d '{"start_string": "ROMEO: "}' \
        -X POST http://localhost:8501/v1/models/text_generation:predict
Cadmar answered 5/3, 2019 at 16:55
I think you could write a simple Flask app to serve the results, or use a Lambda function implementation (like on AWS) to get this served. – Hate

Note: Answering this completely and extensively would require going in depth on the Serving architecture, its APIs and how they interact with models' signatures. I'll skip over all of this to keep the answer to an acceptable length, but I can always expand on excessively obscure parts if necessary (leave a comment if that's the case).

How does the half_plus_two model know what to do with the instances param?

Because of several unmentioned conventions that pile up to make this a conveniently short example, if (IMO) a slightly misleading one.

1) Where does the instances parameter come from? The Predict method of the RESTful API has a predefined request format that, in one of its two possible forms, takes a single instances parameter.

2) What does the instances parameter map to? From the request alone we can't tell, but for SignatureDefs with just one input, instances in that very specific calling format maps directly to that input without the need to specify the input's key (see the section "Specifying input tensors in row format" in the API specs).

So, what happens is: you make a POST request to a model with just one input defined. TF Serving takes that input and feeds it to the model, runs the graph until it has the values for all the tensors defined in the "outputs" part of the model's signature, and returns a JSON object with one key:result item for each key in the "outputs" list.
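
To make that concrete, here is a minimal sketch of that round trip against the half_plus_two example, using Python's requests library. The payload shapes follow the Predict REST API docs; the input key "x" in the columnar variant is an assumption about how that particular model names its input, so check it with saved_model_cli if in doubt.

    import requests

    # Row format: for a signature with a single input, each element of
    # "instances" is fed directly to that input, no key needed.
    resp = requests.post(
        "http://localhost:8501/v1/models/half_plus_two:predict",
        json={"instances": [1.0, 2.0, 5.0]},
    )
    print(resp.json())  # e.g. {"predictions": [2.5, 3.0, 4.5]}

    # Columnar format: inputs are addressed explicitly by their SignatureDef key
    # ("x" assumed here); the response uses "outputs" instead of "predictions".
    resp = requests.post(
        "http://localhost:8501/v1/models/half_plus_two:predict",
        json={"inputs": {"x": [1.0, 2.0, 5.0]}},
    )
    print(resp.json())  # e.g. {"outputs": [2.5, 3.0, 4.5]}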

How can I serve the trained model from the text generation tutorial and have input parameters to the model API mapped to unique methods such as generate_text?

You can't (at least not by directly mapping a Python function to a Serving method). The Serving infrastructure exposes some predefined methods (regress, predict, classify) that know how to interpret the signatures to produce the output you requested by running specific subgraphs of the model. These subgraphs must be included in the SavedModel, so for example using tf.py_func won't work, because its Python body is never serialized into the graph.
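
One way to see what Serving will actually be able to call is to load an export back and inspect its signature map (TF 2.x API; the path below is hypothetical):

    import tensorflow as tf

    # Only the concrete functions listed here, i.e. the subgraphs actually
    # serialized into the SavedModel, are reachable through Serving's
    # predict/classify/regress methods. Python-side logic such as a tf.py_func
    # body is not serialized with the graph, which is why it cannot be served.
    loaded = tf.saved_model.load("export/text_generation/1")
    print(list(loaded.signatures.keys()))  # e.g. ['serving_default']
    print(loaded.signatures["serving_default"].structured_input_signature)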

Your best chance is to try to describe text generation as a TF subgraph (i.e. using exclusively TF operations) and to write a separate SignatureDef that takes the start string and num_generate as inputs, as sketched below.
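
Here is a rough sketch, using the TF 2.x SavedModel API, of what such an export could look like. Everything below is an assumption rather than the tutorial's code: the Generator class, the export path, the StringLookup layers (which must reproduce the exact char2idx / idx2char mapping the model was trained with, including any index offset), and the fact that the loop re-feeds the whole generated sequence at every step instead of using the tutorial's stateful-GRU trick, which keeps the graph simple at the cost of speed.

    import tensorflow as tf

    # Hypothetical export sketch: `model` is the trained model from the tutorial
    # and `vocab` is the sorted list of characters used to build char2idx/idx2char.
    class Generator(tf.Module):
        def __init__(self, model, vocab):
            super().__init__()
            self.model = model
            # In-graph replacements for the Python char2idx / idx2char dicts.
            # NOTE: the index offsets must match what the model was trained with.
            self.char2idx = tf.keras.layers.StringLookup(vocabulary=list(vocab))
            self.idx2char = tf.keras.layers.StringLookup(
                vocabulary=list(vocab), invert=True)

        @tf.function(input_signature=[
            tf.TensorSpec([], tf.string, name="start_string"),
            tf.TensorSpec([], tf.int32, name="num_generate"),
        ])
        def generate_text(self, start_string, num_generate):
            chars = tf.strings.unicode_split(start_string, "UTF-8")
            input_eval = tf.expand_dims(self.char2idx(chars), 0)  # [1, None]
            generated = tf.TensorArray(tf.string, size=num_generate)

            for i in tf.range(num_generate):
                predictions = self.model(input_eval)   # [1, seq_len, vocab_size]
                logits = predictions[:, -1, :]         # logits of the last step
                predicted_id = tf.random.categorical(logits, num_samples=1)[0, 0]
                # Re-feed the whole sequence (no stateful GRU handling here).
                input_eval = tf.concat([input_eval, [[predicted_id]]], axis=-1)
                generated = generated.write(i, self.idx2char(predicted_id))

            text = tf.strings.reduce_join(generated.stack())
            return {"text": tf.strings.join([start_string, text])}

    generator = Generator(model, vocab)
    tf.saved_model.save(
        generator, "export/text_generation/1",
        signatures={"serving_default": generator.generate_text})

With an export like this, the request would use the columnar "inputs" form, e.g. {"inputs": {"start_string": "ROMEO: ", "num_generate": 1000}}, since the signature has two named scalar inputs rather than a single batched one.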

Took answered 14/3, 2019 at 11:43
