Google cloudml Always Gives Me The Same Results

I'm working on machine learning and I would like to use Google Cloud ml service.

So far I have trained my model with TensorFlow's retrain.py (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/image_retraining/retrain.py#L103) and exported the result for Cloud ML (the export and export.meta files). However, when I try to predict on new data with the following command (https://cloud.google.com/ml/reference/commandline/predict):

gcloud beta ml predict

it always returns the same result, even though I send different data each time. How is that possible?

My data are JPEG images that I encode as base64 text with:

echo "{\"image_bytes\": {\"b64\": \"`base64 image.jpg`\"}}" > instances
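
One thing worth ruling out at this step is the encoding itself: some `base64` command-line tools wrap their output at a fixed column, which would put raw newlines inside the JSON string. The following is a minimal Python sketch of the same step, assuming the model's input key is `image_bytes` as in the command above and that the file holds one JSON object per line. The helper names `make_instance` and `write_instances` are hypothetical.

```python
import base64
import json


def make_instance(jpeg_bytes, key="image_bytes"):
    """Return one JSON line wrapping the raw JPEG bytes as {"b64": ...}."""
    # base64.b64encode never inserts line breaks into its output.
    encoded = base64.b64encode(jpeg_bytes).decode("ascii")
    return json.dumps({key: {"b64": encoded}})


def write_instances(image_paths, out_path="instances"):
    """Write one JSON instance per line, one line per image file."""
    with open(out_path, "w") as out:
        for path in image_paths:
            with open(path, "rb") as f:
                out.write(make_instance(f.read()) + "\n")
```

Calling `write_instances(["image.jpg"])` would produce an `instances` file equivalent to the one written by the `echo` command, without any wrapped lines.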

Do you have any suggestions?

Acrospire answered 2/12, 2016 at 9:52

There are several possible causes of this issue. The first that comes to mind is that the weights in your model are being initialized to zero when the model is imported. This can happen if an initialization op is defined in the graph (cf. the loader). To check for this, run the following:

from tensorflow.contrib.session_bundle import session_bundle

session, _ = session_bundle.load_session_bundle_from_path("/path/to/model")
# List any init ops that are run when the exported model is loaded.
print(session.graph.get_collection("serving_init_op"))

If there is something in that collection, make sure that it isn't initializing variables.

If there are no initializers, make sure the weights themselves look reasonable, e.g.,

session, _ = session_bundle.load_session_bundle_from_path("/path/to/model")
print(session.run("name_of_var:0"))
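
What "reasonable" means is a judgment call, but a quick summary makes the obvious failures (all zeros, NaNs) easy to spot. The sketch below is plain Python and does not depend on TensorFlow. The helper names `summarize_weights` and `looks_uninitialized` are hypothetical, and it assumes you pass in a flat sequence of floats.

```python
import math


def summarize_weights(values):
    """Return simple statistics for a flat sequence of weight values."""
    values = [float(v) for v in values]
    if not values:
        raise ValueError("no values to summarize")
    # Separate out NaN/inf so they do not poison min, max and mean.
    finite = [v for v in values if math.isfinite(v)]
    return {
        "count": len(values),
        "non_finite": len(values) - len(finite),
        "zeros": sum(1 for v in finite if v == 0.0),
        "min": min(finite) if finite else float("nan"),
        "max": max(finite) if finite else float("nan"),
        "mean": sum(finite) / len(finite) if finite else float("nan"),
    }


def looks_uninitialized(stats):
    """Heuristic only: every value is either zero or non-finite."""
    return stats["zeros"] + stats["non_finite"] == stats["count"]
```

Since `session.run` returns a NumPy array, the values can be passed in as `summarize_weights(session.run("name_of_var:0").ravel().tolist())`.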

If all of that checks out, then you'll probably want to pay attention to the inputs to the graph and the output after transforming those inputs. To this end, you can use session.run to run parts of the graph. For instance, you can feed a jpeg string and view the output of various steps along the way by using the appropriate feeds and fetches in a call to session.run.

For example, using the example from this post, we can load a JPEG from disk, feed it to the graph, and see what the data looks like after resizing and after scaling:

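from tensorflow.contrib.session_bundle import session_bundle

# Tensor names are specific to the exported graph; adjust them to match yours.
INPUT_PLACEHOLDER = 'Placeholder:0'
DECODE_AND_RESIZE = 'map/TensorArrayPack_1/TensorArrayGather:0'
SCALED = 'Mul:0'

# Read in a sample image, preferably with small dimensions.
with open("/tmp/testing22222.jpg", "rb") as f:
    jpg = f.read()

session, _ = session_bundle.load_session_bundle_from_path("/path/to/model")
# Fetch the intermediate tensors for a batch containing the single image.
resized, scaled = session.run([DECODE_AND_RESIZE, SCALED], feed_dict={INPUT_PLACEHOLDER: [jpg]})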

By strategically placing the names of tensors in your graph in the fetch list, you can inspect what is going on in any given layer of the neural net, although the most likely problems reside with the inputs and/or variables.
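
Because the symptom here is identical predictions for different images, one way to apply this is to fetch the same tensors for two different images and check which stages come out identical. The following is a rough sketch in plain Python. The helper `stages_that_match` is hypothetical, and it assumes you have already flattened each fetched array into a list of floats keyed by tensor name.

```python
def stages_that_match(outputs_a, outputs_b, tolerance=1e-6):
    """Return names of stages whose values are (almost) identical for two inputs.

    outputs_a and outputs_b map a tensor name to a flat list of floats,
    for example built from session.run results for two different images.
    """
    matching = []
    for name in outputs_a:
        a, b = outputs_a[name], outputs_b[name]
        same_size = len(a) == len(b)
        if same_size and all(abs(x - y) <= tolerance for x, y in zip(a, b)):
            matching.append(name)
    return matching
```

If an early stage such as the scaled input already matches for two visibly different JPEGs, the input handling is the likely suspect. If only the final output matches, the variables are the more likely cause.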

The tricky part is figuring out the names of tensors. You can use the name property when defining most operations, which might be helpful. You can also use something like:

print([o.name for o in session.graph.get_operations()])

This lists every operation in the graph, which helps you find the names you need.
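
The full list can be long, so it may help to narrow it down by operation type. This is a small sketch, assuming each operation exposes `name` and `type` attributes (as TensorFlow operations do). The helper `tensor_names_by_type` is hypothetical.

```python
def tensor_names_by_type(operations, op_types):
    """Return '<op name>:0' for every operation whose type is in op_types.

    ':0' refers to an operation's first output, so this is only meaningful
    for operations that produce at least one output tensor.
    """
    wanted = set(op_types)
    return [op.name + ":0" for op in operations if op.type in wanted]
```

For example, `tensor_names_by_type(session.graph.get_operations(), ["Placeholder"])` would list the candidate input tensors to use as keys in `feed_dict`.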

Finally, you may also want to try running the graph locally in order to minimize the feedback cycle while debugging. Check out local_predict.py in the samples for an example of how to do this. This will help you iterate quickly to identify issues with the model itself.

Ratsbane answered 2/12, 2016 at 17:6

It might also be that your inputs need to be scaled. If one input has a magnitude that overwhelms everything else, the optimization can go poorly. A telltale sign is that the result you get is always close to the mean of the target variable.

This is less likely in your particular case because your inputs are images, so your input values are probably similarly scaled. It is more common if you are training from, say, CSV files.
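
For tabular inputs, one common form of scaling is to standardize each column to zero mean and unit variance. The sketch below is a plain-Python illustration of that idea, not something taken from retrain.py. The helper `standardize_columns` is hypothetical and assumes the data is a non-empty list of equal-length numeric rows.

```python
import math


def standardize_columns(rows):
    """Scale each column of a list of numeric rows to zero mean, unit variance."""
    columns = list(zip(*rows))
    scaled_columns = []
    for col in columns:
        mean = sum(col) / len(col)
        std = math.sqrt(sum((x - mean) ** 2 for x in col) / len(col))
        # A constant column has std 0; map it to zeros instead of dividing by 0.
        scaled_columns.append([(x - mean) / std if std else 0.0 for x in col])
    return [list(row) for row in zip(*scaled_columns)]
```

Whatever scaling is used for training, the same means and standard deviations need to be applied to the inputs at prediction time.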

Buenrostro answered 3/12, 2016 at 2:13

Google published a blog post on the image recognition task, along with some associated code. It starts from the retrain.py example you mentioned and makes all the modifications needed for it to run on Cloud ML.

Barozzi answered 16/12, 2016 at 21:44
