TensorFlow MNIST Estimator: batch size affects the graph's expected input?
I have followed the TensorFlow MNIST Estimator tutorial and trained my MNIST model.
It seems to work fine, but if I visualize it in TensorBoard I see something weird: the input shape the model requires is 100 x 784.

Here is a screenshot: as you can see in the right box, the expected input size is 100x784.
I thought I would see ?x784 there.

[Screenshot from TensorBoard: the expected size for the input is 100 x 784]

Now, I did use 100 as the batch size during training, but in the Estimator model function I also specified that the number of input samples is variable (the -1 in the reshape below), so I expected ? x 784 to be shown in TensorBoard:

input_layer = tf.reshape(features["x"], [-1, 28, 28, 1], name="input_layer")
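
For context, this is roughly how the training input was fed (a sketch based on the tutorial; train_data, train_labels and mnist_classifier are the tutorial's names, so details may differ slightly from my exact code):

# Sketch of the tutorial-style training input pipeline (TF 1.x).
# train_data is a numpy array of shape (N, 784), train_labels of shape (N,).
train_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"x": train_data},
    y=train_labels,
    batch_size=100,   # this 100 appears to be where the fixed batch dimension comes from
    num_epochs=None,
    shuffle=True)
mnist_classifier.train(input_fn=train_input_fn, steps=20000)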

I tried using the estimator.train and estimator.evaluate methods on the same model with different batch sizes (e.g. 50), and using the Estimator.predict method passing a single sample at a time. In those cases everything seemed to work fine.
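
For instance, single-sample prediction through the Estimator went through an input_fn along these lines (a sketch; eval_data is assumed to be a numpy array of flattened 28x28 images, as in the tutorial):

# Sketch: predicting on a single sample through the Estimator interface (TF 1.x).
single_sample = eval_data[0:1]   # shape (1, 784)
pred_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"x": single_sample},
    batch_size=1,
    num_epochs=1,
    shuffle=False)
predictions = list(mnist_classifier.predict(input_fn=pred_input_fn))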

On the contrary, I do run into problems if I try to use the model without going through the Estimator interface. For example, if I freeze my model, load it into a GraphDef and run it in a session, like this:

import tensorflow as tf

# Load the frozen GraphDef produced from the trained Estimator model.
with tf.gfile.GFile("/path/to/my/frozen/model.pb", "rb") as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

with tf.Graph().as_default() as graph:
    tf.import_graph_def(graph_def, name="prefix")

    x = graph.get_tensor_by_name('prefix/input_layer:0')
    y = graph.get_tensor_by_name('prefix/softmax_tensor:0')

    with tf.Session(graph=graph) as sess:
        # image_28x28 is a single MNIST image with shape (1, 28, 28, 1)
        y_out = sess.run(y, feed_dict={x: image_28x28})

I will get the following error:
ValueError: Cannot feed value of shape (1, 28, 28, 1) for Tensor 'prefix/input_layer:0', which has shape '(100, 28, 28, 1)'

This worries me a lot, because in production I do need to freeze, optimize and convert my models to run them on TensorFlow Lite. So I won't be using the Estimator interface.
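
The conversion step I have in mind looks roughly like the sketch below (an assumption on my part: tf.lite.TFLiteConverter.from_frozen_graph is the TF >= 1.13 API, earlier 1.x releases expose the converter under tf.contrib.lite, and the fixed batch of 1 is simply what I would want on-device):

# Sketch of the intended TF Lite conversion (assumes TF >= 1.13).
converter = tf.lite.TFLiteConverter.from_frozen_graph(
    graph_def_file="/path/to/my/frozen/model.pb",
    input_arrays=["input_layer"],
    output_arrays=["softmax_tensor"],
    input_shapes={"input_layer": [1, 28, 28, 1]})   # single-image batches for on-device inference
tflite_model = converter.convert()
with open("converted_model.tflite", "wb") as f:
    f.write(tflite_model)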

What am I missing?

Luker asked 9/4, 2018 at 19:25

tf.reshape won't discard shape information for -1 dimensions. That's just a shorthand for "whatever's left over":

>>> import tensorflow as tf
>>> a = tf.constant([1.,2.,3.])
>>> a.shape
TensorShape([Dimension(3)])
>>> tf.reshape(a, [-1, 3]).shape
TensorShape([Dimension(1), Dimension(3)])
>>> 

If you want to destroy static shape information, see tf.placeholder_with_default:

>>> tf.placeholder_with_default(a[None, :], shape=[None, 3]).shape
TensorShape([Dimension(None), Dimension(3)])
Feral answered 11/4, 2018 at 16:53
thank you so much! I have tried this and it works like a charm. I really hope the TF team adds this to their tutorials. Without this, if you train a model through an Estimator you also need to use the Estimator interface for inference, but that is not always possible. Here is the detail of how I modified my code:

batch = features["x"]  # shape (100, 784)
reshaped = tf.reshape(batch, (-1, 28, 28, 1))  # shape (100, 28, 28, 1)
reshape_layer = tf.placeholder_with_default(reshaped, (None, 28, 28, 1), name="reshape_layer")  # shape (?, 28, 28, 1)

Luker
