How to use tf.data's initializable iterators within a tf.estimator's input_fn?

Asked 10/7, 2017 at 12:12 Answered 10/7, 2017 at 15:51

Solved python tensorflow tensorflow-datasets tensorflow-estimator

I would like to manage my training with a tf.estimator.Estimator, but I am having trouble using it alongside the tf.data API.

I have something like this:

def model_fn(features, labels, params, mode):
  # Defines model's ops.
  # Initializes with tf.train.Scaffold.
  # Returns a tf.estimator.EstimatorSpec.

def input_fn():
  dataset = tf.data.TextLineDataset("test.txt")
  # map, shuffle, padded_batch, etc.

  iterator = dataset.make_initializable_iterator()

  return iterator.get_next()

estimator = tf.estimator.Estimator(model_fn)
estimator.train(input_fn)

I can't use make_one_shot_iterator for my use case, so input_fn contains an iterator that has to be initialized within model_fn (here, I use tf.train.Scaffold to initialize local ops).

Also, as I understand it, we can't simply use input_fn = iterator.get_next; otherwise the other ops will not be added to the same graph.

What is the recommended way to initialize the iterator?

Gendron answered 10/7, 2017 at 12:12 Comment(2)

@guillaumeklin -- did you add tf.add_to_collection(tf.GraphKeys.TABLE_INITIALIZERS, iterator.initializer) within the input_fn()? – Gotcher 14/2, 2018 at 14:7

Yes, you can add this line in input_fn() just before return iterator.get_next(). – Gendron 14/2, 2018 at 14:39

As of TensorFlow 1.5, it is possible to make input_fn return a tf.data.Dataset, e.g.:

def input_fn():
  dataset = tf.data.TextLineDataset("test.txt")
  # map, shuffle, padded_batch, etc.
  return dataset

See c294fcfd.
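
As a rough illustration of the Dataset-returning form, here is a minimal sketch in which input_fn yields (features, labels) pairs. The in-memory values and the "length" feature name are hypothetical stand-ins for whatever test.txt would be parsed into, and the helper read_first_batch is not part of the Estimator API: it only imitates, loosely, how a caller could create and initialize the iterator on your behalf. It assumes a TensorFlow release that provides tf.compat.v1 (1.13 or later).

```python
import tensorflow as tf


def input_fn():
    # Hypothetical in-memory data standing in for the parsed lines of test.txt.
    features = {"length": [3, 2, 4, 1]}
    labels = [0, 1, 0, 1]
    dataset = tf.data.Dataset.from_tensor_slices((features, labels))
    # map, shuffle, padded_batch, etc. would go here.
    return dataset.batch(2)


def read_first_batch():
    # Illustrative helper (an assumption, not Estimator code): build the
    # dataset in a fresh graph, create an initializable iterator for it,
    # initialize it, and fetch one batch.
    with tf.Graph().as_default():
        dataset = input_fn()
        iterator = tf.compat.v1.data.make_initializable_iterator(dataset)
        next_element = iterator.get_next()
        with tf.compat.v1.Session() as sess:
            sess.run(iterator.initializer)
            return sess.run(next_element)
```

With this form, input_fn itself never touches an iterator, so there is nothing to initialize by hand.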

For previous versions, you can add the iterator's initializer to the tf.GraphKeys.TABLE_INITIALIZERS collection and rely on the default initializer:

tf.add_to_collection(tf.GraphKeys.TABLE_INITIALIZERS, iterator.initializer)
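
To show where that line sits, here is a minimal sketch of a complete input_fn using this workaround. It works because tf.tables_initializer() groups every op found in the TABLE_INITIALIZERS collection, and the default tf.train.Scaffold runs it as part of its local init op. The sketch uses hypothetical in-memory integers instead of test.txt, and the helper first_batch runs tables_initializer() by hand in a plain session rather than going through an Estimator. It is written against tf.compat.v1 (an assumption: TensorFlow 1.13 or later); on the 1.x releases discussed here the same calls live directly under tf.

```python
import tensorflow as tf

tf1 = tf.compat.v1


def input_fn():
    # Hypothetical in-memory data standing in for the lines of test.txt.
    dataset = tf.data.Dataset.from_tensor_slices([10, 20, 30, 40])
    dataset = dataset.batch(2)
    iterator = tf1.data.make_initializable_iterator(dataset)
    # Register the initializer so it runs together with the table initializers.
    tf1.add_to_collection(tf1.GraphKeys.TABLE_INITIALIZERS, iterator.initializer)
    return iterator.get_next()


def first_batch():
    # Illustrative helper: stands in for the default local init op by
    # running tables_initializer() explicitly before fetching a batch.
    with tf.Graph().as_default():
        next_batch = input_fn()
        init_op = tf1.tables_initializer()
        with tf1.Session() as sess:
            sess.run(init_op)
            return sess.run(next_batch)
```

Calling first_batch() without running init_op first would fail with an uninitialized-iterator error, which is the situation the collection trick avoids.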

Gendron answered 10/7, 2017 at 15:51 Comment(4)

Thanks! +1. Just to clarify the answer: you need to add the tf.add_to_collection... line before returning from input_fn(); then it works fine and you don't need to do anything with Scaffold and local_init_ops. – Hypocotyl 12/12, 2017 at 12:16

Excuse me, is it possible to specify names for each field of the dataset using the first method? For example, my dataset has 2 fields, "age" and "sex", and I want to return a dictionary that looks like: {"age": tensor1, "sex": tensor2}. – Karr 9/10, 2018 at 13:23

@Hypocotyl @Gendron did you add the tf.add_to_collection(...) line within the def input_fn() or elsewhere within the model_fn()? If this was added in the model_fn() then would the line still be tf.add_to_collection(tf.GraphKeys.TABLE_INITIALIZERS, iterator.initializer) or would iterator.initializer need to be changed to something else? – Gotcher 23/10, 2018 at 15:55

You should add it in input_fn(), just after the creation of the iterator. – Gendron 23/10, 2018 at 16:33
