Tensorflow Estimator predict is slow
I have trained a tf.estimator.LinearClassifier. While training and evaluating the model take a reasonable amount of time for my data size (~60 sec), predicting takes many orders of magnitude longer (~1 hour).

The prediction code is as follow:

predictionResult = estimator.predict(input_fn=lambda: my_input_fn2(predictionValidationFile, False, 1))
predictionList = [prediction for prediction in predictionResult]

with:

import tensorflow as tf

def my_input_fn2(file_path, perform_shuffle=False, repeat_count=1):
    def _parse_function(example_proto):
        keys_to_features = {"xslm": tf.FixedLenFeature([10000], tf.float32),
                            "xrnn": tf.FixedLenFeature([10000], tf.float32),
                            "target": tf.FixedLenFeature([10000], tf.float32)}
        parsed_features = tf.parse_single_example(example_proto, keys_to_features)
        myfeatures = {'xrnn': parsed_features['xrnn'], 'xslm': parsed_features['xslm']}
        return myfeatures, parsed_features['target']

    dataset = (tf.data.TFRecordDataset(file_path)
               .map(_parse_function))
    dataset = dataset.repeat(repeat_count)
    dataset = dataset.batch(1)
    iterator = dataset.make_one_shot_iterator()
    batch_feature, batch_labels = iterator.get_next()
    # each record holds 10,000 samples, so reshape to [10000, 1]
    xs = tf.reshape(batch_feature['xslm'], [-1, 1])
    xr = tf.reshape(batch_feature['xrnn'], [-1, 1])
    x = {'xrnn': xr, 'xslm': xs}
    y = tf.reshape(batch_labels, [-1, 1])
    return x, y

The second line takes 0.8 sec to execute when run for 10,000 samples (corresponding to one batch). With 50,000,000 samples, prediction takes more than one hour.

My guess at this stage is that this slow performance is simply caused by the fact that the estimator predict() function returns a Python generator instead of the actual prediction results. For each batch, the generator ultimately triggers 10,000 calls to a function to fetch the 10,000 individual prediction results. This seems inefficient.
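That per-example cost can be illustrated outside TensorFlow. The sketch below is a hypothetical stand-in for the predict() generator, not the real estimator: results are yielded one at a time, so materializing 50,000,000 predictions means 50,000,000 generator steps no matter how the input is batched:

```python
def fake_predict(n):
    # hypothetical stand-in for estimator.predict(): yields one
    # result dict per example, even when the input_fn batches them
    for i in range(n):
        yield {"class_ids": [i % 2]}

# materializing the generator walks every single example
predictions = [p["class_ids"] for p in fake_predict(10_000)]
print(len(predictions))  # one generator step per example
```

If your TensorFlow version supports it, estimator.predict(..., yield_single_examples=False) yields one batched result per input batch instead of one result per example, which reduces this per-example overhead accordingly.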

Are there any options to speed things up?

Stormproof answered 13/12, 2017 at 15:22 Comment(5)
Could you post your implementation of input_fn? Also, a mini-batch of 10,000 seems pretty high regardless of how big your model might be.Ximenes
I have added the input_fn code. I am using TFRecords to ensure decent performance, and there is a slight twist where I reshape each TFRecord example to a tensor of shape [10000, 1] to feed 10,000 samples quickly, but this should not be an issue since the same code is used by estimator.train() and estimator.evaluate(). My model is pretty small (a LinearClassifier with two features) but the dataset is fairly large (50,000,000 samples). My guess is that the real issue comes from the design of the predict() function, which relies on a generator.Stormproof
I wonder if that reshape with batch_size = 1 isn't adding substantial work to this. What you could do is time how long that takes with input = my_input_fn2(filename) with tf.Session() as sess: sess.run(input) then experiment with batch = 16 or something, as well as without the reshapeXimenes
I noticed the same issue. I have to change the predict to evaluate. I don't know why predict does not return the results directly.Valence
Also note that estimator.predict does not support distributed evaluation. So that isn't an option to speed it up either...Yacov

You are right about the reason it is slow. It is making a call to the function for each item, as the batch size in your function defaults to 1.

You should pass the batch size to the function as a parameter and replace

dataset = dataset.batch(1) 

with

dataset = dataset.batch(batch_size) 
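The effect on the number of pipeline iterations is easy to quantify. A minimal plain-Python sketch, where batch_size is the parameter suggested above:

```python
import math

def num_batches(total_samples, batch_size):
    # one iterator step per batch: larger batches mean fewer
    # trips through the input pipeline and the predict() loop
    return math.ceil(total_samples / batch_size)

print(num_batches(50_000_000, 1))       # one step per sample
print(num_batches(50_000_000, 10_000))  # 10,000x fewer steps
```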
Sonata answered 21/12, 2018 at 15:15 Comment(0)

I had a similar problem (using TensorFlow 1.15 in a Colab notebook). In my case, saving and loading the model (in a new cell) solved the problem.

import numpy as np

model.save_weights("weights.h5", overwrite=True)
# in a new cell
model = create_model()
model.load_weights("weights.h5")
y_pred = np.array(model.predict(x_test))
Cynthla answered 11/11, 2019 at 18:27 Comment(0)
