Why do I need to set steps_per_epoch when calling fit_generator(), when ideally it should just be the total number of samples / batch size?
Keras' generators are infinite.
Because of this, Keras cannot know by itself how many batches the generators should yield to complete one epoch.
When you have a static number of samples, it makes perfect sense to use samples//batch_size for one epoch. But you may want to use a generator that performs random data augmentation, for instance. Because of the random process, you will never have two identical training epochs, so there is no clear point at which an epoch ends.
So, these parameters in fit_generator allow you to control the yields per epoch as you wish, although in standard cases you'll probably keep to the most obvious option: samples//batch_size.
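To illustrate the idea, here is a minimal pure-Python sketch (no Keras involved) of a generator that never stops, and a loop that decides where an epoch ends by counting steps. The names infinite_batches and samples are made up for this example; this is not how Keras is implemented internally, only the same principle.

```python
import itertools
import random


def infinite_batches(samples, batch_size):
    """Hypothetical generator: yields shuffled batches forever."""
    while True:
        shuffled = random.sample(samples, len(samples))
        # Only full batches are yielded here; any remainder is dropped.
        for start in range(0, len(shuffled) - batch_size + 1, batch_size):
            yield shuffled[start:start + batch_size]


samples = list(range(100))  # stand-in for a dataset of 100 items
batch_size = 10
steps_per_epoch = len(samples) // batch_size

gen = infinite_batches(samples, batch_size)
for epoch in range(2):
    # The generator never ends, so the consumer decides when an epoch is over.
    batches = list(itertools.islice(gen, steps_per_epoch))
    seen = sorted(x for batch in batches for x in batch)
    print(f"epoch {epoch}: {len(batches)} batches, {len(seen)} samples")
```

Without the islice limit the inner loop would run forever, which is why the step count has to come from the caller.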
Without data augmentation, the number of samples is static, as Daniel mentioned. The number of samples used for training in one epoch is then steps_per_epoch * batch_size.
With ImageDataGenerator in Keras, augmentation produces additional training data, so you can choose the number of training samples per epoch yourself. If you want twice as much training data, just set steps_per_epoch to (original sample size * 2) / batch_size.
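As a quick worked example of that arithmetic (the dataset size and batch size below are made-up numbers, and augmentation_factor is just a name chosen for this sketch):

```python
original_samples = 1024   # hypothetical dataset size
batch_size = 32           # hypothetical batch size
augmentation_factor = 2   # "two times" the training data

# Integer division, because steps_per_epoch has to be a whole number.
steps_per_epoch = (original_samples * augmentation_factor) // batch_size

# Number of (augmented) samples the model sees in one epoch.
samples_per_epoch = steps_per_epoch * batch_size

print(steps_per_epoch, samples_per_epoch)
```

If the product is not an exact multiple of the batch size, integer division rounds down and the epoch covers slightly fewer samples than requested.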
This is the safest option:
steps_per_epoch = len(train_generator)
If you have 1300 images and a batch size of 64, setting steps_per_epoch = 1300//64 gives 20 steps per epoch, which covers only 1280 images and skips the remaining 20. By setting steps_per_epoch = len(train_generator), you ensure all 1300 images are used during each epoch, including the case where the last batch is smaller than the specified batch size because of the division remainder.
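The numbers can be checked with plain arithmetic. This sketch assumes len(train_generator) rounds up to include the partial batch, which is how I understand the iterators returned by ImageDataGenerator to behave; the variable names are only for illustration.

```python
import math

num_images = 1300
batch_size = 64

# Floor division drops the partial last batch.
floor_steps = num_images // batch_size
skipped_images = num_images - floor_steps * batch_size

# Rounding up keeps it (assumed to match len(train_generator)).
ceil_steps = math.ceil(num_images / batch_size)
last_batch_size = num_images - (ceil_steps - 1) * batch_size

print(floor_steps, skipped_images)   # steps and images left out
print(ceil_steps, last_batch_size)   # steps and size of the final batch
```

So the rounded-up version runs one extra step per epoch, and that step contains only the leftover images.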
samples // batch_size, I believe. – Exocarp