The code below is the same Hello World example from the Keras Tuner website, but using Hyperband instead of RandomSearch.
from tensorflow import keras
from tensorflow.keras import layers
from kerastuner.tuners import RandomSearch, Hyperband
from kerastuner.engine.hypermodel import HyperModel
from kerastuner.engine.hyperparameters import HyperParameters

(x, y), (val_x, val_y) = keras.datasets.mnist.load_data()
x = x.astype('float32') / 255.
val_x = val_x.astype('float32') / 255.

x = x[:10000]
y = y[:10000]


def build_model(hp):
    model = keras.Sequential()
    model.add(layers.Flatten(input_shape=(28, 28)))
    for i in range(hp.Int('num_layers', 2, 20)):
        model.add(layers.Dense(units=hp.Int('units_' + str(i), 32, 512, 32),
                               activation='relu'))
    model.add(layers.Dense(10, activation='softmax'))
    model.compile(
        optimizer=keras.optimizers.Adam(
            hp.Choice('learning_rate', [1e-2, 1e-3, 1e-4])),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy'])
    return model


tuner = Hyperband(
    build_model,
    max_epochs=50,
    objective='val_accuracy',
    seed=20,
    executions_per_trial=1,
    directory='test_dir',
    project_name='daninhas_hyperband'
)

# tuner.search_space_summary()

tuner.search(x=x,
             y=y,
             epochs=50,
             validation_data=(val_x, val_y))

tuner.results_summary()
But even with max_epochs=50 and epochs=50, the model trains for only 2 epochs per trial:
(...)
2020-06-03 12:55:23.245993: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-06-03 12:55:23.246022: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108]
Epoch 1/2
313/313 [==============================] - 4s 12ms/step - loss: 2.3247 - accuracy: 0.1109 - val_loss: 2.3025 - val_accuracy: 0.1135
Epoch 2/2
313/313 [==============================] - 3s 9ms/step - loss: 2.3020 - accuracy: 0.1081 - val_loss: 2.3033 - val_accuracy: 0.1135
[Trial complete]
[Trial summary]
|-Trial ID: 32396974f43cade5b6c3ef511b5548f3
|-Score: 0.11349999904632568
|-Best step: 0
> Hyperparameters:
|-learning_rate: 0.01
|-num_layers: 19
|-tuner/bracket: 3
|-tuner/epochs: 2
|-tuner/initial_epoch: 0
|-tuner/round: 0
|-units_0: 448
|-units_1: 384
|-units_10: 416
|-units_11: 160
|-units_12: 384
|-units_13: 480
|-units_14: 288
|-units_15: 64
|-units_16: 288
|-units_17: 64
|-units_18: 32
|-units_2: 96
|-units_3: 160
|-units_4: 480
|-units_5: 416
|-units_6: 256
|-units_7: 32
|-units_8: 160
|-units_9: 448
Epoch 1/2
313/313 [==============================] - 4s 11ms/step - loss: 2.3109 - accuracy: 0.1081 - val_loss: 2.3028 - val_accuracy: 0.1135
Epoch 2/2
313/313 [==============================] - 3s 10ms/step - loss: 2.3022 - accuracy: 0.1067 - val_loss: 2.3019 - val_accuracy: 0.1135
[Trial complete]
[Trial summary]
|-Trial ID: 98376f698826a2068c3412301a7aece4
|-Score: 0.11349999904632568
|-Best step: 0
> Hyperparameters:
|-learning_rate: 0.01
|-num_layers: 19
|-tuner/bracket: 3
|-tuner/epochs: 2
|-tuner/initial_epoch: 0
|-tuner/round: 0
|-units_0: 480
|-units_1: 320
|-units_10: 320
|-units_11: 64
|-units_12: 128
|-units_13: 32
|-units_14: 416
|-units_15: 288
|-units_16: 320
|-units_17: 480
|-units_18: 256
|-units_2: 480
|-units_3: 320
|-units_4: 288
|-units_5: 192
|-units_6: 224
|-units_7: 256
|-units_8: 256
|-units_9: 352
(...)
Is there some configuration that forces only 2 epochs? Could this be a bug? Or am I missing something?
How do I get the model to train for more epochs?
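For reference, here is my rough, back-of-the-envelope reading of how Hyperband budgets epochs per bracket. This is my own sketch, assuming Keras Tuner's default factor=3; the formula and variable names are my guesses, not taken from the library code.

import math

# Back-of-the-envelope epoch budget per Hyperband bracket
# (my own arithmetic, not official Keras Tuner code),
# assuming max_epochs=50 and the default factor=3.
max_epochs = 50
factor = 3

num_brackets = math.floor(math.log(max_epochs, factor))  # 3 for max_epochs=50
for bracket in range(num_brackets, -1, -1):
    # epochs given to trials in the first round of this bracket
    first_round_epochs = math.ceil(max_epochs / factor ** bracket)
    print(f"bracket {bracket}: first-round trials get ~{first_round_epochs} epochs")

# bracket 3: first-round trials get ~2 epochs   <- matches tuner/bracket: 3, tuner/epochs: 2 above
# bracket 2: first-round trials get ~6 epochs
# bracket 1: first-round trials get ~17 epochs
# bracket 0: first-round trials get ~50 epochs

If that reading is correct, the 2-epoch runs might just be the budget of the earliest bracket, with only the best trials re-run for more epochs in later rounds, but I would still like to confirm whether later trials actually reach the full 50 epochs.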