Keras-tuner Hyperband running only 2 epochs
The code below is the same Hello-World example from the keras-tuner website, but using Hyperband instead of RandomSearch.

from tensorflow import keras
from tensorflow.keras import layers

from kerastuner.tuners import RandomSearch, Hyperband
from kerastuner.engine.hypermodel import HyperModel
from kerastuner.engine.hyperparameters import HyperParameters

(x, y), (val_x, val_y) = keras.datasets.mnist.load_data()
x = x.astype('float32') / 255.
val_x = val_x.astype('float32') / 255.

x = x[:10000]
y = y[:10000]

def build_model(hp):
    model = keras.Sequential()
    model.add(layers.Flatten(input_shape=(28, 28)))
    for i in range(hp.Int('num_layers', 2, 20)):
        model.add(layers.Dense(units=hp.Int('units_' + str(i), min_value=32, max_value=512, step=32),
                               activation='relu'))
    model.add(layers.Dense(10, activation='softmax'))
    model.compile(
        optimizer=keras.optimizers.Adam(
            hp.Choice('learning_rate', [1e-2, 1e-3, 1e-4])),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy'])
    return model

tuner = Hyperband(
    build_model,
    max_epochs=50,
    objective='val_accuracy',
    seed=20,
    executions_per_trial=1,
    directory='test_dir',
    project_name='daninhas_hyperband'
)

# tuner.search_space_summary()

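# Note: Hyperband manages each trial's epoch budget itself -- the epochs
# passed to search() are overridden per trial by the 'tuner/epochs' value
# (visible in the trial summaries below).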
tuner.search(x=x,
             y=y,
             epochs=50,
             validation_data=(val_x, val_y))

tuner.results_summary()

But even with max_epochs=50 and epochs=50, training runs for only 2 epochs per trial:

(...)
2020-06-03 12:55:23.245993: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-06-03 12:55:23.246022: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108]      
Epoch 1/2
313/313 [==============================] - 4s 12ms/step - loss: 2.3247 - accuracy: 0.1109 - val_loss: 2.3025 - val_accuracy: 0.1135
Epoch 2/2
313/313 [==============================] - 3s 9ms/step - loss: 2.3020 - accuracy: 0.1081 - val_loss: 2.3033 - val_accuracy: 0.1135
[Trial complete]
[Trial summary]
 |-Trial ID: 32396974f43cade5b6c3ef511b5548f3
 |-Score: 0.11349999904632568
 |-Best step: 0
 > Hyperparameters:
 |-learning_rate: 0.01
 |-num_layers: 19
 |-tuner/bracket: 3
 |-tuner/epochs: 2
 |-tuner/initial_epoch: 0
 |-tuner/round: 0
 |-units_0: 448
 |-units_1: 384
 |-units_10: 416
 |-units_11: 160
 |-units_12: 384
 |-units_13: 480
 |-units_14: 288
 |-units_15: 64
 |-units_16: 288
 |-units_17: 64
 |-units_18: 32
 |-units_2: 96
 |-units_3: 160
 |-units_4: 480
 |-units_5: 416
 |-units_6: 256
 |-units_7: 32
 |-units_8: 160
 |-units_9: 448
Epoch 1/2
313/313 [==============================] - 4s 11ms/step - loss: 2.3109 - accuracy: 0.1081 - val_loss: 2.3028 - val_accuracy: 0.1135
Epoch 2/2
313/313 [==============================] - 3s 10ms/step - loss: 2.3022 - accuracy: 0.1067 - val_loss: 2.3019 - val_accuracy: 0.1135
[Trial complete]
[Trial summary]
 |-Trial ID: 98376f698826a2068c3412301a7aece4
 |-Score: 0.11349999904632568
 |-Best step: 0
 > Hyperparameters:
 |-learning_rate: 0.01
 |-num_layers: 19
 |-tuner/bracket: 3
 |-tuner/epochs: 2
 |-tuner/initial_epoch: 0
 |-tuner/round: 0
 |-units_0: 480
 |-units_1: 320
 |-units_10: 320
 |-units_11: 64
 |-units_12: 128
 |-units_13: 32
 |-units_14: 416
 |-units_15: 288
 |-units_16: 320
 |-units_17: 480
 |-units_18: 256
 |-units_2: 480
 |-units_3: 320
 |-units_4: 288
 |-units_5: 192
 |-units_6: 224
 |-units_7: 256
 |-units_8: 256
 |-units_9: 352
(...)

Is there some configuration that forces only 2 epochs? Could this be a bug? Or am I missing something?

How can I get the model to train for more epochs?

Hyperploid answered 3/6/2020 at 16:00. Comments (4):
This is how the Hyperband algorithm works. It initially samples the hyperparameter space with a limited number of epochs to learn about the space, then runs the more promising models for more epochs. Use a small test dataset and let it run for a bit to see it in action. – Vacua
@Vacua Is the surface after 2 epochs supposed to bear any relation to that after N=50 (or more) epochs? – Fields
@Fields Yes, that is a key assumption of Hyperband. If the model being optimized behaves in such a way that the result after a few epochs does not relate to the result after many more epochs, then you are better off using something like a Bayesian hyperparameter algorithm, which runs each candidate model through its full course of epochs before making decisions. – Vacua
Otherwise, you may also try changing the default hyperband_iterations=1 parameter to make the results more stable. Alternatively, keras-tuner now has a BayesianOptimization class, but I have never tried it. – Goodspeed
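The 2-epoch budget in the log is consistent with Hyperband's successive-halving schedule. Here is a minimal sketch of that schedule, assuming (as the trial summaries above suggest) that each round of a bracket gets roughly ceil(max_epochs / factor**(bracket - round)) epochs; the exact formula inside keras-tuner may differ slightly.

import math

max_epochs, factor = 50, 3  # the question's settings (factor defaults to 3)

# Hyperband runs roughly log_factor(max_epochs) brackets:
# log_3(50) ~ 3.56 -> 4 brackets, numbered 3 down to 0.
num_brackets = math.ceil(math.log(max_epochs, factor))

for bracket in reversed(range(num_brackets)):
    for rnd in range(bracket + 1):
        epochs = math.ceil(max_epochs / factor ** (bracket - rnd))
        print(f"bracket {bracket}, round {rnd}: {epochs} epochs")

# bracket 3, round 0 -> 2 epochs: exactly the trials shown in the log above
# (tuner/bracket: 3, tuner/epochs: 2). Only models that survive every round
# of a bracket are trained for the full 50 epochs.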
You can change the factor parameter to change that. By default it is set to 3, but you can increase it so that the first round of trials gets more than 2 epochs.

See the docs:

The Hyperband tuning algorithm uses adaptive resource allocation and early-stopping to quickly converge on a high-performing model. This is done using a sports championship style bracket. The algorithm trains a large number of models for a few epochs and carries forward only the top-performing half of models to the next round. Hyperband determines the number of models to train in a bracket by computing 1 + log_factor(max_epochs) (log base factor) and rounding it up to the nearest integer.
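
For example, a hedged sketch of the question's setup with a larger factor: with factor=10 the schedule collapses to two brackets, so the cheapest first-round trials would get ceil(50 / 10) = 5 epochs instead of 2 (the project_name below is hypothetical).

tuner = Hyperband(
    build_model,
    max_epochs=50,
    factor=10,               # default is 3
    objective='val_accuracy',
    seed=20,
    executions_per_trial=1,
    directory='test_dir',
    project_name='daninhas_hyperband_factor10'  # hypothetical name
)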

Cesarean answered 31/8/2023 at 13:41.
