How to tell Keras to stop training based on loss value?

Currently I use the following code:

callbacks = [
    EarlyStopping(monitor='val_loss', patience=2, verbose=0),
    ModelCheckpoint(kfold_weights_path, monitor='val_loss', save_best_only=True, verbose=0),
]
model.fit(X_train.astype('float32'), Y_train, batch_size=batch_size, nb_epoch=nb_epoch,
      shuffle=True, verbose=1, validation_data=(X_valid, Y_valid),
      callbacks=callbacks)

It tells Keras to stop training when the loss hasn't improved for 2 epochs. But I want to stop training once the loss becomes smaller than some constant "THR":

if val_loss < THR:
    break

I've seen in the documentation that it is possible to make your own callback: http://keras.io/callbacks/ But I found nothing on how to stop the training process. I need some advice.

Atherton answered 18/5, 2016 at 8:2 Comment(0)
96

I found the answer. I looked into the Keras sources and found the code for EarlyStopping. I made my own callback based on it:

import warnings

from keras.callbacks import Callback


class EarlyStoppingByLossVal(Callback):
    def __init__(self, monitor='val_loss', value=0.00001, verbose=0):
        super(EarlyStoppingByLossVal, self).__init__()
        self.monitor = monitor
        self.value = value
        self.verbose = verbose

    def on_epoch_end(self, epoch, logs={}):
        current = logs.get(self.monitor)
        if current is None:
            warnings.warn("Early stopping requires %s available!" % self.monitor, RuntimeWarning)
        elif current < self.value:
            if self.verbose > 0:
                print("Epoch %05d: early stopping THR" % epoch)
            self.model.stop_training = True

And usage:

callbacks = [
    EarlyStoppingByLossVal(monitor='val_loss', value=0.00001, verbose=1),
    # EarlyStopping(monitor='val_loss', patience=2, verbose=0),
    ModelCheckpoint(kfold_weights_path, monitor='val_loss', save_best_only=True, verbose=0),
]
model.fit(X_train.astype('float32'), Y_train, batch_size=batch_size, nb_epoch=nb_epoch,
      shuffle=True, verbose=1, validation_data=(X_valid, Y_valid),
      callbacks=callbacks)
Atherton answered 18/5, 2016 at 9:56 Comment(7)
Just in case it's useful for someone: in my case I used monitor='loss', and it worked well. – Khudari
It seems Keras has been updated. The EarlyStopping callback now has min_delta built into it. No need to hack the source code anymore, yay! https://mcmap.net/q/234803/-how-to-tell-keras-stop-training-based-on-loss-value – Bireme
Upon re-reading the question and answers, I need to correct myself: min_delta means "Stop early if there is not enough improvement per epoch (or per multiple epochs)." However, the OP asked how to "Stop early when the loss gets below a certain level." – Bireme
NameError: name 'Callback' is not defined... How do I fix it? – Fabianfabianism
@Fabianfabianism try this: from keras.callbacks import Callback – Atherton
One correction: it should be elif: elif current < self.value: – Mokpo
@Bireme min_delta doesn't quite address the question of early stopping by an absolute value. Instead, min_delta works as a difference between values. – Catamenia
26

The keras.callbacks.EarlyStopping callback does have a min_delta argument. From the Keras documentation:

min_delta: minimum change in the monitored quantity to qualify as an improvement, i.e. an absolute change of less than min_delta, will count as no improvement.
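
For reference, a minimal sketch of how min_delta is typically combined with patience; the 0.001 and patience=3 values are only illustrative, and the X_train/Y_train names are reused from the question:

from keras.callbacks import EarlyStopping

# Stop if val_loss has not improved by at least 0.001 for 3 consecutive epochs.
early_stop = EarlyStopping(monitor='val_loss', min_delta=0.001, patience=3, verbose=1)
model.fit(X_train, Y_train, validation_data=(X_valid, Y_valid), callbacks=[early_stop])

Note that this still stops on lack of improvement, not on crossing an absolute threshold.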

Kersey answered 4/1, 2017 at 8:34 Comment(4)
For reference, here are the docs for an earlier version of Keras (1.1.0) in which the min_delta argument was not yet included: faroit.github.io/keras-docs/1.1.0/callbacks/#earlystopping – Bireme
how could I make it not stop until min_delta persists over multiple epochs? – Carpo
there's another parameter to EarlyStopping called patience: number of epochs with no improvement after which training will be stopped. – Kersey
While min_delta might be useful, it doesn't quite address the question of early stopping by an absolute value. Instead, min_delta works as a difference between values. – Catamenia
14

One solution is to call model.fit(nb_epoch=1, ...) inside a for loop; then you can put a break statement inside the loop and do whatever other custom control flow you want.
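
A rough sketch of that idea, reusing the variable names from the question; THR is a hypothetical threshold, not something from the answer:

THR = 0.001  # hypothetical loss threshold

for epoch in range(nb_epoch):
    # Train exactly one epoch per iteration so we can check the loss in between.
    history = model.fit(X_train.astype('float32'), Y_train, batch_size=batch_size, nb_epoch=1,
                        shuffle=True, verbose=1, validation_data=(X_valid, Y_valid))
    if history.history['val_loss'][-1] < THR:
        break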

Shot answered 18/5, 2016 at 9:29 Comment(1)
It'd be nice if they made a callback that takes in a single function that can do that. – Brittenybrittingham
11

I solved the same problem using a custom callback.

In the following custom callback code, assign THR the value at which you want to stop training, and add the callback to your model.

from keras.callbacks import Callback

class stopAtLossValue(Callback):

    def on_batch_end(self, batch, logs={}):
        THR = 0.03  # Assign THR the value at which you want to stop training.
        if logs.get('loss') <= THR:
            self.model.stop_training = True
Dwan answered 2/3, 2019 at 14:41 Comment(0)
2

While I was taking the TensorFlow in Practice specialization, I learned a very elegant technique. It is just slightly modified from the accepted answer.

Let's set up the example with our favorite MNIST data.

import tensorflow as tf

class new_callback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs={}):
        if logs.get('accuracy') > 0.90:  # select the accuracy
            print("\n !!! 90% accuracy, no further training !!!")
            self.model.stop_training = True

mnist = tf.keras.datasets.mnist

(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0 #normalize

callbacks = new_callback()

# model = tf.keras.models.Sequential([# define your model here])

model.compile(optimizer=tf.optimizers.Adam(),
          loss='sparse_categorical_crossentropy',
          metrics=['accuracy'])
model.fit(x_train, y_train, epochs=10, callbacks=[callbacks])

So, here I set metrics=['accuracy'], and thus in the callback class the condition is set on 'accuracy' > 0.90.

You can choose any metric and monitor the training like in this example. Most importantly, you can set different conditions for different metrics and use them simultaneously, as in the sketch below.
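
For instance, a hypothetical variant that checks two metrics at once; the 0.90 and 0.10 thresholds are made up for illustration:

class combined_callback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs={}):
        # Stop only when accuracy is high enough AND loss is low enough.
        if logs.get('accuracy') > 0.90 and logs.get('loss') < 0.10:
            print("\n !!! Both conditions met, no further training !!!")
            self.model.stop_training = True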

Hopefully this helps!

Lorenzen answered 13/4, 2020 at 15:33 Comment(1)
function name should be on_epoch_end – Aseptic
0

For me, the model would only stop training if I added a return statement after setting stop_training to True, because I was calling self.model.evaluate afterwards. So either make sure stop_training = True is the last thing in the function, or add a return statement right after it.

def on_epoch_end(self, batch, logs):
    self.epoch += 1
    self.stoppingCounter += 1
    print('\nstopping counter \n', self.stoppingCounter)

    # Stop training if there hasn't been any improvement in 'patience' epochs
    if self.stoppingCounter >= self.patience:
        self.model.stop_training = True
        return

    # Test on an additional set if there is one
    if self.testingOnAdditionalSet:
        evaluation = self.model.evaluate(self.val2X, self.val2Y, verbose=0)
        self.validationLoss2.append(evaluation[0])
        self.validationAcc2.append(evaluation[1])
Ukrainian answered 10/4, 2020 at 4:17 Comment(0)
-1

If you're using a custom training loop, you can use a collections.deque, which is a "rolling" list that can be appended to, with the left-hand items popped out once the list grows longer than maxlen. Here are the key lines:

loss_history = deque(maxlen=early_stopping + 1)

for epoch in range(epochs):
    fit(epoch)
    loss_history.append(test_loss.result().numpy())
    if len(loss_history) > early_stopping and loss_history.popleft() < min(loss_history):
        break

Here's a full example:

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
import tensorflow_datasets as tfds
import tensorflow as tf
from tensorflow.keras.layers import Dense
from collections import deque

data, info = tfds.load('iris', split='train', as_supervised=True, with_info=True)

data = data.map(lambda x, y: (tf.cast(x, tf.int32), y))

train_dataset = data.take(120).batch(4)
test_dataset = data.skip(120).take(30).batch(4)

model = tf.keras.models.Sequential([
    Dense(8, activation='relu'),
    Dense(16, activation='relu'),
    Dense(info.features['label'].num_classes)])

loss_object = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

train_loss = tf.keras.metrics.Mean()
test_loss = tf.keras.metrics.Mean()

train_acc = tf.keras.metrics.SparseCategoricalAccuracy()
test_acc = tf.keras.metrics.SparseCategoricalAccuracy()

opt = tf.keras.optimizers.Adam(learning_rate=1e-3)


@tf.function
def train_step(inputs, labels):
    with tf.GradientTape() as tape:
        logits = model(inputs, training=True)
        loss = loss_object(labels, logits)

    gradients = tape.gradient(loss, model.trainable_variables)
    opt.apply_gradients(zip(gradients, model.trainable_variables))
    train_loss(loss)
    train_acc(labels, logits)


@tf.function
def test_step(inputs, labels):
    logits = model(inputs, training=False)
    loss = loss_object(labels, logits)
    test_loss(loss)
    test_acc(labels, logits)


def fit(epoch):
    template = 'Epoch {:>2} Train Loss {:.3f} Test Loss {:.3f} ' \
               'Train Acc {:.2f} Test Acc {:.2f}'

    train_loss.reset_states()
    test_loss.reset_states()
    train_acc.reset_states()
    test_acc.reset_states()

    for X_train, y_train in train_dataset:
        train_step(X_train, y_train)

    for X_test, y_test in test_dataset:
        test_step(X_test, y_test)

    print(template.format(
        epoch + 1,
        train_loss.result(),
        test_loss.result(),
        train_acc.result(),
        test_acc.result()
    ))


def main(epochs=50, early_stopping=10):
    loss_history = deque(maxlen=early_stopping + 1)

    for epoch in range(epochs):
        fit(epoch)
        loss_history.append(test_loss.result().numpy())
        if len(loss_history) > early_stopping and loss_history.popleft() < min(loss_history):
            print(f'\nEarly stopping. No validation loss '
                  f'improvement in {early_stopping} epochs.')
            break

if __name__ == '__main__':
    main(epochs=250, early_stopping=10)

Output:

Epoch  1 Train Loss 1.730 Test Loss 1.449 Train Acc 0.33 Test Acc 0.33
Epoch  2 Train Loss 1.405 Test Loss 1.220 Train Acc 0.33 Test Acc 0.33
Epoch  3 Train Loss 1.173 Test Loss 1.054 Train Acc 0.33 Test Acc 0.33
Epoch  4 Train Loss 1.006 Test Loss 0.935 Train Acc 0.33 Test Acc 0.33
Epoch  5 Train Loss 0.885 Test Loss 0.846 Train Acc 0.33 Test Acc 0.33
...
Epoch 89 Train Loss 0.196 Test Loss 0.240 Train Acc 0.89 Test Acc 0.87
Epoch 90 Train Loss 0.195 Test Loss 0.239 Train Acc 0.89 Test Acc 0.87
Epoch 91 Train Loss 0.195 Test Loss 0.239 Train Acc 0.89 Test Acc 0.87
Epoch 92 Train Loss 0.194 Test Loss 0.239 Train Acc 0.90 Test Acc 0.87

Early stopping. No validation loss improvement in 10 epochs.
Jankell answered 17/8, 2020 at 20:37 Comment(0)
