Assertion failed: predictions must be >= 0, Condition x >= y did not hold element-wise

4

11

I am training a multi-class model (40 classes in total) for 2000 epochs. The model runs fine through epoch 828, but at epoch 829 it raises an InvalidArgumentError (see the screenshot below).

[Screenshot of the InvalidArgumentError; image not preserved]

Below is the code that I used to build my model.

import tensorflow as tf

n_cats = 40
# shape must be a tuple, so (40,) rather than (40)
input_bow = tf.keras.Input(shape=(40,), name="bow")
hidden_1 = tf.keras.layers.Dense(200, activation="relu")(input_bow)
hidden_2 = tf.keras.layers.Dense(100, activation="relu")(hidden_1)
hidden_3 = tf.keras.layers.Dense(80, activation="relu")(hidden_2)
hidden_4 = tf.keras.layers.Dense(70, activation="relu")(hidden_3)
output = tf.keras.layers.Dense(n_cats, activation="sigmoid")(hidden_4)

model = tf.keras.Model(inputs=[input_bow], outputs=output)

METRICS = [
    tf.keras.metrics.Accuracy(name="Accuracy"),
    tf.keras.metrics.Precision(name="precision"),
    tf.keras.metrics.Recall(name="recall"),
    tf.keras.metrics.AUC(name="auc"),
    tf.keras.metrics.BinaryAccuracy(name="binaryAcc")
]

checkpoint_cb = tf.keras.callbacks.ModelCheckpoint(
    "my_keras_model.h5", save_best_only=True)
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(initial_learning_rate=1e-2,
                                                             decay_steps=10000,
                                                             decay_rate=0.9)


adam_optimizer = tf.keras.optimizers.Adam(learning_rate=lr_schedule)
model.compile(loss="categorical_crossentropy",
              optimizer="adam", metrics=METRICS)

training_history = model.fit(
    (bow_train),
    indus_cat_train,
    epochs=2000,
    batch_size=128,
    callbacks=[checkpoint_cb],
    validation_data=(bow_test, indus_cat_test))

Please help me understand this behavior of TensorFlow. What is causing this error? I have read this and this, but neither seems to explain my case.

Culpa answered 30/7, 2020 at 9:46 Comment(1)
Have you found the solution yet? I get the same error. - Lailaibach
22

I think this error comes from the AUC metric (see https://www.tensorflow.org/api_docs/python/tf/keras/metrics/AUC). The metric expects all predictions to be non-negative, but your model is outputting [-nan, -nan, ...]. You can try the suggestions at http://deeplearning.net/software/theano/tutorial/nan_tutorial.html to deal with the NaNs. If you just want to get rid of the error quickly, you can remove the AUC metric from the list.
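As a rough illustration of why NaN predictions trip a `>= 0` assertion: every ordered comparison involving NaN evaluates to false, so an element-wise `x >= 0` check fails as soon as one prediction is NaN. The sketch below is plain Python, and the helper `first_invalid_prediction` is a hypothetical name used only for illustration; it mimics the kind of check named in the error message, not TensorFlow's actual implementation.

```python
def first_invalid_prediction(predictions):
    """Return the index of the first value that fails p >= 0, or None.

    Hypothetical helper for illustration only; not a TensorFlow API.
    """
    for i, p in enumerate(predictions):
        # NaN compares false against everything, so it fails this test too.
        if not (p >= 0.0):
            return i
    return None


nan = float("nan")
print(nan >= 0)                                   # False
print(first_invalid_prediction([0.2, 0.9, 0.0]))  # None
print(first_invalid_prediction([0.2, nan, 0.5]))  # 1
```

During training, adding `tf.keras.callbacks.TerminateOnNaN()` to the callbacks list may help catch the point where the loss first becomes NaN, before the metric assertion fires.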

Antiworld answered 21/9, 2020 at 4:20 Comment(1)
The second link is broken. Can you provide the updated link? - Lailaibach
3

I had the exact same problem in my multilabel classification LSTM model. During tuning I found that the larger the learning rate, the more likely this error is to occur. Your initial_learning_rate=1e-2 might already be too high for your problem. For my model, I observed the following:

lr=0.1 -> error always occurs
lr=0.01 -> error occurs very rarely
lr=0.05 -> error has never occurred (so far)

These values are based solely on my observations during tuning with early stopping, so I assume that the risk of this error is actually higher for full training runs. Also, the error seemed to be independent of the neural net's topology.

The answer above by @awilliea states that the error is related to the AUC metric. I cannot say for sure whether that is correct, but I can confirm that removing AUC and some other metrics, as suggested, would have worked for my problem too: when I tested my model without these metrics, the error never occurred at any learning rate. Still, most problems need those metrics, so I suggest solving the problem via the learning rate.
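To get a feel for the learning rates involved, here is a plain-Python sketch of the non-staircase exponential decay formula, `initial_lr * decay_rate ** (step / decay_steps)`, using the parameters from the question. The `steps_per_epoch` value is hypothetical, since the question does not state the dataset size. Note also that the code as posted passes `optimizer="adam"` to `compile`, so the `adam_optimizer` built from `lr_schedule` is apparently never used, and Adam's default learning rate (0.001) would apply instead.

```python
def exponential_decay(step, initial_lr=1e-2, decay_steps=10000, decay_rate=0.9):
    """Learning rate after `step` optimizer steps (non-staircase decay)."""
    return initial_lr * decay_rate ** (step / decay_steps)


# Hypothetical value: the question does not state the dataset size.
steps_per_epoch = 100

for epoch in (0, 100, 829, 2000):
    step = epoch * steps_per_epoch
    print(f"epoch {epoch}: lr = {exponential_decay(step):.6f}")
```

If the schedule is meant to take effect, the optimizer object itself would need to be passed, for example `model.compile(..., optimizer=adam_optimizer, ...)`.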

Diphosgene answered 27/1, 2023 at 21:12 Comment(0)
1

In your output Dense layer, you have to set the activation function to "softmax", as this is a multi-class classification problem.

Also, metrics like "binaryAcc" and "AUC" won't work here, as they are intended for binary classification.
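The difference between the two activations can be sketched in plain Python. Categorical cross-entropy expects each prediction row to be a probability distribution over the classes; softmax produces one, while sigmoid squashes each output independently. The logits below are hypothetical values chosen for illustration.

```python
import math


def sigmoid(logits):
    """Squash each logit independently into (0, 1)."""
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


def softmax(logits):
    """Turn logits into a probability distribution that sums to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, -1.0]  # hypothetical scores for three classes
print(sum(sigmoid(logits)))  # generally not 1
print(sum(softmax(logits)))  # 1 up to floating-point rounding
```

In the question's model, this would correspond to using `activation="softmax"` on the final Dense layer when the targets are one-hot encoded.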

Couchant answered 11/8, 2022 at 10:0 Comment(0)
0

This is fixed in TensorFlow 2.10: https://github.com/keras-team/keras/issues/15715#issuecomment-1100795008

Lovell answered 27/2, 2023 at 16:20 Comment(1)
While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. Link-only answers can become invalid if the linked page changes. - From Review - Digastric
