ValueError: Shapes (None, 1) and (None, 2) are incompatible
I am training a facial expression (angry vs. happy) model. The last dense output layer previously had 1 unit, but when I predicted an image the output was always 1, with 64% accuracy. So I changed it to 2 units for 2 outputs. But now I am getting this error:

Epoch 1/15

---------------------------------------------------------------------------

ValueError                                Traceback (most recent call last)

<ipython-input-54-9c7272c38dcb> in <module>()
     11     epochs=epochs,
     12     validation_data = val_data_gen,
---> 13     validation_steps = validation_steps,
     14 
     15 )

10 frames

/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/func_graph.py in wrapper(*args, **kwargs)
    966           except Exception as e:  # pylint:disable=broad-except
    967             if hasattr(e, "ag_error_metadata"):
--> 968               raise e.ag_error_metadata.to_exception(e)
    969             else:
    970               raise

ValueError: in user code:

    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:571 train_function  *
        outputs = self.distribute_strategy.run(
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/distribute/distribute_lib.py:951 run  **
        return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/distribute/distribute_lib.py:2290 call_for_each_replica
        return self._call_for_each_replica(fn, args, kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/distribute/distribute_lib.py:2649 _call_for_each_replica
        return fn(*args, **kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:533 train_step  **
        y, y_pred, sample_weight, regularization_losses=self.losses)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/compile_utils.py:205 __call__
        loss_value = loss_obj(y_t, y_p, sample_weight=sw)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/losses.py:143 __call__
        losses = self.call(y_true, y_pred)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/losses.py:246 call
        return self.fn(y_true, y_pred, **self._fn_kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/losses.py:1527 categorical_crossentropy
        return K.categorical_crossentropy(y_true, y_pred, from_logits=from_logits)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/backend.py:4561 categorical_crossentropy
        target.shape.assert_is_compatible_with(output.shape)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/tensor_shape.py:1117 assert_is_compatible_with
        raise ValueError("Shapes %s and %s are incompatible" % (self, other))

    ValueError: Shapes (None, 1) and (None, 2) are incompatible

The relevant code is:

model = Sequential([
    Conv2D(32, 3, activation='relu', input_shape=(48, 48, 1)),
    BatchNormalization(),
    MaxPooling2D(pool_size=(3, 3)),

    Flatten(),
    Dense(512, activation='relu'),
    Dense(2, activation='softmax')
])
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])


model.summary()

Model: "sequential_4"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_6 (Conv2D)            (None, 46, 46, 32)        320       
_________________________________________________________________
batch_normalization_4 (Batch (None, 46, 46, 32)        128       
_________________________________________________________________
max_pooling2d_6 (MaxPooling2 (None, 15, 15, 32)        0         
_________________________________________________________________
flatten_4 (Flatten)          (None, 7200)              0         
_________________________________________________________________
dense_8 (Dense)              (None, 512)               3686912   
_________________________________________________________________
dense_9 (Dense)              (None, 2)                 1026      
=================================================================
Total params: 3,688,386
Trainable params: 3,688,322
Non-trainable params: 64
_________________________________________________________________


epochs = 15
steps_per_epoch = train_data_gen.n//train_data_gen.batch_size
validation_steps = val_data_gen.n//val_data_gen.batch_size



history = model.fit(
    x=train_data_gen,
    steps_per_epoch=steps_per_epoch,
    epochs=epochs,
    validation_data=val_data_gen,
    validation_steps=validation_steps,
)
Fibril answered 12/5, 2020 at 2:7 Comment(7)
Well, for one thing, if your output is binary you need to be using sigmoid for your final layer rather than softmax, and binary_crossentropy.Swiercz
@Swiercz I have replaced softmax with sigmoid; again the same error.Fibril
Well, did you adjust the units of the last Dense layer from 2 to 1, since there's only one output variable?Swiercz
@Swiercz I did; the error is gone, but again the prediction is always the same, with 60% accuracy.Fibril
And you switched to binary_crossentropy for your loss, correct?Swiercz
@Swiercz Oh no, I didn't; I forgot about it. I just changed it to binary_crossentropy and it works with 90% accuracy. Thank you so much for helping. I am still new to Keras.Fibril
Awesome, I'll add my answer below so you can mark the question as solved.Swiercz
Change categorical cross-entropy to binary cross-entropy, since your output label is binary. Also change softmax to sigmoid, since sigmoid is the proper activation function for binary output.
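
A minimal sketch of both changes, applied to the model from the question:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Conv2D, BatchNormalization,
                                     MaxPooling2D, Flatten, Dense)

model = Sequential([
    Conv2D(32, 3, activation='relu', input_shape=(48, 48, 1)),
    BatchNormalization(),
    MaxPooling2D(pool_size=(3, 3)),
    Flatten(),
    Dense(512, activation='relu'),
    Dense(1, activation='sigmoid'),  # one unit + sigmoid for a binary label
])
model.compile(optimizer='adam',
              loss='binary_crossentropy',  # matches (None, 1) labels
              metrics=['accuracy'])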

Swiercz answered 12/5, 2020 at 2:52 Comment(5)
Can you also add the part about using the correct activation function, with an explanation? The answer would be complete then.Metatherian
Great, upvoted. @faiza, can you please accept this answer, as this is the one which solved your error?Metatherian
loss='binary_crossentropy' activation='sigmoid'Trondheim
This leads me to another error: ValueError: logits and labels must have the same shape ((None, 1) vs (None, 762)), which is related to this SO questionTrondheim
Check the answer by @Muhammad Zakaria it solved the "logits and labels error"Speechless
I was facing the same problem. My shapes were:

shape of X (271, 64, 64, 3)
shape of y (271,)
shape of trainX (203, 64, 64, 3)
shape of trainY (203, 1)
shape of testX (68, 64, 64, 3)
shape of testY (68, 1)

and

loss="categorical_crossentropy"

I changed it to

loss="sparse_categorical_crossentropy"

and it worked like a charm for me.
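
A minimal sketch of this fix (the model and num_classes here are illustrative assumptions, not from the original post): integer labels of shape (N,) or (N, 1) work directly with sparse_categorical_crossentropy, so no one-hot encoding is needed.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, Flatten, Dense

num_classes = 2  # illustrative; labels must be integers 0..num_classes-1

model = Sequential([
    Conv2D(32, 3, activation='relu', input_shape=(64, 64, 3)),
    Flatten(),
    Dense(num_classes, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',  # integer labels
              metrics=['accuracy'])

# trainX: (N, 64, 64, 3) images, trainY: (N,) or (N, 1) integer class ids
# model.fit(trainX, trainY, epochs=5)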

Leveille answered 7/3, 2021 at 4:0 Comment(1)
If you want to use "categorical_crossentropy", the labels should be one-hot encoded. When your labels are given as integers, changing to "sparse_categorical_crossentropy" is required. The advantage of using "categorical_crossentropy" is that it can give you class probabilities, which might be useful in some cases.Tsai
If your dataset was loaded with image_dataset_from_directory, use label_mode='categorical':

train_ds = tf.keras.preprocessing.image_dataset_from_directory(
  path,
  label_mode='categorical'
)

Or, if it was loaded with flow_from_directory or flow_from_dataframe, use class_mode='categorical':

train_ds = ImageDataGenerator().flow_from_directory(
  path,
  class_mode='categorical'
)
Teflon answered 9/10, 2020 at 7:21 Comment(3)
The above answer helped me. This answer and @mike's answer are a good combination to think about for solving this type of issue.Nymphalid
Take note that "categorial" -> "categorical" in first code snippet. The suggested edits queue is full.Barringer
label_mode will not work; it should be class_mode.Seminary
You can change the labels from binary values to categorical and continue with the same code. For example,

from keras.utils import to_categorical
one_hot_label = to_categorical(input_labels)
# e.g. [1, 0, 0, ..., 0] --> [[0, 1], [1, 0], ..., [1, 0]]

You can go through this link to understand the Keras API better.

If you want to use categorical cross-entropy for two classes, use softmax and one-hot encode the labels. Alternatively, for binary classification, you can use binary cross-entropy with a sigmoid activation function, as mentioned in the previous answer.

  1. Categorical cross-entropy:
model = Sequential([
    Conv2D(32,3, activation='relu', input_shape=(48,48,1)),
    BatchNormalization(),
    MaxPooling2D(pool_size=(3, 3)),

    Flatten(),
    Dense(512, activation='relu'),
    Dense(2,activation='softmax')  # activation change
])
model.compile(optimizer='adam',
              loss='categorical_crossentropy', # Loss
              metrics=['accuracy'])
  2. Binary cross-entropy:
model = Sequential([
    Conv2D(32,3, activation='relu', input_shape=(48,48,1)),
    BatchNormalization(),
    MaxPooling2D(pool_size=(3, 3)),

    Flatten(),
    Dense(512, activation='relu'),
    Dense(1,activation='sigmoid') #activation change
])
model.compile(optimizer='adam',
              loss='binary_crossentropy', # Loss
              metrics=['accuracy'])

Outflank answered 12/5, 2020 at 8:49 Comment(2)
Can you please help with this problem: #68225832Urissa
This is the answer that worked for me. It should be accepted.Gatecrasher
Changing from 'categorical_crossentropy' to 'sparse_categorical_crossentropy' worked for me in the case of multi-label classification.

Antipater answered 2/1, 2022 at 9:3 Comment(2)
You mean sparse_categorical_crossentropy?Hygroscopic
Why does categorical_crossentropy not work, then?Liquid
I was facing the same problem. I changed class_mode='binary' to class_mode='categorical' in the flow_from_directory method, and that worked for me.

Astrionics answered 23/6, 2020 at 4:58 Comment(0)
As @Akash pointed out, you should convert your labels to one-hot encoding, like so:

y = keras.utils.to_categorical(y, num_classes=num_classes_in_your_case)
Lupe answered 6/6, 2021 at 18:15 Comment(0)
I encountered this problem myself, and in my case it was in the declaration of the model. I was trying to use VGG16 for transfer learning and used the wrong layer as the output: instead of the prediction layer I had created, I used another layer. So if you encounter this error, check your model for a misplaced layer.
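
A minimal sketch of the pitfall described above, with illustrative names and layer sizes (assumptions, not the original code): the outputs argument of Model must point at the new prediction layer, not at a layer from the base network.

from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Model

base = VGG16(weights='imagenet', include_top=False, input_shape=(48, 48, 3))
base.trainable = False  # freeze the convolutional base

x = Flatten()(base.output)
x = Dense(512, activation='relu')(x)
predictions = Dense(2, activation='softmax')(x)  # the new prediction head

# Correct: wire the model to the new head. Passing some other layer as
# outputs reproduces the label/output shape mismatch.
model = Model(inputs=base.input, outputs=predictions)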

Matrilocal answered 13/5, 2021 at 20:39 Comment(0)
