InvalidArgumentError: required broadcastable shapes at loc(unknown)

Background

I am totally new to Python and to machine learning. I tried to set up a UNet from code I found on the internet and wanted to adapt it, bit by bit, to the case I'm working on. When trying to .fit the UNet to the training data, I received the following error:

InvalidArgumentError:  required broadcastable shapes at loc(unknown)
     [[node Equal (defined at <ipython-input-68-f1422c6f17bb>:1) ]] [Op:__inference_train_function_3847]

Searching for it turns up a lot of results, but most of them refer to different errors.

What does this mean? And, more importantly, how can I fix it?

The code that caused the error

The context of this error is as follows: I want to segment images and label the different classes. I set up directories "trn", "tst" and "val" for training, test, and validation data. The dir_dat() function applies os.path.join() to get the full path to the respective data set. Each of the three folders has subdirectories for each class, labeled with integers, and each of those contains some .tif images for the respective class.
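dir_dat() itself is not shown here; a minimal sketch consistent with the description above (base_dir is a placeholder, not the actual path on my machine) would be:

import os

base_dir = "C:/data"  # placeholder: the actual base directory is not shown in this question

def dir_dat(subset):
    # return the full path to the "trn", "tst" or "val" folder
    return os.path.join(base_dir, subset)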

I defined the following image data generators (training data is sparse, hence the augmentation):

import numpy as np
from tensorflow import keras as ks  # alias used throughout this question

classes = np.array([ 0,  2,  4,  6,  8, 11, 16, 21, 29, 30, 38, 39, 51])
bs = 15 # batch size

augGen = ks.preprocessing.image.ImageDataGenerator(rotation_range = 365,
                                                   width_shift_range = 0.05,
                                                   height_shift_range = 0.05,
                                                   horizontal_flip = True,
                                                   vertical_flip = True,
                                                   fill_mode = "nearest") \
    .flow_from_directory(directory = dir_dat("trn"),
                         classes = [str(x) for x in classes.tolist()],
                         class_mode = "categorical",
                         batch_size = bs, seed = 42)
    
tst_batches = ks.preprocessing.image.ImageDataGenerator() \
    .flow_from_directory(directory = dir_dat("tst"),
                         classes = [str(x) for x in classes.tolist()],
                         class_mode = "categorical",
                         batch_size = bs, shuffle = False)

val_batches = ks.preprocessing.image.ImageDataGenerator() \
    .flow_from_directory(directory = dir_dat("val"),
                         classes = [str(x) for x in classes.tolist()],
                         class_mode = "categorical",
                         batch_size = bs)

Then I set up the UNet based on this example. I altered a few parameters to adapt the UNet to the multi-class situation, namely the activation in the last layer and the loss function:

# imgr, imgc, imgdim: image rows, columns and channels (defined elsewhere)
layer_in = ks.layers.Input(shape = (imgr, imgc, imgdim))
# convert pixel integer values to float
inVals = ks.layers.Lambda(lambda x: x / 255)(layer_in)

# Contraction path
c1 = ks.layers.Conv2D(16, (3, 3), activation = "relu",
                            kernel_initializer = "he_normal", padding = "same")(inVals)
c1 = ks.layers.Dropout(0.1)(c1)
c1 = ks.layers.Conv2D(16, (3, 3), activation = "relu",
                            kernel_initializer = "he_normal", padding = "same")(c1)
p1 = ks.layers.MaxPooling2D((2, 2))(c1)

c2 = ks.layers.Conv2D(32, (3, 3), activation = "relu",
                            kernel_initializer = "he_normal", padding = "same")(p1)
c2 = ks.layers.Dropout(0.1)(c2)
c2 = ks.layers.Conv2D(32, (3, 3), activation = "relu",
                            kernel_initializer = "he_normal", padding = "same")(c2)
p2 = ks.layers.MaxPooling2D((2, 2))(c2)
 
c3 = ks.layers.Conv2D(64, (3, 3), activation = "relu",
                            kernel_initializer = "he_normal", padding = "same")(p2)
c3 = ks.layers.Dropout(0.2)(c3)
c3 = ks.layers.Conv2D(64, (3, 3), activation = "relu",
                            kernel_initializer = "he_normal", padding = "same")(c3)
p3 = ks.layers.MaxPooling2D((2, 2))(c3)
 
c4 = ks.layers.Conv2D(128, (3, 3), activation = "relu",
                            kernel_initializer = "he_normal", padding = "same")(p3)
c4 = ks.layers.Dropout(0.2)(c4)
c4 = ks.layers.Conv2D(128, (3, 3), activation = "relu",
                            kernel_initializer = "he_normal", padding = "same")(c4)
p4 = ks.layers.MaxPooling2D(pool_size = (2, 2))(c4)
 
c5 = ks.layers.Conv2D(256, (3, 3), activation = "relu",
                            kernel_initializer = "he_normal", padding = "same")(p4)
c5 = ks.layers.Dropout(0.3)(c5)
c5 = ks.layers.Conv2D(256, (3, 3), activation = "relu",
                            kernel_initializer = "he_normal", padding = "same")(c5)

# Expansive path 
u6 = ks.layers.Conv2DTranspose(128, (2, 2), strides = (2, 2), padding = "same")(c5)
u6 = ks.layers.concatenate([u6, c4])
c6 = ks.layers.Conv2D(128, (3, 3), activation = "relu",
                            kernel_initializer = "he_normal", padding = "same")(u6)
c6 = ks.layers.Dropout(0.2)(c6)
c6 = ks.layers.Conv2D(128, (3, 3), activation = "relu",
                            kernel_initializer = "he_normal", padding = "same")(c6)
 
u7 = ks.layers.Conv2DTranspose(64, (2, 2), strides = (2, 2), padding = "same")(c6)
u7 = ks.layers.concatenate([u7, c3])
c7 = ks.layers.Conv2D(64, (3, 3), activation = "relu",
                            kernel_initializer = "he_normal", padding = "same")(u7)
c7 = ks.layers.Dropout(0.2)(c7)
c7 = ks.layers.Conv2D(64, (3, 3), activation = "relu",
                            kernel_initializer = "he_normal", padding = "same")(c7)
 
u8 = ks.layers.Conv2DTranspose(32, (2, 2), strides = (2, 2), padding = "same")(c7)
u8 = ks.layers.concatenate([u8, c2])
c8 = ks.layers.Conv2D(32, (3, 3), activation = "relu",
                            kernel_initializer = "he_normal", padding = "same")(u8)
c8 = ks.layers.Dropout(0.1)(c8)
c8 = ks.layers.Conv2D(32, (3, 3), activation = "relu",
                            kernel_initializer = "he_normal", padding = "same")(c8)
 
u9 = ks.layers.Conv2DTranspose(16, (2, 2), strides = (2, 2), padding = "same")(c8)
u9 = ks.layers.concatenate([u9, c1], axis = 3)
c9 = ks.layers.Conv2D(16, (3, 3), activation = "relu",
                            kernel_initializer = "he_normal", padding = "same")(u9)
c9 = ks.layers.Dropout(0.1)(c9)
c9 = ks.layers.Conv2D(16, (3, 3), activation = "relu",
                            kernel_initializer = "he_normal", padding = "same")(c9)
 
out = ks.layers.Conv2D(1, (1, 1), activation = "softmax")(c9)
 
model = ks.Model(inputs = layer_in, outputs = out)
model.compile(optimizer = "adam", loss = "sparse_categorical_crossentropy", metrics = ["accuracy"])
model.summary()

Finally, I defined callbacks and ran the training, which produced the error:

cllbs = [
    ks.callbacks.EarlyStopping(patience = 4),
    ks.callbacks.ModelCheckpoint(dir_out("Checkpoint.h5"), save_best_only = True),
    ks.callbacks.TensorBoard(log_dir = './logs'),  # log events for TensorBoard
    ]

model.fit(augGen, epochs = 5, validation_data = val_batches, callbacks = cllbs)

Full console output

This is the full output when running the last line (in case it helps solve the issue):

trained = model.fit(augGen, epochs = 5, validation_data = val_batches, callbacks = cllbs)
Epoch 1/5
Traceback (most recent call last):

  File "<ipython-input-68-f1422c6f17bb>", line 1, in <module>
    trained = model.fit(augGen, epochs = 5, validation_data = val_batches, callbacks = cllbs)

  File "c:\users\manuel\python\lib\site-packages\tensorflow\python\keras\engine\training.py", line 1183, in fit
    tmp_logs = self.train_function(iterator)

  File "c:\users\manuel\python\lib\site-packages\tensorflow\python\eager\def_function.py", line 889, in __call__
    result = self._call(*args, **kwds)

  File "c:\users\manuel\python\lib\site-packages\tensorflow\python\eager\def_function.py", line 950, in _call
    return self._stateless_fn(*args, **kwds)

  File "c:\users\manuel\python\lib\site-packages\tensorflow\python\eager\function.py", line 3023, in __call__
    return graph_function._call_flat(

  File "c:\users\manuel\python\lib\site-packages\tensorflow\python\eager\function.py", line 1960, in _call_flat
    return self._build_call_outputs(self._inference_function.call(

  File "c:\users\manuel\python\lib\site-packages\tensorflow\python\eager\function.py", line 591, in call
    outputs = execute.execute(

  File "c:\users\manuel\python\lib\site-packages\tensorflow\python\eager\execute.py", line 59, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,

InvalidArgumentError:  required broadcastable shapes at loc(unknown)
     [[node Equal (defined at <ipython-input-68-f1422c6f17bb>:1) ]] [Op:__inference_train_function_3847]

Function call stack:
train_function
Kerley answered 16/5, 2021 at 13:59 Comment(0)

I found several issues here. The model was intended to be used for semantic segmentation with several classes (which is why I had changed the output layer activation to "softmax" and set the "sparse_categorical_crossentropy" loss). For that, class_mode has to be set to None in the ImageDataGenerators, and classes must not be provided. Instead, I needed to pass the manually classified (mask) images as y; a rough sketch is shown below. I guess beginners make a lot of beginner mistakes.
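A minimal sketch of such a paired image/mask generator setup (the directory names "trn_img" and "trn_msk" and the target size are assumptions for illustration, not my exact code):

from tensorflow import keras as ks

seed = 42
img_gen = ks.preprocessing.image.ImageDataGenerator().flow_from_directory(
    directory = "trn_img", class_mode = None, target_size = (128, 128), seed = seed)
msk_gen = ks.preprocessing.image.ImageDataGenerator().flow_from_directory(
    directory = "trn_msk", class_mode = None, color_mode = "grayscale",
    target_size = (128, 128), seed = seed)

# with class_mode = None each generator yields only the arrays themselves, so
# zipping them produces (image_batch, mask_batch) pairs that model.fit() accepts
trn_batches = zip(img_gen, msk_gen)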

Kerley answered 29/5, 2021 at 9:7 Comment(0)

I faced this problem when the number of class labels did not match the output layer's output shape.

For example, suppose there are 10 class labels and the output layer is defined as:

output = tf.keras.layers.Conv2D(5, (1, 1), activation = "softmax")(c9)

Since the number of class labels (10) is not equal to the number of output channels (5), we will get this error.

Ensure that the number of class labels matches the output layer's output shape.
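For instance, a corrected version of that layer for 10 class labels (keeping the tf.keras style of this answer; n_classes is just an illustrative variable) would be:

n_classes = 10  # must equal the number of class labels
output = tf.keras.layers.Conv2D(n_classes, (1, 1), activation = "softmax")(c9)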

Imitation answered 13/8, 2021 at 5:19 Comment(0)

Just add a Flatten() layer before the fully connected layers.
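A generic illustration of that suggestion (conv_output and the layer size are placeholders, not taken from the question's UNet):

x = ks.layers.Flatten()(conv_output)              # flatten the 4-D convolutional tensor to 2-D
x = ks.layers.Dense(128, activation = "relu")(x)  # the dense layer now receives a matching shape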

Nayarit answered 23/5, 2022 at 10:46 Comment(3)
As it's currently written, your answer is unclear. Please edit to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers in the help center. – Drought
Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center. – Stubborn
Saved me a lot of time. – Wavawave

I got the same issue because the number of classes I used in the model (for the output layer) differed from the actual number of classes in the labels/masks array. You have a similar issue here: you have 13 classes, but your output layer is given only 1. The best way is to avoid hard-coding the number of classes: pass a variable (like n_classes) to the model and declare it before building the model, for instance n_classes = y_Train.shape[-1] or n_classes = len(np.unique(y_Train)).
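A short sketch of that suggestion (y_Train is the label/mask array assumed in this answer, and c9 is the last decoder tensor from the question):

import numpy as np

n_classes = len(np.unique(y_Train))   # for integer-encoded masks
# n_classes = y_Train.shape[-1]       # for one-hot-encoded masks
out = ks.layers.Conv2D(n_classes, (1, 1), activation = "softmax")(c9)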

Fenland answered 3/11, 2021 at 7:9 Comment(0)

Check whether the inputs to each ks.layers.concatenate layer have compatible dimensions. For example, for ks.layers.concatenate([u7, c3]), check that the u7 and c3 tensors have the same shape along every axis except the concatenation axis passed to ks.layers.concatenate (axis = -1 by default, i.e. the last dimension). To illustrate: if you call ks.layers.concatenate([u7, c3], axis = 0), then every axis of u7 and c3 except the first must match exactly, e.g. u7.shape = [3, 4, 5] and c3.shape = [6, 4, 5].
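A self-contained illustration of that rule with generic tensors (not the ones from the question):

import tensorflow as tf

a = tf.zeros([2, 4, 5])
b = tf.zeros([2, 4, 3])
ok = tf.keras.layers.concatenate([a, b], axis = -1)   # works: only the last axis differs, result shape (2, 4, 8)
# tf.keras.layers.concatenate([a, b], axis = 0)       # fails: axis 2 (5 vs 3) would also have to match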

Jemmy answered 29/5, 2021 at 8:35 Comment(0)

I was facing the same error. Check all your concatenation and multiplication layers, or any layer where a similar element-wise operation happens.

This error shows up when there is a shape mismatch and the model is not able to perform that operation.

Seville answered 31/1, 2023 at 9:35 Comment(0)
