TensorFlow: logits and labels must have the same first dimension

I am new to TensorFlow and I want to adapt the MNIST tutorial https://www.tensorflow.org/tutorials/layers to my own data (40x40 images). This is my model function:

def cnn_model_fn(features, labels, mode):
        # Input Layer
        input_layer = tf.reshape(features, [-1, 40, 40, 1])

        # Convolutional Layer #1
        conv1 = tf.layers.conv2d(
                inputs=input_layer,
                filters=32,
                kernel_size=[5, 5],
                #  To specify that the output tensor should have the same width and height values as the input tensor
                # value can be "same" or "valid"
                padding="same",
                activation=tf.nn.relu)

        # Pooling Layer #1
        pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2], strides=2)

        # Convolutional Layer #2 and Pooling Layer #2
        conv2 = tf.layers.conv2d(
                inputs=pool1,
                filters=64,
                kernel_size=[5, 5],
                padding="same",
                activation=tf.nn.relu)
        pool2 = tf.layers.max_pooling2d(inputs=conv2, pool_size=[2, 2], strides=2)

        # Dense Layer
        pool2_flat = tf.reshape(pool2, [-1, 10 * 10 * 64])
        dense = tf.layers.dense(inputs=pool2_flat, units=1024, activation=tf.nn.relu)
        dropout = tf.layers.dropout(
                inputs=dense, rate=0.4, training=mode == tf.estimator.ModeKeys.TRAIN)

        # Logits Layer
        logits = tf.layers.dense(inputs=dropout, units=2)

        predictions = {
            # Generate predictions (for PREDICT and EVAL mode)
            "classes":       tf.argmax(input=logits, axis=1),
            # Add `softmax_tensor` to the graph. It is used for PREDICT and by the
            # `logging_hook`.
            "probabilities": tf.nn.softmax(logits, name="softmax_tensor")
        }

        if mode == tf.estimator.ModeKeys.PREDICT:
            return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions)

        # Calculate Loss (for both TRAIN and EVAL modes)
        loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)

        # Configure the Training Op (for TRAIN mode)
        if mode == tf.estimator.ModeKeys.TRAIN:
            optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001)
            train_op = optimizer.minimize(
                    loss=loss,
                    global_step=tf.train.get_global_step())
            return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op)

        # Add evaluation metrics (for EVAL mode)
        eval_metric_ops = {
            "accuracy": tf.metrics.accuracy(
                    labels=labels, predictions=predictions["classes"])}
        return tf.estimator.EstimatorSpec(
                mode=mode, loss=loss, eval_metric_ops=eval_metric_ops)

I get a shape mismatch error between labels and logits:

InvalidArgumentError (see above for traceback): logits and labels must have the same first dimension, got logits shape [3,2] and labels shape [1]

filenames_array is an array of 16 strings

["file1.png", "file2.png", "file3.png", ...]

and labels_array is an array of 16 integers

[0,0,1,1,0,1,0,0,0,...]

The main function is:

# Create the Estimator
mnist_classifier = tf.estimator.Estimator(model_fn=cnn_model_fn, model_dir="/tmp/test_convnet_model")

# Train the model
cust_train_input_fn = lambda: train_input_fn_custom(
        filenames_array=filenames, labels_array=labels, batch_size=1)

mnist_classifier.train(
        input_fn=cust_train_input_fn,
        steps=20000,
        hooks=[logging_hook])

I tried to reshape logits, without success:

logits = tf.reshape(logits, [1, 2])

I need your help, thanks.


EDIT

After searching some more: in the first line of my model function,

input_layer = tf.reshape(features, [-1, 40, 40, 1])

the "-1" that signifies that the batch_size dimension will be dynamically calculated have here the value "3". The same "3" as in my error : logits and labels must have the same first dimension, got logits shape [3,2] and labels shape [1]

If I force the value to "1", I get this new error:

Input to reshape is a tensor with 4800 values, but the requested shape has 1600

Maybe a problem with my features? (Note that 4800 = 3 × 1600 = 3 × 40 × 40, so the reshape receives three 40x40 images' worth of values where I expect one.)


EDIT2 :

The complete code is here: https://gist.github.com/geoffreyp/cc8e97aab1bff4d39e10001118c6322e


EDIT3

I updated the gist with

logits = tf.layers.dense(inputs=dropout, units=1)

https://gist.github.com/geoffreyp/cc8e97aab1bff4d39e10001118c6322e

But I don't completely understand your answer about the batch size: how can the batch size be 3 here when I chose a batch size of 1?

If I choose batch_size = 3, I get this error: logits and labels must have the same first dimension, got logits shape [9,1] and labels shape [3]

I tried to reshape labels:

labels = tf.reshape(labels, [3, 1])

and I updated the features and labels structure:

filenames_train = [['blackcorner-data/1.png', 'blackcorner-data/2.png', 'blackcorner-data/3.png',
                    'blackcorner-data/4.png', 'blackcorner-data/n1.png'],
                   ['blackcorner-data/n2.png', 'blackcorner-data/n3.png', 'blackcorner-data/n4.png',
                    'blackcorner-data/11.png', 'blackcorner-data/21.png'],
                   ['blackcorner-data/31.png', 'blackcorner-data/41.png', 'blackcorner-data/n11.png',
                    'blackcorner-data/n21.png', 'blackcorner-data/n31.png']]

labels = [[0, 0, 0, 0, 1], [1, 1, 1, 0, 0], [0, 0, 1, 1, 1]]

but without success...

Biliary answered 7/3, 2018 at 21:5 Comment(0)

The problem is in your target shape and is related to the correct choice of an appropriate loss function. You have two possibilities:

1. If you have 1D integer-encoded targets, you can use sparse_categorical_crossentropy as the loss function:

import numpy as np
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

n_class = 3
n_features = 100
n_sample = 1000

X = np.random.randint(0,10, (n_sample,n_features))
y = np.random.randint(0,n_class, n_sample)

inp = Input((n_features,))
x = Dense(128, activation='relu')(inp)
out = Dense(n_class, activation='softmax')(x)

model = Model(inp, out)
model.compile(loss='sparse_categorical_crossentropy',optimizer='adam',metrics=['accuracy'])
history = model.fit(X, y, epochs=3)

2. If you have one-hot encoded your target to get a 2D shape (n_samples, n_class), you can use categorical_crossentropy:

import pandas as pd  # needed here for get_dummies; other imports as in the first snippet

n_class = 3
n_features = 100
n_sample = 1000

X = np.random.randint(0,10, (n_sample,n_features))
y = pd.get_dummies(np.random.randint(0,n_class, n_sample)).values

inp = Input((n_features,))
x = Dense(128, activation='relu')(inp)
out = Dense(n_class, activation='softmax')(x)

model = Model(inp, out)
model.compile(loss='categorical_crossentropy',optimizer='adam',metrics=['accuracy'])
history = model.fit(X, y, epochs=3)
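
As a side note, if you'd rather avoid the pandas dependency, tf.keras.utils.to_categorical produces the same one-hot encoding as pd.get_dummies does here; a minimal sketch:

from tensorflow.keras.utils import to_categorical

y_int = np.random.randint(0, n_class, n_sample)  # 1D integer labels, shape (n_sample,)
y = to_categorical(y_int, num_classes=n_class)   # 2D one-hot labels, shape (n_sample, n_class)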
Nodab answered 9/6, 2020 at 16:4 Comment(0)

I resolved it by changing from sparse_categorical_crossentropy to categorical_crossentropy, and it now runs fine.
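
For context, a minimal sketch of what that change looks like, assuming a Keras model trained on one-hot encoded labels (the loss has to match the label encoding):

# sparse_categorical_crossentropy expects integer labels of shape (batch,)
# categorical_crossentropy expects one-hot labels of shape (batch, n_class)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])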

Hann answered 7/5, 2020 at 9:17 Comment(3)
This helped me. I also don't know why. – Idocrase
Here is the explanation: #49161674 – Nodab
This solved my problem too. Now I need to understand why. – Disorderly

I had this problem too the first time I used TensorFlow. I figured out that my problem was forgetting to add the attribute class_mode='sparse' / class_mode='binary' to the function that loads the training and validation data.

So check the class_mode option:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# batch_size, val_dir and IMG_SHAPE are defined elsewhere
image_gen_val = ImageDataGenerator(rescale=1./255)
val_data_gen = image_gen_val.flow_from_directory(batch_size=batch_size,
                                                 directory=val_dir,
                                                 target_size=(IMG_SHAPE, IMG_SHAPE),
                                                 class_mode='sparse')
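
As a rule of thumb (a sketch assuming a Keras model compiled elsewhere), the class_mode has to match the loss function:

# class_mode='sparse'      -> integer labels -> loss='sparse_categorical_crossentropy'
# class_mode='categorical' -> one-hot labels -> loss='categorical_crossentropy'
# class_mode='binary'      -> 0/1 labels     -> loss='binary_crossentropy'
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',  # matches class_mode='sparse' above
              metrics=['accuracy'])
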
Canaliculus answered 30/5, 2020 at 13:18 Comment(0)

I thought I'd provide a more illustrative answer covering the recommended methods for the different label representations, plus some insight into what's happening.

First of all, some context: we have 3 data points and 5 possible labels (0-indexed). Here are the different types of labels you'll encounter in ML problems.

[Figure: the different label representations, e.g. one-hot encoded vs. integer (sparse) labels]

Code (TensorFlow 2)

Say we have the following dummy data and the model

import tensorflow as tf
import numpy as np

ohe_labels = np.array([[0, 0, 0, 1, 0], [0, 1, 0, 0, 0], [0, 0, 0, 0, 1]])
labels = np.argmax(ohe_labels, axis=-1)

x = np.random.normal(size=(3, 10))

model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Dense(20, 'relu', input_shape=(10,)),
        tf.keras.layers.Dense(5, 'softmax')
    ]
)
# This works!
model.compile(loss='categorical_crossentropy', optimizer='adam')
model.fit(x, ohe_labels)

# This also works!
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')
model.fit(x, labels)

# This does NOT (Different error - ValueError: Shapes ... are incompatible)!
model.compile(loss='categorical_crossentropy', optimizer='adam')
model.fit(x, labels)

# This does NOT (Gives the above error)!
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')
model.fit(x, ohe_labels)

When is this error triggered?

This error is triggered under a specific condition. Let me explain with a problem that has a single label per input (the explanation applies to multi-label settings as well, but needs a few more details). The first (batch) dimension needs to match: once labels is reshaped to a 1D vector, if the first dimension of logits and the length of labels don't match, this error is triggered.
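
A minimal sketch that reproduces the condition directly on the loss function (the tensors here are illustrative):

import tensorflow as tf

logits = tf.random.normal((3, 5))  # batch of 3 samples, 5 classes
labels = tf.constant([3, 1, 4])    # one integer label per sample, shape (3,)

# Works: first dimension of logits (3) matches the length of labels (3)
loss = tf.keras.losses.sparse_categorical_crossentropy(labels, logits, from_logits=True)

# Triggers "logits and labels must have the same first dimension":
# bad_labels = tf.constant([3])    # shape (1,) instead of (3,)
# tf.keras.losses.sparse_categorical_crossentropy(bad_labels, logits, from_logits=True)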

Andalusia answered 5/8, 2022 at 22:31 Comment(1)
This is a great explanation. – Lives

I faced a similar issue and found out that I had missed the Flatten layer between the CNN and the Dense layers. Adding the Flatten layer resolved the issue for me.
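
A minimal sketch of what that looks like in Keras (layer sizes are illustrative, not taken from my model):

import tensorflow as tf

model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(40, 40, 1)),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Flatten(),  # without this, Dense receives a 4D tensor
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(2, activation='softmax'),
])
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')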

Solomon answered 1/8, 2021 at 23:59 Comment(1)
There are some very good and detailed explanations here, but this is what solved it for me. Thanks! – Brough

Your logits shape looks right: a batch size of 3, and an output layer of size 2, which is what you defined. Your labels should then also have shape [3, 2]: a batch of 3, where each element is a 2-vector, [1,0] or [0,1].

Also note that when you have a boolean classification output, you shouldn't use 2 neurons in the output/logits layer. You can output a single value that takes on 0 or 1; you can probably see how the two outputs [1,0] and [0,1] are redundant and can be expressed as a single [0|1] value. You also tend to get better results this way.

Thus your logits should end up with shape [3,1], and your labels should be an array of 3 values, one for each sample in the batch.
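
A minimal sketch of the single-output variant in the question's tf.layers style (assuming labels holds one 0/1 value per sample):

# One output unit instead of two; logits shape becomes [batch_size, 1]
logits = tf.layers.dense(inputs=dropout, units=1)

# Sigmoid cross-entropy expects labels with the same shape as logits
loss = tf.losses.sigmoid_cross_entropy(
        multi_class_labels=tf.reshape(tf.cast(labels, tf.float32), [-1, 1]),
        logits=logits)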

Bertero answered 9/3, 2018 at 21:18 Comment(0)

I had a similar problem, and it turned out that one pooling layer was not reshaped correctly. In my case I was incorrectly using tf.reshape(pool, shape=[-1, 64 * 7 * 7]) instead of tf.reshape(pool, shape=[-1, 64 * 14 * 14]), which led to a similar error message about logits and labels. Altering the factors, e.g. tf.reshape(pool, shape=[-1, 64 * 12 * 12]), led to a completely different, less misleading error message.

Perhaps this is also the case here. I recommend going through the code and checking the shapes of the nodes, just in case.
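
A quick way to do that in the question's style is to print the static shape of each node while building the graph; a minimal sketch:

print(conv1.shape)  # e.g. (?, 40, 40, 32)
print(pool1.shape)  # e.g. (?, 20, 20, 32)
print(pool2.shape)  # e.g. (?, 10, 10, 64) -> flatten to [-1, 10 * 10 * 64]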

Illness answered 20/3, 2018 at 19:17 Comment(0)
