"RuntimeError: Expected 4-dimensional input for 4-dimensional weight 32 3 3, but got 3-dimensional input of size [3, 224, 224] instead"?
I am trying to use a pre-trained model. Here's where the problem occurs

Isn't the model supposed to take in a simple colored image? Why is it expecting a 4-dimensional input?

RuntimeError                              Traceback (most recent call last)
<ipython-input-51-d7abe3ef1355> in <module>()
     33 
     34 # Forward pass the data through the model
---> 35 output = model(data)
     36 init_pred = output.max(1, keepdim=True)[1] # get the index of the max log-probability
     37 

5 frames
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py in forward(self, input)
    336                             _pair(0), self.dilation, self.groups)
    337         return F.conv2d(input, self.weight, self.bias, self.stride,
--> 338                         self.padding, self.dilation, self.groups)
    339 
    340 

RuntimeError: Expected 4-dimensional input for 4-dimensional weight 32 3 3, but got 3-dimensional input of size [3, 224, 224] instead

Where

inception = models.inception_v3()
model = inception.to(device)
Jeanniejeannine answered 28/7, 2019 at 1:52 Comment(1)
A torch model normally expects a batch of images as input. If you want to pass a single image, make sure it is still a batch containing that single image. Also, Inception-v3 expects image dimensions of 3x299x299, unlike most other torch models, which expect 3x224x224. – Cauliflower
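For reference, here is a minimal sketch of preparing a single image for this model, assuming the image is loaded with PIL and preprocessed with torchvision transforms; the file name and transform values are illustrative, not taken from the question:

from PIL import Image
from torchvision import models, transforms

model = models.inception_v3()              # untrained weights, mirroring the question
model.eval()                               # inference mode

preprocess = transforms.Compose([
    transforms.Resize(299),                # Inception-v3 expects 299x299 inputs
    transforms.CenterCrop(299),
    transforms.ToTensor(),                 # -> tensor of shape [3, 299, 299]
])

img = Image.open("example.jpg").convert("RGB")   # hypothetical image file
data = preprocess(img).unsqueeze(0)              # add a batch dimension -> [1, 3, 299, 299]
output = model(data)
init_pred = output.max(1, keepdim=True)[1]       # index of the max log-probability, as in the question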

As Usman Ali wrote in his comment, PyTorch (and most other DL toolboxes) expects a batch of images as input. Thus you need to call

output = model(data[None, ...])  

This inserts a singleton "batch" dimension into your input data.

Please also note that Inception-v3 expects a different input size, 3x299x299, rather than the 3x224x224 used by most other torchvision models.
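For example, a quick sketch of the effect of indexing with None (the shapes below are only illustrative):

import torch

data = torch.rand(3, 299, 299)     # a single image: 3 dimensions, no batch axis
batched = data[None, ...]          # equivalent to data.unsqueeze(0)
print(batched.shape)               # torch.Size([1, 3, 299, 299])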

Geri answered 28/7, 2019 at 5:2 Comment(3)
I also had to add data[None, ...].float() to make it work. – Vories
@Vories You should look at .to(...) to move/cast your input tensor to the data type/device expected by your model. – Geri
The conversion .to(device) was needed because the input image was loaded by other means (most likely with PIL from a WebDataset). The value of device can be set as follows: device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu"). – Demonize
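Putting those two comments together, a short sketch of the usual pattern, reusing the model and data names from the question:

import torch

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = model.to(device)                      # move the model's parameters to the device
data = data.to(device, dtype=torch.float32)   # move/cast the input to match the model
output = model(data[None, ...])               # add the batch dimension as above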

From the PyTorch documentation on convolutional layers, Conv2d layers expect an input with the shape

(n_samples, channels, height, width) # e.g., (1000, 1, 224, 224)

Passing grayscale images in their usual format (224, 224) won't work.

To get the right shape, you will need to add a channel dimension. You can do it as follows:

x = np.expand_dims(x, 1)      # if numpy array
tensor = tensor.unsqueeze(1)  # if torch tensor

The unsqueeze() method adds a dimension at the specified index. The result would have shape:

(1000, 1, 224, 224)
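As a quick illustration of both variants (the batch of 1000 grayscale images is just the example number from above):

import numpy as np
import torch

x = np.random.rand(1000, 224, 224)    # numpy: 1000 grayscale images, no channel axis
x = np.expand_dims(x, 1)              # -> (1000, 1, 224, 224)

t = torch.rand(1000, 224, 224)        # torch: same data as a tensor
t = t.unsqueeze(1)                    # -> torch.Size([1000, 1, 224, 224])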
Creighton answered 2/1, 2020 at 15:38 Comment(3)
For grayscale images, you are right. However, for an RGB image that needs to be seen as a batch of one image, that would be .unsqueeze(0). – Demonize
Can you explain n_samples here? – Kyongkyoto
It's the number of training samples, i.e. the number of images. – Creighton

As the model expects a batch of images, we need to pass a 4-dimensional tensor, which can be done as follows:

Method-1: output = model(data[0:1])
Method-2: output = model(data[0].unsqueeze(0))

This will send only the first image of the whole batch.

Similarly, for the i-th image we can do (see the combined sketch below):

Method-1: output = model(data[i:i+1])
Method-2: output = model(data[i].unsqueeze(0))
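For example, both methods give the same result (the batch size and index below are only illustrative):

import torch

data = torch.rand(8, 3, 224, 224)     # a batch of 8 images
i = 2
a = data[i:i+1]                       # slicing keeps the batch axis -> [1, 3, 224, 224]
b = data[i].unsqueeze(0)              # indexing drops it, unsqueeze(0) restores it
assert a.shape == b.shape == torch.Size([1, 3, 224, 224])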

Stupe answered 6/12, 2021 at 10:53 Comment(0)
