How to change the picture size in PyTorch

I'm trying to convert a CNN Keras model for emotion recognition on the FER2013 dataset to a PyTorch model, and I get the following error:

Traceback (most recent call last):
  File "VGG.py", line 112, in <module>
    transfer.keras_to_pytorch(keras_network, pytorch_network)
  File "/home/eorg/NeuralNetworks/user/Project/model/nntransfer.py", line 121, in keras_to_pytorch
    pytorch_model.load_state_dict(state_dict)
  File "/home/eorg/.local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 334, in load_state_dict
    own_state[name].copy_(param)
RuntimeError: inconsistent tensor size at /b/wheel/pytorch-src/torch/lib/TH/generic/THTensorCopy.c:51

I understand that the error is related to the shape of the images. In Keras the input size is defined to be 48 by 48.

And my question is: how do I tell a PyTorch model that my pictures have the shape 48x48? I couldn't find such a function in the documentation or examples.

Any help would be useful!

Korikorie answered 8/11, 2017 at 14:9 Comment(1)
if this line pytorch_model.load_state_dict(state_dict) gives you the error, then the problem is that the parameters in your saved state dict do not match the parameters in pytorch_model.Siphon
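To see which parameters disagree, you can compare shapes before calling load_state_dict; a minimal sketch, assuming state_dict and pytorch_model exist as shown in the traceback:

# Print every parameter whose saved shape differs from what the model expects
own_state = pytorch_model.state_dict()
for name, param in state_dict.items():
    if name in own_state and own_state[name].shape != param.shape:
        print(name, tuple(param.shape), '!=', tuple(own_state[name].shape))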

In order to automatically resize your input images you need to define a preprocessing pipeline that all your images go through. This can be done with torchvision.transforms.Compose() (Compose docs). To resize images you can use torchvision.transforms.Scale() (Scale docs) from the torchvision package.

Note: the documentation says that .Scale() is deprecated and .Resize() should be used instead (Resize docs), so the example below uses Resize.

This would be a minimal working example:

from torchvision import transforms
from PIL import Image

# Scale() is deprecated in newer torchvision; Resize() gives the same behaviour here
p = transforms.Compose([transforms.Resize((48, 48))])

img = Image.open('img.jpg')

img.size
# (224, 224) <-- This will be the original dimensions of your image

p(img).size
# (48, 48) <-- This will be the rescaled/resized dimensions of your image
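To feed the resized image to a network you would typically also convert it to a tensor in the same pipeline; a minimal sketch (the 48x48 size comes from the question, the filename is a placeholder):

from torchvision import transforms
from PIL import Image

# Resize first, then convert to a CxHxW float tensor in [0, 1]
preprocess = transforms.Compose([
    transforms.Resize((48, 48)),
    transforms.ToTensor(),
])

x = preprocess(Image.open('img.jpg'))
x = x.unsqueeze(0)  # add a batch dimension -> shape (1, C, 48, 48)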
Siphon answered 8/11, 2017 at 17:23 Comment(2)
Is this really the answer to the question? I think he/she is struggling with setting up the image input size of the model, not the image size itself.Badge
The documentation link doesn't work. Scale seems deprecated.Thrush

1)

If you are using transforms you can simply use Resize. For example, this code resizes the MNIST images to 32x32 (see the Resize line):

import torch
import torchvision

batch_size_train = 64  # example value; any batch size works

train_loader = torch.utils.data.DataLoader(
  torchvision.datasets.MNIST('/files/', train=True, download=True,
                             transform=torchvision.transforms.Compose([
                               torchvision.transforms.ToTensor(),
                               torchvision.transforms.Resize(32), # This line
                               torchvision.transforms.Normalize(
                                 (0.1307,), (0.3081,))
                             ])),
  batch_size=batch_size_train, shuffle=True)
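Iterating over the loader then yields the resized batches; a quick check (shapes assume the code above):

images, labels = next(iter(train_loader))
print(images.shape)  # torch.Size([64, 1, 32, 32])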

2)

If you only want a function, you can use torchvision.transforms.functional.resize(image, size, ...). The other answers were written against the older, now-deprecated Scale API. A short usage sketch follows the quoted docs below.

The Resize docs say:

Resize the input image to the given size.

Parameters:

  • img (PIL Image or Tensor) – Image to be resized.
  • size – Desired output size. If size is a sequence like (h, w), the output size will be matched to this. If size is an int, the smaller edge of the image will be matched to this number, maintaining the aspect ratio.

Return type:

  • PIL Image or Tensor
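For example, a minimal sketch of the functional form (the filename and sizes are placeholders):

import torchvision.transforms.functional as F
from PIL import Image

img = Image.open('img.jpg')

exact = F.resize(img, [48, 48])  # sequence: resize to exactly 48x48
scaled = F.resize(img, 48)       # int: smaller edge becomes 48, aspect ratio kept

print(exact.size, scaled.size)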
Thrush answered 16/11, 2022 at 23:16 Comment(0)

It would help if the code you have tried were also shown. The answer given by @blckbird seems correct (i.e., at some point you need to transform the data).

Now instead of Scale, Resize needs to be used.

So suppose data has a batch size of 64, 3 channels, and spatial size 128x128 (i.e., shape 64x3x128x128), and you need to convert it to 64x3x48x48; then the following code should do it:

from torchvision import transforms

trans = transforms.Compose([transforms.Resize(48)])  # Resize also works on tensors shaped [..., H, W]
tData = trans(data)

Also, if the channel and batch dimensions need to be reordered, use permute. For example, to move the channel dimension to the end:

pData = tData.permute([0, 2, 3, 1])  # (N, C, H, W) -> (N, H, W, C)
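A quick end-to-end check with a dummy batch (a sketch; the shapes are the ones assumed above):

import torch
from torchvision import transforms

data = torch.randn(64, 3, 128, 128)  # dummy batch: 64 images, 3 channels, 128x128
trans = transforms.Compose([transforms.Resize(48)])
tData = trans(data)
print(tData.shape)  # torch.Size([64, 3, 48, 48])
pData = tData.permute([0, 2, 3, 1])
print(pData.shape)  # torch.Size([64, 48, 48, 3])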
Counterforce answered 4/4, 2021 at 22:54 Comment(0)
