How to resize a PyTorch tensor?

Asked 3/11, 2019 at 1:22 Answered 18/8, 2024 at 2:5

python computer-vision pytorch image-resizing tensor

I have a PyTorch tensor of size (5, 1, 44, 44) (batch, channel, height, width), and I want to 'resize' it to (5, 1, 224, 224)

How can I do that? What functions should I use?

Spinney answered 3/11, 2019 at 1:22 Comment(1)

How do you want to resize it? By padding with 0s? By dilating the image? – Wanderlust 7/4, 2021 at 11:35
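If zero-padding is what is wanted, a minimal sketch using torch.nn.functional.pad could look like the following. Centering the original image and the variable names are assumptions made for illustration; the question does not say where the original pixels should end up.

```python
import torch
import torch.nn.functional as nnf

x = torch.rand(5, 1, 44, 44)  # (batch, channel, height, width)
target_h, target_w = 224, 224

# Split the missing rows/columns between the two sides (assumption: centered).
pad_h = target_h - x.shape[-2]
pad_w = target_w - x.shape[-1]
top, left = pad_h // 2, pad_w // 2
bottom, right = pad_h - top, pad_w - left

# The pad tuple is ordered from the last dimension backwards:
# (left, right) for width, then (top, bottom) for height.
padded = nnf.pad(x, (left, right, top, bottom), mode='constant', value=0.0)
print(padded.shape)
```

Unlike interpolation, this keeps the original pixel values untouched and only surrounds them with zeros.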

It seems like you are looking for interpolate (a function in nn.functional):

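import torch
import torch.nn.functional as nnf

x = torch.rand(5, 1, 44, 44)  # (batch, channel, height, width)
# Upsample the two spatial dimensions to 224x224; batch and channel are unchanged.
out = nnf.interpolate(x, size=(224, 224), mode='bicubic', align_corners=False)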

If you really care about the accuracy of the interpolation, you should have a look at ResizeRight: a pytorch/numpy package that accurately deals with all sorts of "edge cases" when resizing images. This can have an effect when directly merging features of different scales: inaccurate interpolation may result in misalignments.

Island answered 3/11, 2019 at 6:10 Comment(1)

Just a word of warning about bicubic interpolation: the range of the result may be wider than the range of the input. If this is important, then you can use bilinear instead. – Kennakennan 3/11, 2019 at 12:39
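The range issue raised in this comment can be checked directly. The sketch below compares the value range of both modes on a random input and, as one possible workaround (an assumption, not something the answer prescribes), clamps the bicubic result back to the input range.

```python
import torch
import torch.nn.functional as nnf

torch.manual_seed(0)
x = torch.rand(5, 1, 44, 44)  # values in [0, 1)

bicubic = nnf.interpolate(x, size=(224, 224), mode='bicubic', align_corners=False)
bilinear = nnf.interpolate(x, size=(224, 224), mode='bilinear', align_corners=False)

# Bicubic kernels have negative weights, so the output can overshoot the input range.
# Bilinear output is a weighted average of neighbors, so it stays inside the range.
print('input   ', x.min().item(), x.max().item())
print('bicubic ', bicubic.min().item(), bicubic.max().item())
print('bilinear', bilinear.min().item(), bilinear.max().item())

# Possible workaround: clamp the bicubic result to the input range.
bicubic_clamped = bicubic.clamp(min=x.min().item(), max=x.max().item())
```

Clamping matters mostly when later code assumes a fixed range, for example pixel values in [0, 1].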

The TorchVision transforms.functional.resize() function is what you're looking for:

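import torch
import torchvision.transforms.functional as F

t = torch.randn([5, 1, 44, 44])
# An int size matches the smaller edge to 224; the input is square, so the result is (5, 1, 224, 224).
t_resized = F.resize(t, 224)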

If you wish to use an interpolation mode other than bilinear (the default), you can specify it with the interpolation argument.

Wanderlust answered 16/4, 2021 at 15:7 Comment(0)

Building on the first answer, you can get better results for this use case.

If you are enlarging the image to feed it to a deep learning model later (your case), bilinear interpolation is usually a better choice than bicubic:

import torch.nn.functional as F

resized_tensor = F.interpolate(input_tensor, size=(224, 224), mode='bilinear', align_corners=False)

Bilinear interpolation:

It is faster than bicubic, which matters when you process a large dataset.
It uses a 2x2 pixel neighborhood instead of 4x4, so it requires less computation.
It produces slightly softer images than bicubic.

If you care about the visual quality of the images, use bicubic: it produces sharper images, but it is slower.
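The speed claim is easy to check on your own hardware. The following is a small timing sketch using the standard timeit module; the helper name upsample and the number of runs are arbitrary choices, and the measured times will vary by machine and PyTorch build.

```python
import timeit

import torch
import torch.nn.functional as nnf

x = torch.rand(5, 1, 44, 44)
runs = 50


def upsample(mode):
    """Resize x to 224x224 with the given interpolation mode."""
    return nnf.interpolate(x, size=(224, 224), mode=mode, align_corners=False)


timings = {}
for mode in ('bilinear', 'bicubic'):
    seconds = timeit.timeit(lambda: upsample(mode), number=runs)
    timings[mode] = seconds / runs
    print(f'{mode}: {timings[mode] * 1e3:.3f} ms per call')
```

On a small batch like this the gap may be negligible; it tends to matter more when resizing a whole dataset.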

Overlay answered 18/8, 2024 at 2:5 Comment(0)
