How to loop over a MATLAB image without making copies?

I was trying to loop over an image in MATLAB, but my code is running too slowly. I am fairly new to MATLAB, but I suspect the cause is that it makes a copy of my randomly selected image. My code is:

function patches = sampleIMAGES()
load IMAGES;    % load images from disk 
patchsize = 8;  % we'll use 8x8 patches 
numpatches = 10000;
patches = zeros(patchsize*patchsize, numpatches); 

size_img = size(IMAGES);
num_rows_img = size_img(1);
num_cols_img = size_img(2);
num_images = size_img(3);
for i=1:numpatches,
    %get random image
    rand_img_number = randi(num_images);
    rand_img = IMAGES(:, :, rand_img_number);
    %get random patch patchsizexpatchsize
    rand_row = randi(num_rows_img - patchsize);
    rand_col = randi(num_cols_img - patchsize);
    rand_patch = rand_img(rand_row:rand_row+patchsize-1, rand_col:rand_col+patchsize-1);
    patches(:, i) = rand_patch(:)';
end
end

How can I loop over this without making a copy, given that MATLAB does not allow indexing twice into a matrix/array?

Andrel answered 8/9, 2014 at 4:39 Comment(3)
The copying is actually the fastest part of your code. Your code is slow not because of the image copying at the beginning, but because you are generating 10000 patches. Did you try reducing the number of patches to 100 or 50? How fast does it run then? Did you also consider that loading the images from disk takes a long time? You should profile your code to see which parts take the longest: type profile viewer at the MATLAB command prompt, then run the sampleIMAGES function to get a timing breakdown for each line of your code. - Snakemouth
Also, the random number generation could probably be moved out of the loop and done all at once (the seed should also be set using rng). If you're worried about copies, you can try removing the intermediate variable rand_img. - Syzygy
Curious: did any of the solutions provided here work for you? Was load IMAGES the real bottleneck in your case? - Strife
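One way to check the claim about the copy yourself is to time the two indexing operations in isolation. The following is only a minimal sketch: the array sizes are made up as a stand-in for the data in IMAGES.mat, and t_copy / t_patch are illustrative names. It uses timeit, which ships with MATLAB, rather than the profiler mentioned above.

```matlab
% Hypothetical stand-in for the data stored in IMAGES.mat
IMAGES = rand(512, 512, 10);
k = randi(size(IMAGES, 3));

% Cost of copying one full image out of the 3D array
t_copy = timeit(@() IMAGES(:, :, k));

% Cost of extracting one 8x8 patch directly, without the intermediate copy
t_patch = timeit(@() IMAGES(101:108, 101:108, k));

fprintf('full image copy: %.2e s, direct 8x8 patch: %.2e s\n', t_copy, t_patch);
```

Multiplying each number by numpatches gives a rough idea of how much of the total runtime the copy can account for.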

Approach #1 - im2col based

numpatches = 10000; %//Number of patches
blksz = 8; %// Blocksize

[m,n,r] = size(IMAGES); %// Get sizes

%// Store blocks from IMAGES as columns, so that they could be processed in
%// a vectorized fashion later on
blks_col(blksz*blksz,(m-blksz+1)*(n-blksz+1),r)=0; %// Pre-allocate
for k1=1:r
    blks_col(:,:,k1) = im2col(IMAGES(:,:,k1),[blksz blksz],'sliding');
end
blks_col = reshape(blks_col,size(blks_col,1),[]);

%// Get rand row, column and dimension-3 indices to be used for indexing
%// into blks_col in one go
rand_row = randi(size(IMAGES,1)-blksz+1,numpatches,1);
rand_col = randi(size(IMAGES,2)-blksz+1,numpatches,1);
rand_dim3 = randi(size(IMAGES,3),numpatches,1);

%// Select the specific column from blks_col that represents the 
%// [blksz x blksz] used to make a single patch in each iteration from 
%// original code 
num_cols_im2col = (m-blksz+1)*(n-blksz+1);
col_ind = (rand_dim3-1)*num_cols_im2col + (rand_col-1)*(m-blksz+1) + rand_row;
patches = blks_col(:,col_ind);
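The column index formula relies on im2col with 'sliding' emitting block positions in column-major order (down the rows first, then across the columns). A small sanity check, assuming the Image Processing Toolbox is available and using a made-up 5x6x2 array with 3x3 blocks (the variables row, col, img, direct and ok are illustrative only):

```matlab
% Small hypothetical array whose values equal their own linear indices
blksz = 3;
IMAGES = reshape(1:5*6*2, 5, 6, 2);
[m, n, r] = size(IMAGES);
num_cols_im2col = (m-blksz+1)*(n-blksz+1);

% Build the block-column matrix the same way as in the code above
blks_col = zeros(blksz*blksz, num_cols_im2col, r);
for k1 = 1:r
    blks_col(:,:,k1) = im2col(IMAGES(:,:,k1), [blksz blksz], 'sliding');
end
blks_col = reshape(blks_col, blksz*blksz, []);

% Pick one patch position and extract it both ways
row = 2; col = 4; img = 2;
col_ind = (img-1)*num_cols_im2col + (col-1)*(m-blksz+1) + row;
direct = IMAGES(row:row+blksz-1, col:col+blksz-1, img);

% True when the selected column is exactly the directly sliced patch
ok = isequal(blks_col(:, col_ind), direct(:));
```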

Example

As an example, I took IMAGES to be the 3D data obtained by reading one of the sample images that ship with the Image Processing Toolbox, and increased the number of patches to 100000, i.e. -

IMAGES = imread('peppers.png');
numpatches = 100000;

The runtime with original code - 22.376446 seconds.

The runtime with im2col based code - 2.237993 seconds

Then I doubled the number of patches to 200000: the runtime of the original code roughly doubled, while the im2col based approach stayed at around 2.3 seconds.

Thus, the im2col based approach makes sense when you are working with lots of patches, as opposed to lots of images (stacked along the third dimension of IMAGES), because the im2col step is paid once per image regardless of how many patches are drawn afterwards.
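The other side of that trade-off is memory, since blks_col holds every sliding block of every image as double. A rough estimate can be sketched as follows; the sizes are an assumption (peppers.png is taken to be 384x512x3 uint8), and the variable names are illustrative.

```matlab
% Assumed sizes: peppers.png read with imread is taken to be 384x512x3
m = 384; n = 512; r = 3;
blksz = 8;

% One column per sliding block position, one set of columns per image
num_cols_im2col = (m-blksz+1)*(n-blksz+1);
num_elements = blksz*blksz * num_cols_im2col * r;

% blks_col is preallocated as double, i.e. 8 bytes per element
bytes_blks_col = num_elements * 8;

% The uint8 source array needs 1 byte per element
bytes_images = m*n*r;

fprintf('blks_col: %.1f MB, IMAGES: %.1f MB\n', bytes_blks_col/1e6, bytes_images/1e6);
```

With many or large images this intermediate array can become the limiting factor, which is where a purely indexing based approach is attractive.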


Approach #2 - Indexing based

Being a purely indexing based approach, this needs no intermediate block matrix, so it should be memory-efficient and perform well too.

numpatches = 10000; %//Number of patches
blksz = 8; %// Blocksize
[m,n,r] = size(IMAGES); %// Get sizes

%// Get rand row, column and dimension-3 indices to be used for indexing
rand_row = randi(size(IMAGES,1)-blksz+1,numpatches,1);
rand_col = randi(size(IMAGES,2)-blksz+1,numpatches,1);
rand_dim3 = randi(size(IMAGES,3),numpatches,1);

%// Starting indices for each patch
start_ind = (rand_dim3-1)*m*n + (rand_col-1)*m + rand_row;

%// Row indices for each patch
lin_row = permute(bsxfun(@plus,start_ind,[0:blksz-1])',[1 3 2]);   %//'

%// Get linear indices based on row and col indices
lin_rowcol = reshape(bsxfun(@plus,lin_row,[0:blksz-1]*m),blksz*blksz,[]);

%// Finally get the patches
patches = IMAGES(lin_rowcol);
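In MATLAB R2016b and later, implicit expansion can stand in for the bsxfun calls. The following is a sketch of the same linear-indexing idea written that way, checked against plain slicing on made-up data (the array sizes and the names offsets, p and ok are illustrative, not part of the code above):

```matlab
rng(0);                                   % fixed seed so the check is repeatable
IMAGES = rand(32, 40, 5);                 % hypothetical stand-in data
numpatches = 20;
blksz = 8;
[m, n, r] = size(IMAGES);

rand_row  = randi(m-blksz+1, numpatches, 1);
rand_col  = randi(n-blksz+1, numpatches, 1);
rand_dim3 = randi(r, numpatches, 1);

% Linear index of the top-left element of every patch, shaped 1 x 1 x numpatches
start_ind = reshape((rand_dim3-1)*m*n + (rand_col-1)*m + rand_row, 1, 1, []);

% Offsets inside one patch: +1 per row down, +m per column across
offsets = (0:blksz-1)' + (0:blksz-1)*m;   % blksz x blksz via implicit expansion

% blksz x blksz x numpatches indices, flattened to one column per patch
lin_rowcol = reshape(start_ind + offsets, blksz*blksz, []);
patches = IMAGES(lin_rowcol);

% Compare every column against straightforward slicing
ok = true;
for i = 1:numpatches
    p = IMAGES(rand_row(i):rand_row(i)+blksz-1, ...
               rand_col(i):rand_col(i)+blksz-1, rand_dim3(i));
    ok = ok && isequal(patches(:, i), p(:));
end
```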
Strife answered 8/9, 2014 at 5:55 Comment(0)

To avoid copying each image, instead of these two lines:

rand_img = IMAGES(:, :, rand_img_number);
rand_patch = rand_img(rand_row:rand_row+patchsize-1, rand_col:rand_col+patchsize-1);

combine them into one line:

rand_patch = IMAGES(rand_row:rand_row+patchsize-1, rand_col:rand_col+patchsize-1, rand_img_number);

Another way to improve the performance: generating 100 random numbers at once is faster than generating 1 number 100 times. Generate all the numbers you need outside the loop:

rand_img_number = randi(num_images,numpatches,1);

Then use rand_img_number(i) instead of rand_img_number inside the loop. Do the same for the two other random numbers.
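Put together, the loop with both changes applied could look like the sketch below. The random array is only a stand-in for the data in IMAGES.mat, the helper variables rows and cols are illustrative, and the + 1 in the randi upper bounds is an addition here so that the last valid patch position can also be drawn (the question's code omits it).

```matlab
IMAGES = rand(64, 64, 10);     % hypothetical stand-in for the data in IMAGES.mat
patchsize = 8;
numpatches = 1000;
[num_rows_img, num_cols_img, num_images] = size(IMAGES);

% Draw all random numbers once, before the loop
rand_img_number = randi(num_images, numpatches, 1);
rand_row = randi(num_rows_img - patchsize + 1, numpatches, 1);
rand_col = randi(num_cols_img - patchsize + 1, numpatches, 1);

patches = zeros(patchsize*patchsize, numpatches);
for i = 1:numpatches
    rows = rand_row(i):rand_row(i)+patchsize-1;
    cols = rand_col(i):rand_col(i)+patchsize-1;
    % Index the 3D array directly, without an intermediate image copy
    rand_patch = IMAGES(rows, cols, rand_img_number(i));
    patches(:, i) = rand_patch(:);
end
```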

Whirlwind answered 8/9, 2014 at 5:32 Comment(0)
