Random cropping data augmentation convolutional neural networks

3

9

I am training a convolutional neural network but have a relatively small dataset, so I am implementing techniques to augment it. This is the first time I am working on a core computer vision problem, so I am relatively new to it. I have read about many augmentation techniques, and one that is mentioned a lot in papers is random cropping. I am trying to implement it and have searched a lot about this technique, but couldn't find a proper explanation. So I have a few questions:

How does random cropping actually help in data augmentation? Is there any Python library (e.g. OpenCV, PIL, scikit-image, SciPy) that implements random cropping out of the box? If not, how should I implement it?

Revealment answered 3/1, 2016 at 8:34 Comment(0)
12

In my opinion, the reason random cropping helps data augmentation is that while the semantics of the image are preserved (unless you pick a really bad crop, but let's assume you set up your random cropping so that this has very low probability), the activation values you get in your conv net are different. In effect, the conv net learns to associate a broader range of spatial activation statistics with a certain class label, so data augmentation via random cropping helps improve the robustness of the feature detectors in the conv net. In the same vein, each random crop produces different intermediate activation values and a different forward pass, so it acts like a "new training point."

It's also not trivial. See the recent work on adversarial examples in neural networks (from relatively shallow networks up to AlexNet-sized ones). Images that look more or less the same semantically can produce drastically different class probabilities when passed through a neural net with a softmax classifier on top. So changes that are subtle from a semantic point of view can lead to quite different forward passes through a conv net. For more details see Intriguing properties of neural networks.

To answer the last part of your question: I usually just write my own random cropping script. Say my images are (3, 256, 256) (3 RGB channels, 256x256 spatial size). You can code up a loop that takes 224x224 random crops of an image by randomly selecting a valid corner point. I typically compute an array of valid upper-left corner points (here each coordinate can range from 0 to 32 inclusive), and if I want 10 random crops, I randomly select 10 different corner points from this set. If I choose (x0, y0) as my upper-left corner, I select the crop X[:, y0:y0+224, x0:x0+224], or something like this. I personally prefer to choose from a pre-computed set of valid corner points rather than drawing one corner at a time, because this way I can guarantee I do not get a duplicate crop, though in reality a duplicate is probably low probability anyway.
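
A minimal sketch of the corner-point approach described above, assuming a channels-first NumPy array of shape (3, 256, 256). The function name `random_crops` and its parameters are illustrative only and do not come from any library.

```python
import numpy as np


def random_crops(image, crop_h, crop_w, n_crops, rng=None):
    """Return n_crops distinct random crops from a channels-first image.

    image: array of shape (C, H, W). Name and signature are hypothetical.
    """
    rng = np.random.default_rng() if rng is None else rng
    _, height, width = image.shape
    max_y = height - crop_h  # largest valid top row
    max_x = width - crop_w   # largest valid left column
    if max_y < 0 or max_x < 0:
        raise ValueError("crop size exceeds image size")

    # Enumerate every valid upper-left corner as a flat index, then sample
    # without replacement so that no crop is duplicated.
    n_corners = (max_y + 1) * (max_x + 1)
    if n_crops > n_corners:
        raise ValueError("more crops requested than distinct corners")
    flat = rng.choice(n_corners, size=n_crops, replace=False)
    ys, xs = np.unravel_index(flat, (max_y + 1, max_x + 1))

    # Slice only the two spatial axes; keep all channels.
    return [image[:, y:y + crop_h, x:x + crop_w] for y, x in zip(ys, xs)]


image = np.random.rand(3, 256, 256)
crops = random_crops(image, 224, 224, n_crops=10)
print(len(crops), crops[0].shape)  # 10 (3, 224, 224)
```

Sampling flat indices without replacement is one way to get the "no duplicate crop" guarantee without materialising a list of coordinate pairs.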

Magritte answered 3/1, 2016 at 14:43 Comment(9)
Hi. Thanks. I also wanted to ask: you said that you crop a 256x256 image to 224x224 for augmentation by picking some important points. But conv nets usually take fixed-size inputs. Say my image is 125x138 and the conv net I am using takes 224x224. How do I do random cropping here? Do I have to resize the whole image to 256x256 and then randomly crop? Doesn't that affect the classification? - Revealment
Say I have image data of size 125x138, and let's say I want to do data augmentation and take 96x128 random crops. Then I would parameterize my conv net to have input size 96x128. So the conv net input size, which is something we choose, is typically set to the crop size. Random cropping, and data augmentation in general, is a pre-processing step, so we typically do it before we configure our classifier. - Magritte
Thanks. But what if an image turns out to be smaller than the conv net input size (the dataset doesn't have fixed-resolution images)? How do I random crop in that case? Do I rescale it to a higher resolution first before cropping back to the input size? Also, while cropping (real-time augmenting), what if the main object gets partially cut off (since we cannot control this every time)? Would that affect the classification? - Revealment
Well, in that case you would probably resize all the images to a common size (if you use Linux you can use ImageMagick), say NxM, and then crop to, say, nxm, and nxm would then be your conv net input size, in that order. The main object getting cropped can certainly happen. You can do some exploratory data analysis, take some test crops of random images, and see if they pass the eye test. None of this stuff is set in stone; there's a bit of an art to it all. Hope this helps. - Magritte
OK, got it. I am already using PIL (Pillow) for resizing. Anyway, thanks a lot. - Revealment
No problem. Here's a link to a Kaggle diabetes image classification challenge: Kaggle Diabetes. The dataset was large and full of images of different sizes and resolutions. There were some nifty image pre-processing tricks there, and if you look through the forums, I remember people sharing their code. You can look there to see how people handle this issue as well. I believe the winning submission was a conv net that used all sorts of data augmentation and pre-processing tricks. - Magritte
Hi, could you answer this if you have experience with lasagne? #34590482 - Revealment
Can anyone clearly explain what exactly "our conv net learns to associate a broader range of spatial activation statistics with a certain class label" means, as mentioned in the answer above? I am a bit confused by this. - Lehrer
@IndieAI How do you ensure that the relevant content of the image (e.g. a "car") is still there after random cropping? - David
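
The resize-then-crop order suggested in the comments above (resize everything to a common NxM, then take an nxm random crop) could be sketched as follows. This is only an illustration: `resize_nearest` and `resize_then_crop` are hypothetical helper names, and the nearest-neighbour resize in plain NumPy is a stand-in for a proper resizer such as Pillow or ImageMagick.

```python
import numpy as np


def resize_nearest(image, out_h, out_w):
    """Nearest-neighbour resize of an (H, W, C) array (stand-in for a real resizer)."""
    in_h, in_w = image.shape[:2]
    rows = np.arange(out_h) * in_h // out_h  # source row for each output row
    cols = np.arange(out_w) * in_w // out_w  # source column for each output column
    return image[rows[:, None], cols]


def resize_then_crop(image, resize_hw, crop_hw, rng=None):
    """Resize to a common size N x M, then take one random n x m crop."""
    rng = np.random.default_rng() if rng is None else rng
    resized = resize_nearest(image, *resize_hw)
    crop_h, crop_w = crop_hw
    # rng.integers uses an exclusive upper bound, hence the + 1.
    y0 = rng.integers(0, resize_hw[0] - crop_h + 1)
    x0 = rng.integers(0, resize_hw[1] - crop_w + 1)
    return resized[y0:y0 + crop_h, x0:x0 + crop_w]


small = np.random.rand(125, 138, 3)  # smaller than the network input
crop = resize_then_crop(small, resize_hw=(256, 256), crop_hw=(224, 224))
print(crop.shape)  # (224, 224, 3)
```

Note that resizing 125x138 to 256x256 changes the aspect ratio; whether that is acceptable depends on the dataset, which is where the "eye test" mentioned above comes in.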
0

To answer the "how to implement cropping" question, you might want to explore https://github.com/aleju/imgaug. It has a Crop augmenter that lets you do random cropping, along with a lot of other fun augmenters.

Seminarian answered 9/2, 2018 at 2:25 Comment(0)
0

Based on the above answer by @Indie AI, the following piece of code may help you to implement random cropping:

from random import randrange
import numpy as np


def my_random_crop(vol, w, h):
    """
    Given a volume with extra pixels, this function randomly
    crops it by removing a specific number of pixels from each side of it.

    :param vol: input volume with shape W*H*D
    :param w: total number of pixels to be removed from the width dimension
    :param h: total number of pixels to be removed from the height dimension
    :return: a cropped volume of shape (W - w)*(H - h)*D
    """
    # randrange(n + 1) yields 0..n, so every valid corner is reachable
    # and w = 0 or h = 0 (no cropping along that axis) does not raise.
    vw = randrange(w + 1)  # valid corner point for the width
    vh = randrange(h + 1)  # valid corner point for the height

    rw = w - vw  # remaining width to be removed from the far side
    rh = h - vh  # remaining height to be removed from the far side

    width, height, depth = vol.shape
    vol = vol[vw:width - rw, vh:height - rh, :]

    return vol

As a quick test, run the following:

tmp = np.random.randn(64, 128, 32)
print(tmp.shape)  # (64, 128, 32)
tmp = my_random_crop(tmp, w=10, h=15)
print(tmp.shape)  # (54, 113, 32)
Alveolus answered 20/4, 2022 at 8:58 Comment(0)
