Image preprocessing in deep learning

I am experimenting with deep learning on images. I have roughly 4000 images from different cameras, with different lighting conditions, image resolutions, and viewing angles.

My question is: What kind of image preprocessing would be helpful for improving object detection? (For example: contrast/color normalization, denoising, etc.)

Younts answered 2/1, 2017 at 14:44 Comment(6)
No one can answer this question without having a look at your data. Generally, with deep learning, pre-processing is not necessary; your model can learn to adapt to variation in your data if you have enough data.Marlonmarlow
Yes, I know my question was too general, but your answer helped me. My real question is: how sensitive is deep learning to image quality?Younts
A deep network or CNN has filters that learn from your data set. The more data and variety you have, the more robust your system will be. Of course it is sensitive if your target domain is different from your training domain.Marlonmarlow
Another image preprocessing technique added to your list could be illumination correction. See THIS POST for more.Allsun
And also check THIS POST if you consider using gamma correction for your images.Allsun
Thank you for your help! I will try your suggestions.Younts

For pre-processing of images before feeding them into a neural network, it is better to make the data zero-centred, and then try a normalization technique. It will certainly increase accuracy, because the data is scaled to a range rather than taking arbitrarily large or small values.

Here is an explanation of it from the Stanford CS231n 2016 lectures:

Normalization refers to normalizing the data dimensions so that they are of approximately the same scale. For image data, there are two common ways of achieving this normalization. One is to divide each dimension by its standard deviation, once it has been zero-centered: (X /= np.std(X, axis = 0)). Another form of this preprocessing normalizes each dimension so that the min and max along the dimension are -1 and 1, respectively. It only makes sense to apply this preprocessing if you have a reason to believe that different input features have different scales (or units), but they should be of approximately equal importance to the learning algorithm. In the case of images, the relative scales of pixels are already approximately equal (and in the range 0 to 255), so it is not strictly necessary to perform this additional preprocessing step.

Link for the above extract: http://cs231n.github.io/neural-networks-2/
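
A minimal NumPy sketch of the zero-centering and normalization described in the extract (the dummy batch, the shapes, and the epsilon are illustrative assumptions, not from the lecture notes):

```python
import numpy as np

# Dummy batch standing in for real training images: (N, H, W, C), values in [0, 255]
X = np.random.randint(0, 256, size=(16, 100, 100, 3)).astype(np.float32)

# Zero-center: subtract the per-channel mean computed over the whole training set
mean = X.mean(axis=(0, 1, 2), keepdims=True)
X -= mean

# Normalize: divide by the per-channel standard deviation of the training set
std = X.std(axis=(0, 1, 2), keepdims=True)
X /= std + 1e-7  # small epsilon guards against division by zero

# Reuse the same mean/std when preprocessing validation and test images
```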

Diabolic answered 7/7, 2017 at 7:37 Comment(2)
Your answer is contradicting itself. You write that normalization will certainly increase accuracy, but at the same time you quote a text which says that normalization may not be necessary under some circumstances.Mooncalf
It is not contradicting. The quote says it is not strictly necessary, so you use this step when there is room for improving the model's accuracy.Diabolic

This is certainly a late reply to this post, but hopefully it helps anyone who stumbles upon it.

Here's an article I found online, Image Data Pre-Processing for Neural Networks; I thought it was a good introduction to how the network should be trained.

The main gist of the article:

1) Images fed into the NN should be scaled to the input size the NN is designed to take, usually a square, e.g. 100x100 or 250x250.

2) Compute the MEAN and STANDARD DEVIATION over all the input images in your particular collection of images.

3) Normalize image inputs by subtracting the mean from each pixel and then dividing the result by the standard deviation; this makes convergence faster while training the network. The resulting distribution resembles a Gaussian curve centred at zero (a minimal sketch is given after this list).

4) Dimensionality reduction: convert RGB to grayscale when neural network performance should be invariant to colour, or to make the training problem more tractable.
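
A minimal sketch of steps 1) to 4) using NumPy and Pillow (the dummy images, the 100x100 target size, and the grayscale luminance weights are illustrative choices, not from the article):

```python
import numpy as np
from PIL import Image

# Dummy images of varying sizes standing in for a real collection
raw_images = [np.random.randint(0, 256, size=(h, w, 3), dtype=np.uint8)
              for h, w in [(480, 640), (720, 1280), (300, 400)]]

TARGET = (100, 100)  # 1) scale every image to the square input size the network expects
resized = np.stack([
    np.asarray(Image.fromarray(img).resize(TARGET)) for img in raw_images
]).astype(np.float32)

# 2) global mean and standard deviation over the whole collection
mean = resized.mean()
std = resized.std()

# 3) subtract the global mean and divide by the global standard deviation
normalized = (resized - mean) / (std + 1e-7)

# 4) optional dimensionality reduction: RGB -> grayscale via luminance weights
grayscale = normalized @ np.array([0.299, 0.587, 0.114], dtype=np.float32)
print(normalized.shape, grayscale.shape)  # (3, 100, 100, 3) (3, 100, 100)
```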

Goa answered 6/2, 2018 at 5:46 Comment(4)
It is notable that the mean referred to in this article is the "global" mean: all images have the global mean subtracted, rather than each image having its own mean subtracted from itself.Foxtrot
@Foxtrot would it be okay if I do not calculate the mean and standard deviation of my data-set and instead use mean and std dev from some prominent data-sets like ImageNet or COCO which are readily available online?Ancohuma
Does mean subtraction help fight performance degradation of the model due to illumination changes? While testing with real-life cases we may encounter various lighting situations, from bright light to low light.Ancohuma
@Ancohuma Are you using someone else's pretrained classifier/weights on your data? If so, you need to duplicate the preprocessing steps that were used in that other person's training process. However, if you're doing transfer learning (i.e., using someone else's pretrained classifier as a starting point for training a new classifier based on your data), then theoretically you could do whatever preprocessing you want. I hope someone will correct me if what I'm saying is not correct, but I think global subtractions etc. aren't strictly necessary; rather, they help the model converge, or converge faster.Foxtrot

In addition to what is mentioned above, a great way to improve the quality of low-resolution (LR) images is super-resolution using deep learning, i.e., training a deep learning model that converts a low-resolution image to a high-resolution one. We can produce a low-resolution image from a high-resolution (HR) image by applying degradation functions (filters such as blurring), so LR = degradation(HR). If we can find the inverse of this function, we can convert a low-resolution image to a high-resolution one. This can be treated as a supervised learning problem and solved using deep learning to approximate the inverse function. I came across this interesting article on introduction to super-resolution using deep learning. I hope this helps.
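
As a rough illustration of the degradation step used to build (LR, HR) training pairs, here is a minimal Pillow sketch; the Gaussian blur, the scale factor of 4, and the dummy image are arbitrary assumptions:

```python
import numpy as np
from PIL import Image, ImageFilter

def degrade(hr: Image.Image, scale: int = 4, blur_radius: float = 1.5) -> Image.Image:
    """Produce a low-resolution image from a high-resolution one: LR = degradation(HR)."""
    blurred = hr.filter(ImageFilter.GaussianBlur(radius=blur_radius))
    return blurred.resize((hr.width // scale, hr.height // scale))

# Dummy high-resolution image standing in for a real photo
hr = Image.fromarray(np.random.randint(0, 256, size=(256, 256, 3), dtype=np.uint8))
lr = degrade(hr)
# (lr, hr) pairs become (input, target) examples for a super-resolution model
# that learns to approximate the inverse of the degradation
print(lr.size, hr.size)  # (64, 64) (256, 256)
```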

Leatherworker answered 2/10, 2019 at 21:34 Comment(0)

Have a read through this; hopefully it will be helpful. The idea is to split the input image into parts; this is called R-CNN (here are some examples). There are two stages to this process: object detection and segmentation. Object detection is the process where certain objects in the foreground are detected by observing changes in gradient. Segmentation is the process where the objects are put together in an image with high contrast. High-level image detectors use Bayesian optimization, which can detect what could happen next using the local optimization point.

Basically, in answer to your question, all of the pre-processing options you have given seem good: contrast and colour normalization helps the model recognise different objects, and denoising makes the gradients easier to distinguish.
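
For what it's worth, here is a minimal OpenCV sketch of two of those options, contrast normalization (via CLAHE on the lightness channel) and non-local-means denoising; the dummy image and all parameter values are arbitrary assumptions:

```python
import cv2
import numpy as np

# Dummy BGR image standing in for a real camera frame
img = np.random.randint(0, 256, size=(480, 640, 3), dtype=np.uint8)

# Contrast normalization: CLAHE applied to the lightness channel in LAB colour space
lab = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)
l, a, b = cv2.split(lab)
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
lab = cv2.merge((clahe.apply(l), a, b))
contrast_normalized = cv2.cvtColor(lab, cv2.COLOR_LAB2BGR)

# Denoising: non-local means for colour images (strength parameters are arbitrary)
denoised = cv2.fastNlMeansDenoisingColored(contrast_normalized, None, 10, 10, 7, 21)
```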

I hope all of this information is useful to you!

Linstock answered 2/1, 2017 at 14:59 Comment(2)
Link-only answers are generally not recommended. Please add the relevant parts from the link to your aswer. Links may become invalid over time.Litt
Thank you for your answer! Actually I am experimenting wit py-faster-rcnn so I heard about R-CNN. My problem is my dataset has variable quality of images and the real question is how sensitive the deep learning to image quality?Younts

In order to improve an image, you first of all need to identify the issue in that image, e.g., low contrast or non-uniform illumination. Once you are able to identify the issues in your dataset, you will be able to find the right solution and apply it. That will help improve your object detection accuracy and results.
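
One possible way to make that diagnosis concrete is to flag obvious problems with simple global statistics; this is just a rough sketch, and the thresholds are arbitrary assumptions:

```python
import numpy as np
from PIL import Image

def diagnose(path: str) -> list:
    """Flag common image-quality issues using crude global statistics."""
    gray = np.asarray(Image.open(path).convert("L"), dtype=np.float32) / 255.0
    issues = []
    if gray.std() < 0.05:
        issues.append("low contrast")
    if gray.mean() < 0.2:
        issues.append("underexposed / low light")
    elif gray.mean() > 0.8:
        issues.append("overexposed")
    return issues

# Example usage (the path is a placeholder):
# print(diagnose("example.jpg"))
```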

Severen answered 7/9, 2022 at 23:57 Comment(0)
