I've looked everywhere, but I couldn't find a standard way of checking whether an image is blank in C#.
I have a way of doing this, but I would love to know the correct way, so that others can find it in the future too.
I won't paste a lot of code here (I'm happy to if you want it); first I just want to explain how I check whether an image is blank.
You take a .jpg image and get its width, for example 500 pixels. Then you divide that by 2, giving you 250.
Then you check the colour of every pixel at location (250, i), where i iterates through the height of the image.
This checks only the middle vertical line of pixels of the image. It goes through all of those pixels, checking whether the colour is anything except white. I've done this so you won't have to search all 500 * height pixels, and because you will almost always come across a colour in the middle of the page.
It's working, but it's a bit slow. There must be a better way to do this? You could change it to search 2, 3, or 4 vertical lines to increase the chance of spotting a page that's not blank, but that would take even longer.
(Also note that using the file size of the image to check whether it contains something will not work in this case, since the file size of a page with two sentences on it is too close to that of a blank page.)
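As a rough sketch (not the exact code in use), the check described above could look something like this. It assumes System.Drawing is available; the names BlankPageCheck and IsMiddleColumnWhite are made up for illustration.

```csharp
using System.Drawing;

public static class BlankPageCheck
{
    // Returns true when every pixel in the middle vertical line is pure white.
    // Hypothetical helper: only the single column at Width / 2 is inspected.
    public static bool IsMiddleColumnWhite(Bitmap image)
    {
        int x = image.Width / 2;               // e.g. 500 / 2 = 250
        for (int y = 0; y < image.Height; y++)
        {
            Color c = image.GetPixel(x, y);
            if (c.R != 255 || c.G != 255 || c.B != 255)
            {
                return false;                  // found a non-white pixel
            }
        }
        return true;                           // the whole column was white
    }
}
```

A page would then be treated as blank when IsMiddleColumnWhite returns true. Checking more columns means repeating the loop with other x values.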
Update, after a solution was added:
Resources to help with the implementation and understanding of the solution.
- Writing unsafe code - pointers in C
- Using Pointers in C#
- /unsafe (C# Compiler Options)
- Bitmap.LockBits Method (Rectangle, ImageLockMode, PixelFormat)
(Note that on the first website, "Pizelformat" should read "PixelFormat". It's a small error, I know, but I mention it because it might cause some confusion.)
After I implemented the method to speed up the pixel hunting, the speed didn't increase much, so I think I'm doing something wrong.
Old time = 15.63 for 40 images.
New time = 15.43 for 40 images
I saw in the great article DocMax quoted that the code "locks" a set of pixels (or that's how I understood it). So what I did was lock the middle column of pixels of each page. Would that be the right move?
    // Requires: using System.Drawing; using System.IO; using System.Linq;
    // Returns 0 as soon as one blank image is found, otherwise 1.
    private int testPixels(String sourceDir)
    {
        // Iterate through the .jpg images in the directory.
        var q = from string x in Directory.GetFiles(sourceDir)
                where x.ToLower().EndsWith(".jpg")
                select new FileInfo(x);
        int holder = 1;
        foreach (var z in q)
        {
            // getPixelData2 disposes the bitmap when it is done with it.
            Bitmap mybm = Bitmap.FromFile(z.FullName) as Bitmap;
            // 0 means no non-white pixel was found, i.e. the page looks blank.
            int blank = getPixelData2(mybm);
            if (blank == 0)
            {
                holder = 0;
                break;
            }
        }
        return holder;
    }
And then the method:
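    // Requires: using System.Drawing; using System.Drawing.Imaging; and compiling with /unsafe.
    // Returns 1 if the middle column contains a non-white pixel, 0 if it is all white.
    private unsafe int getPixelData2(Bitmap bm)
    {
        // Lock only the one-pixel-wide middle column, requested as 24bpp so each pixel is 3 bytes (B, G, R).
        BitmapData bmd = bm.LockBits(new System.Drawing.Rectangle((bm.Width / 2), 0, 1, bm.Height), System.Drawing.Imaging.ImageLockMode.ReadOnly, System.Drawing.Imaging.PixelFormat.Format24bppRgb);
        try
        {
            for (int y = 0; y < bmd.Height; y++)
            {
                // The locked region is one pixel wide, so that pixel starts at offset 0 of each row.
                byte* row = (byte*)bmd.Scan0 + (y * bmd.Stride);
                int blue = row[0];
                int green = row[1];
                int red = row[2];
                // Check to see if there is some form of color
                if ((blue != 255) || (green != 255) || (red != 255))
                {
                    return 1;
                }
            }
            return 0;
        }
        finally
        {
            // Always release the locked bits before disposing the bitmap.
            bm.UnlockBits(bmd);
            bm.Dispose();
        }
    }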
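For reference, the offset arithmetic on locked bits can be sketched without any pointers. This is only an illustration under stated assumptions: a 24bpp format (3 bytes per pixel in blue, green, red order), a positive stride, and a managed byte[] copy of the locked data (which could be filled with Marshal.Copy from Scan0). PixelBuffer and IsColumnWhite are hypothetical names, not part of any library.

```csharp
public static class PixelBuffer
{
    // Assumption: 24bpp layout, so every pixel occupies 3 bytes (B, G, R).
    private const int BytesPerPixel = 3;

    // Returns true when every pixel in column x of the buffer is pure white.
    // stride is the number of bytes per row, including any padding bytes.
    public static bool IsColumnWhite(byte[] pixels, int stride, int height, int x)
    {
        for (int y = 0; y < height; y++)
        {
            // Row y starts at y * stride; pixel x starts x * 3 bytes into that row.
            int offset = y * stride + x * BytesPerPixel;
            byte blue = pixels[offset];
            byte green = pixels[offset + 1];
            byte red = pixels[offset + 2];
            if (blue != 255 || green != 255 || red != 255)
            {
                return false;
            }
        }
        return true;
    }
}
```

When the locked rectangle is a single column, x is 0 and the three colour bytes sit at offsets 0, 1 and 2 of each row. When the whole bitmap is locked, x is the column index within the image.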