I'm using cURL through proxies to download images with a scraper I have developed.
Unfortunately, it occasionally gets an image that looks like these, and the last one is completely blank :/
- When I test the images via imagemagick (using identify) it tells me they are valid images.
- When I test the images via exif_imagetype() and imagecreatefromjpeg() again, both these functions tell me the images are valid.
Does anyone have a way to determine whether an image is mostly grey or completely blank/white, so that I can tell it is in fact corrupted?
I have done a lot of checking with other questions on here, but I haven't had much luck with other solutions. So please take care in suggesting this is a duplicate.
Thanks
After learning about imagecolorat(), I did a search and stumbled on some code. I came up with this:
<?php
$file = dirname(__FILE__) . "/images/1.jpg";
$img = imagecreatefromjpeg($file);
if ($img === false)
{
    exit("could not decode image \n");
}
$imagew = imagesx($img);
$imageh = imagesy($img);
// Only sample the last 5 rows; clamp so tiny images don't give a negative row.
$last_height = max(0, $imageh - 5);
$foo = array();
// Valid coordinates run from 0 to width-1 / height-1, so use < rather than <=.
for ($x = 0; $x < $imagew; $x++)
{
    for ($y = $last_height; $y < $imageh; $y++)
    {
        $rgb = imagecolorat($img, $x, $y);
        // Truecolor pixels are packed as 0xRRGGBB; only red is examined here.
        $r = ($rgb >> 16) & 0xFF;
        // Pixels with a red value of 0 are skipped.
        if ($r != 0)
        {
            $foo[] = $r;
        }
    }
}
// Count how often each red value occurs.
$bar = array_count_values($foo);
$gray = (isset($bar['127']) ? $bar['127'] : 0) + (isset($bar['128']) ? $bar['128'] : 0) + (isset($bar['129']) ? $bar['129'] : 0);
$total = count($foo);
$other = $total - $gray;
if ($gray > $other)
{
    echo "image corrupted \n";
}
else
{
    echo "image not corrupted \n";
}
?>
Does anyone see any potential pitfalls with this? My idea is to take the last few rows of the image and compare the number of pixels whose red value is 127, 128 or 129 (which are gray) against the number of pixels with any other red value. If the gray pixels outnumber the others, then the image is surely corrupted.
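One pitfall I can already see: the check only looks at the red channel, so a pixel like (128, 0, 0) would be counted as gray. Below is a rough sketch of a stricter variant that requires all three channels to be close to the same level. It assumes a truecolor GD image (which is what imagecreatefromjpeg() returns). The function names isNearLevel(), fractionNearLevel() and bottomRowPixels(), the tolerance of 2 and the 5-row strip are all my own placeholders, not part of GD.

```php
<?php
// Hypothetical helper: true when all three channels of a packed 0xRRGGBB
// value lie within $tolerance of $target (128 for mid-gray, 255 for white).
function isNearLevel(int $rgb, int $target, int $tolerance = 2): bool
{
    $r = ($rgb >> 16) & 0xFF;
    $g = ($rgb >> 8) & 0xFF;
    $b = $rgb & 0xFF;
    return abs($r - $target) <= $tolerance
        && abs($g - $target) <= $tolerance
        && abs($b - $target) <= $tolerance;
}

// Hypothetical helper: fraction (0.0 to 1.0) of the given packed pixels
// that are near $target. Returns 0.0 for an empty list.
function fractionNearLevel(array $pixels, int $target, int $tolerance = 2): float
{
    if (count($pixels) === 0)
    {
        return 0.0;
    }
    $hits = 0;
    foreach ($pixels as $rgb)
    {
        if (isNearLevel($rgb, $target, $tolerance))
        {
            $hits++;
        }
    }
    return $hits / count($pixels);
}

// Hypothetical helper: collects the packed colour values of the last $rows
// rows of a truecolor GD image, staying inside the image bounds.
function bottomRowPixels($img, int $rows = 5): array
{
    $w = imagesx($img);
    $h = imagesy($img);
    $pixels = array();
    for ($y = max(0, $h - $rows); $y < $h; $y++)
    {
        for ($x = 0; $x < $w; $x++)
        {
            $pixels[] = imagecolorat($img, $x, $y);
        }
    }
    return $pixels;
}
```

With these, something like `fractionNearLevel(bottomRowPixels($img), 128) > 0.5` would flag a mostly gray bottom strip, and the same call with 255 instead of 128 would flag a blank white one. The 0.5 threshold is a guess on my part and would need tuning against real downloads.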
Opinions welcome! :)