Surprisingly, many quite simple autofocus algorithms actually performed quite well on this problem. I implemented 11 of the 16 algorithms outlined in the paper Dynamic evaluation of autofocusing for automated microscopic analysis of blood smear and pap smear by Liu, Wang & Sun. Since I had trouble finding recommendations for setting the threshold values, I also added some variants without thresholds. I also added a simple but clever suggestion found here on SO: compare the file size of the JPEG compressed images (larger size = more detail = better focus).
My autofocus routine does the following:
- Save 21 images at an interval of 2 µm focus distance, total range ±20 µm.
- Calculate the focus value of each image.
- Fit the result to a second degree polynomial.
- Find the position that gives a maximum value of the polynomial.
All algorithms except Histogram Range gave good results. Some of the algorithms are slightly modified, for example they use the brightness difference in both X & Y directions. I also had to change the sign of the StdevBasedCorrelation, Entropy, ThresholdedContent, ImagePower and ThresholdedImagePower algorithms to get a maximum instead of a minimum at the focus position. The algorithms expect a 24 bit grayscale image where R = G = B. If used on a color image, only the blue channel will be calculated (easily corrected of course).
The optimal threshold values was found by running the algorithms with threshold values 0, 8, 16, 24 etc up to 255 and selecting the best value for:
- Stable focus position
- Large negative x² coefficient resulting in a narrow peak at the focus position
- Low residual sum of squares from the polynomial fit
It's interesting to note that the ThresholdedSquaredGradient and ThresholdedBrennerGradient algorithms have an almost flat line of focus position, x² coefficient and residual sum of squares. They are very insensitive to changes in the threshold value.
Implementation of algorithms:
public unsafe List<Result> CalculateFocusValues(string filename)
{
using(Bitmap bmp = new Bitmap(filename))
{
int width = bmp.Width;
int height = bmp.Height;
int bpp = Bitmap.GetPixelFormatSize(bmp.PixelFormat) / 8;
BitmapData data = bmp.LockBits(new Rectangle(0, 0, width, height), ImageLockMode.ReadOnly, bmp.PixelFormat);
long sum = 0, squaredSum = 0;
int[] histogram = new int[256];
const int absoluteGradientThreshold = 148;
long absoluteGradientSum = 0;
long thresholdedAbsoluteGradientSum = 0;
const int squaredGradientThreshold = 64;
long squaredGradientSum = 0;
long thresholdedSquaredGradientSum = 0;
const int brennerGradientThreshold = 184;
long brennerGradientSum = 0;
long thresholdedBrennerGradientSum = 0;
long autocorrelationSum1 = 0;
long autocorrelationSum2 = 0;
const int contentThreshold = 35;
long thresholdedContentSum = 0;
const int pixelCountThreshold = 76;
long thresholdedPixelCountSum = 0;
const int imagePowerThreshold = 40;
long imagePowerSum = 0;
long thresholdedImagePowerSum = 0;
for(int row = 0; row < height - 1; row++)
{
for(int col = 0; col < width - 1; col++)
{
int current = *((byte *) (data.Scan0 + (row + 0) * data.Stride + (col + 0) * bpp));
int col1 = *((byte *) (data.Scan0 + (row + 0) * data.Stride + (col + 1) * bpp));
int row1 = *((byte *) (data.Scan0 + (row + 1) * data.Stride + (col + 0) * bpp));
int squared = current * current;
sum += current;
squaredSum += squared;
histogram[current]++;
int colDiff1 = col1 - current;
int rowDiff1 = row1 - current;
int absoluteGradient = Math.Abs(colDiff1) + Math.Abs(rowDiff1);
absoluteGradientSum += absoluteGradient;
if(absoluteGradient >= absoluteGradientThreshold)
thresholdedAbsoluteGradientSum += absoluteGradient;
int squaredGradient = colDiff1 * colDiff1 + rowDiff1 * rowDiff1;
squaredGradientSum += squaredGradient;
if(squaredGradient >= squaredGradientThreshold)
thresholdedSquaredGradientSum += squaredGradient;
if(row < bmp.Height - 2 && col < bmp.Width - 2)
{
int col2 = *((byte *) (data.Scan0 + (row + 0) * data.Stride + (col + 2) * bpp));
int row2 = *((byte *) (data.Scan0 + (row + 2) * data.Stride + (col + 0) * bpp));
int colDiff2 = col2 - current;
int rowDiff2 = row2 - current;
int brennerGradient = colDiff2 * colDiff2 + rowDiff2 * rowDiff2;
brennerGradientSum += brennerGradient;
if(brennerGradient >= brennerGradientThreshold)
thresholdedBrennerGradientSum += brennerGradient;
autocorrelationSum1 += current * col1 + current * row1;
autocorrelationSum2 += current * col2 + current * row2;
}
if(current >= contentThreshold)
thresholdedContentSum += current;
if(current <= pixelCountThreshold)
thresholdedPixelCountSum++;
imagePowerSum += squared;
if(current >= imagePowerThreshold)
thresholdedImagePowerSum += current * current;
}
}
bmp.UnlockBits(data);
int pixels = width * height;
double mean = (double) sum / pixels;
double meanDeviationSquared = (double) squaredSum / pixels;
int rangeMin = 0;
while(histogram[rangeMin] == 0)
rangeMin++;
int rangeMax = histogram.Length - 1;
while(histogram[rangeMax] == 0)
rangeMax--;
double entropy = 0.0;
double log2 = Math.Log(2);
for(int i = rangeMin; i <= rangeMax; i++)
{
if(histogram[i] > 0)
{
double p = (double) histogram[i] / pixels;
entropy -= p * Math.Log(p) / log2;
}
}
return new List<Result>()
{
new Result("AbsoluteGradient", absoluteGradientSum),
new Result("ThresholdedAbsoluteGradient", thresholdedAbsoluteGradientSum),
new Result("SquaredGradient", squaredGradientSum),
new Result("ThresholdedSquaredGradient", thresholdedSquaredGradientSum),
new Result("BrennerGradient", brennerGradientSum),
new Result("ThresholdedBrennerGradient", thresholdedBrennerGradientSum),
new Result("Variance", meanDeviationSquared - mean * mean),
new Result("Autocorrelation", autocorrelationSum1 - autocorrelationSum2),
new Result("StdevBasedCorrelation", -(autocorrelationSum1 - pixels * mean * mean)),
new Result("Range", rangeMax - rangeMin),
new Result("Entropy", -entropy),
new Result("ThresholdedContent", -thresholdedContentSum),
new Result("ThresholdedPixelCount", thresholdedPixelCountSum),
new Result("ImagePower", -imagePowerSum),
new Result("ThresholdedImagePower", -thresholdedImagePowerSum),
new Result("JpegSize", new FileInfo(filename).Length),
};
}
}
public class Result
{
public string Algorithm { get; private set; }
public double Value { get; private set; }
public Result(string algorithm, double value)
{
Algorithm = algorithm;
Value = value;
}
}
To be able to plot and compare the focus values of the different algorithms they were scaled to a value between 0 and 1 (scaled = (value - min)/(max - min)
).
Plot of all algorithms for a range of ±20 µm:
0 µm | 20 µm
Things look quite similar for a range of ±50 µm:
0 µm | 50 µm
When using a range of ±500 µm things get more interesting. Four algorithms exhibit more of a fourth degree polynomial shape, and the others start to look more like Gaussian functions. Also, the Histogram Range algorithm start to perform better than for smaller ranges.
0 µm | 500 µm
Overall I'm quite impressed by the performance and consistency of these simple algorithms. With the naked eye, it's quite hard to tell that even the 50 µm image is out of focus but the algorithms have no problem comparing images just a few microns apart.