The idea is really simple. Use morphology to isolate the text you want to detect. Using this image, create a mask to delete the region of interest in the input image and produce a final image. All via morphology. My answer is in C++
, but the implementation is really easy:
//Read input image:
std::string imagePath = "C://opencvImages//commentImage.png";
cv::Mat imageInput= cv::imread( imagePath );
//Convert it to grayscale:
cv::Mat grayImg;
cv::cvtColor( imageInput, grayImg, cv::COLOR_BGR2GRAY );
//Get binary image via Otsu:
cv::threshold( grayImg, grayImg, 0, 255 , cv::THRESH_OTSU );
Up until this point, you have generated the binary image. Now, let's dilate
the image using a rectangular structuring element (SE
) wider than taller. The idea is that I want to join all the text horizontally AND vertically (just a little bit). If you see the input image, the “TEST132212” text is just a little bit separated from the comment, enough to survive the dilate
operation, it seems. Let's see, here, I'm using a SE
of size 9 x 6
with 2
iterations:
cv::Mat morphKernel = cv::getStructuringElement( cv::MORPH_RECT, cv::Size(9, 6) );
int morphIterations = 2;
cv::morphologyEx( grayImg, grayImg, cv::MORPH_DILATE, morphKernel, cv::Point(-1,-1), morphIterations );
This is the result:
I got a unique block where the original comment was - Nice! Now, this is the largest blob in the image. If I subtract it to the original binary image, I should generate a mask that will successfully isolate everything that is not the “comment” blob:
cv::Mat bigBlob = findBiggestBlob( grayImg );
I get this:
Now, the binary mask generation:
cv::Mat binaryMask = grayImg - bigBlob;
//Use the binaryMask to produce the final image:
cv::Mat resultImg;
imageInput.copyTo( resultImg, binaryMask );
Produces the masked image:
Now, you should have noted the findBiggestBlob
function. This is a function I've made that returns the biggest blob in a binary image. The idea is just to compute all the contours in the input image, calculate their area and store the contour with the largest area of the bunch. This is the C++
implementation:
//Function to get the largest blob in a binary image:
cv::Mat findBiggestBlob( cv::Mat &inputImage ){
cv::Mat biggestBlob = inputImage.clone();
int largest_area = 0;
int largest_contour_index=0;
std::vector< std::vector<cv::Point> > contours; // Vector for storing contour
std::vector<cv::Vec4i> hierarchy;
// Find the contours in the image
cv::findContours( biggestBlob, contours, hierarchy,CV_RETR_CCOMP, CV_CHAIN_APPROX_SIMPLE );
for( int i = 0; i< (int)contours.size(); i++ ) {
//Find the area of the contour
double a = cv::contourArea( contours[i],false);
//Store the index of largest contour:
if( a > largest_area ){
largest_area = a;
largest_contour_index = i;
}
}
//Once you get the biggest blob, paint it black:
cv::Mat tempMat = biggestBlob.clone();
cv::drawContours( tempMat, contours, largest_contour_index, cv::Scalar(0),
CV_FILLED, 8, hierarchy );
//Erase the smaller blobs:
biggestBlob = biggestBlob - tempMat;
tempMat.release();
return biggestBlob;
}
Edit: Since the posting of the answer, I've been learning Python
. Here's the Python
equivalent of the C++
code:
import cv2
import numpy as np
# Set image path
path = "D://opencvImages//"
fileName = "commentImage.png"
# Read Input image
inputImage = cv2.imread(path+fileName)
# Convert BGR to grayscale:
grayscaleImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
# Threshold via Otsu + bias adjustment:
threshValue, binaryImage = cv2.threshold(grayscaleImage, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)
# Set kernel (structuring element) size:
kernelSize = (9, 6)
# Set operation iterations:
opIterations = 2
# Get the structuring element:
morphKernel = cv2.getStructuringElement(cv2.MORPH_RECT, kernelSize)
# Perform Dilate:
openingImage = cv2.morphologyEx(binaryImage, cv2.MORPH_DILATE, morphKernel, None, None, opIterations, cv2.BORDER_REFLECT101)
# Find the big contours/blobs on the filtered image:
biggestBlob = openingImage.copy()
contours, hierarchy = cv2.findContours(biggestBlob, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
contoursPoly = [None] * len(contours)
boundRect = []
largestArea = 0
largestContourIndex = 0
# Loop through the contours, store the biggest one:
for i, c in enumerate(contours):
# Get the area for the current contour:
currentArea = cv2.contourArea(c, False)
# Store the index of largest contour:
if currentArea > largestArea:
largestArea = currentArea
largestContourIndex = i
# Once you get the biggest blob, paint it black:
tempMat = biggestBlob.copy()
# Draw the contours on the mask image:
cv2.drawContours(tempMat, contours, largestContourIndex, (0, 0, 0), -1, 8, hierarchy)
# Erase the smaller blobs:
biggestBlob = biggestBlob - tempMat
# Generate the binary mask:
binaryMask = openingImage - biggestBlob
# Use the binaryMask to produce the final image:
resultImg = cv2.bitwise_and(inputImage, inputImage, mask = binaryMask)
cv2.imshow("Result", resultImg)
cv2.waitKey(0)