I am trying to implement Epshtein's paper, "Detecting Text in Natural Scenes with Stroke Width Transform" (2010), on text detection in natural images. The first step is edge detection.
I am getting some extra edges inside my text. How should I remove those?
Original image:
My edge detection:
In the example, you can see extra edges inside the text 'WHY HURRY'.
I have tried these steps in MATLAB:
% contrast enhancement
I_adjust = imadjust(I);
% dilation & erosion
se = strel(ones(3,3));
I_dilate = imdilate(I_adjust, se);
I_final = imerode(I_dilate, se);
% gaussian smoothing
h_mask = fspecial('gaussian');
I_final = imfilter(I_final,h_mask);
figure; imshow(I_final);
BW_canny = edge(I_final,'canny');
figure; imshow(BW_canny);
Problem #2:
Following belisarius's suggestion, I found that a mean-shift filter works quite well for text region segmentation. Now I am facing another problem in the implementation of the stroke width transform (see Epshtein's paper).
The stroke width works well for characters like 'H' and 'Y', and even for 'S', because the corresponding edges are usually at a constant distance when we proceed in the direction of the gradient.
The problem comes with characters like 'W'. For one portion of the left edge of the first upstroke, the corresponding edge found is the right edge of the second upstroke, whereas for another portion it is the right edge of the first upstroke. This introduces significant variance in the stroke width across the 'W' region, so it gets classified as a non-text region according to the paper.
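To make the ray search concrete, here is a minimal sketch of how I understand it, run on a synthetic straight stroke rather than my real image. The edge map `E` (a stand-in for the Canny output), the start pixel `p`, and the limit `maxSteps` are assumptions for illustration only, and the pi/6 tolerance is what I understand the paper to use. On a straight stroke like this the ray finds the opposite edge at a constant distance; on 'W' the same search hits different edges for neighbouring start pixels.

```matlab
% Synthetic image: bright vertical stroke (columns 8..12) on a dark background.
I = zeros(20, 20);
I(:, 8:12) = 1;

% Gradient via central differences; for a bright stroke it points into the stroke.
[gx, gy] = gradient(I);
mag = hypot(gx, gy);

% Hypothetical edge map: stroke pixels with non-zero gradient (stand-in for Canny).
E = (I > 0) & (mag > 0);

% Start pixel p on the left edge, stored as [row, col].
p = [10, 8];
d = [gy(p(1), p(2)), gx(p(1), p(2))] / mag(p(1), p(2));   % unit step [drow, dcol]

strokeWidth = NaN;            % stays NaN if no matching edge is found
maxSteps = 30;                % assumed search limit
for k = 1:maxSteps
    q = round(p + k * d);     % next pixel along the ray
    if any(q < 1) || q(1) > size(I, 1) || q(2) > size(I, 2)
        break;                % ray left the image
    end
    if E(q(1), q(2))
        dq = [gy(q(1), q(2)), gx(q(1), q(2))] / mag(q(1), q(2));
        % Accept q only if its gradient is roughly opposite to the one at p,
        % i.e. within pi/6 of the reversed direction.
        if dot(d, dq) < -cos(pi / 6)
            strokeWidth = norm(q - p);
        end
        break;                % stop at the first edge pixel either way
    end
end
disp(strokeWidth);
```

The width is taken as the distance between the two edge pixels, so it is measured between the stroke's boundary pixels rather than counting the pixels in between.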
Can anyone suggest any solution?