Detecting images and text in a document
I have a binary image A (image 1). I want to detect where the figures (which may include large text) are in the original image. I used a Haar wavelet transform to obtain an image B that marks positions that may belong to the figures in A (image 2). If I compute image A - image B = image C (image 3), the result may not be good because some boundaries are left over. How can I remove these boundaries, or detect the figures in image A exactly? I tried connected components, but it takes too long to run.
Here are my images: Image A:

Image B:

Image C:

Image A - Image B = Image C (that is, if A(i,j)==1 and B(i,j)==1 then C(i,j)=0).
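For logical images, the rule above can be written as a single masked operation. This is a minimal sketch on tiny made-up matrices, assuming that pixels set in A but not in B are kept in C (the question does not state that case explicitly):

```matlab
% Tiny illustrative matrices; these are not the images from the question.
A = logical([1 1 0; 0 1 1; 0 0 1]);
B = logical([1 0 0; 0 1 0; 0 0 0]);

% Keep pixels that are set in A and not set in B.
% Assumption: A(i,j)==1 with B(i,j)==0 stays 1 in C.
C = A & ~B;
```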
Please help me. Thank you so much
1 Comment
Tran Tuan Anh
on 23 May 2014
Answers (2)
Image Analyst
on 23 May 2014
Edited: Image Analyst
on 23 May 2014
0 votes
Take the image, call imfill(), then erode it enough to make the letters disappear. Then use imreconstruct. See attached demos.
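The attached demos are not reproduced here, so the following is only a rough sketch of the three steps described above, run on a synthetic page. The image layout and the 15-by-15 structuring element are assumptions; the element needs to be larger than the thickest letter stroke in the real data. It requires the Image Processing Toolbox.

```matlab
% Synthetic binary page: one large "figure" blob plus three letter-sized blobs.
binaryImage = false(200, 200);
binaryImage(20:120, 20:120) = true;   % large figure region
binaryImage(50:90, 50:90) = false;    % hole inside the figure
binaryImage(150:156, 30:34) = true;   % letter-sized blobs (7-by-5 pixels each)
binaryImage(150:156, 40:44) = true;
binaryImage(150:156, 50:54) = true;

% Step 1: fill interior holes so the figure becomes a solid region.
filled = imfill(binaryImage, 'holes');

% Step 2: erode enough that the letters vanish (assumed size, tune for real data).
marker = imerode(filled, true(15));

% Step 3: regrow the surviving blobs to their full shape inside the filled mask.
figureMask = imreconstruct(marker, filled);

% Whatever is in the original but outside the figure mask is treated as text.
textMask = binaryImage & ~figureMask;
```

Because the reconstruction only grows from blobs that survived the erosion, letters that were completely eroded do not come back, while the figure recovers its original outline.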
6 Comments
Tran Tuan Anh
on 24 May 2014
Tran Tuan Anh
on 24 May 2014
Edited: Tran Tuan Anh
on 24 May 2014
Image Analyst
on 24 May 2014
OK, that's fine. You don't have to use my algorithm. If you have some algorithm from a paper that's working well for you, then that's fine.
Tran Tuan Anh
on 24 May 2014
Image Analyst
on 24 May 2014
It looks like B gets all the large blobs. There are a few small scattered dots around the big blobs that it is not picking up. As far as the algorithm is concerned, anything small could be text. If you want to capture the small surrounding dots, call imclose(). It dilates the large blobs so that they engulf the small blobs or connect to nearby blobs, then erodes to shrink them back down to roughly their original size without breaking any connections that were made during dilation.
closedImage = imclose(binaryImage, true(9)); % Use whatever window size you want.
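To see the effect of that call, here is a small sketch on a synthetic image where a stray dot sits three pixels away from a large blob. The layout is made up for illustration; the 9-by-9 window is the one from the line above. It requires the Image Processing Toolbox.

```matlab
% Synthetic image: a large blob and a small dot separated by a 3-pixel gap.
binaryImage = false(100, 100);
binaryImage(20:60, 20:60) = true;   % large blob
binaryImage(40, 64:65) = true;      % small nearby dot (columns 61:63 are the gap)

% Count connected components before closing.
[~, numBefore] = bwlabel(binaryImage);

% Morphological closing: dilation followed by erosion with the same window.
closedImage = imclose(binaryImage, true(9));

% Count connected components after closing.
[~, numAfter] = bwlabel(closedImage);
```

A gap narrower than the window gets bridged, so the dot becomes part of the large blob. Text that sits close to a figure would be merged in the same way, which is why the window size has to be chosen with the page layout in mind.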
Tran Tuan Anh
on 26 May 2014
Edited: Tran Tuan Anh
on 26 May 2014
Tran Tuan Anh
on 26 May 2014
Edited: Tran Tuan Anh
on 26 May 2014
6 Comments
Image Analyst
on 26 May 2014
You may have to have some sort of first pass to detect what kind of figure might be present and then use a different algorithm for each kind of figure.
Tran Tuan Anh
on 26 May 2014
Image Analyst
on 26 May 2014
Let's say you have algorithm1 that does a good job at spotting gray scale images on the page, and algorithm2 that does a good job handling line art. You might have some algorithm that recognizes, just roughly and approximately, what kind of figure is there, and then apply algorithm1 or algorithm2 for better extraction of the figure, depending on what was found in the first pass.
Tran Tuan Anh
on 26 May 2014
Edited: Tran Tuan Anh
on 26 May 2014
Image Analyst
on 26 May 2014
Why don't you just threshold and find the areas of all the blobs? All the text will be in a narrow range. Any outliers (bigger or smaller) will be non-letters and might be considered as noise (if smaller) or part of a figure (if bigger).
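A rough sketch of that area-based split is shown below. The synthetic image, the use of the median blob area as the "typical letter" size, and the 0.25x and 4x cutoffs are all assumptions for illustration; on a real page the cutoffs would come from inspecting the area histogram. It requires the Image Processing Toolbox.

```matlab
% Synthetic page: one figure, three letter-sized blobs, one noise pixel.
binaryImage = false(200, 200);
binaryImage(20:120, 20:120) = true;   % figure
binaryImage(150:156, 30:34) = true;   % letters
binaryImage(150:156, 40:44) = true;
binaryImage(150:156, 50:54) = true;
binaryImage(180, 180) = true;         % isolated noise pixel

% Measure the area of every blob.
cc = bwconncomp(binaryImage);
areas = cellfun(@numel, cc.PixelIdxList);

% Assumption: letters outnumber everything else, so the median is a letter area.
medianArea = median(areas);
isNoise  = areas < 0.25 * medianArea;   % much smaller than a letter
isFigure = areas > 4 * medianArea;      % much bigger than a letter
isText   = ~isNoise & ~isFigure;        % inside the narrow letter range

% Build one mask per class from the pixel index lists.
figureMask = false(size(binaryImage));
figureMask(vertcat(cc.PixelIdxList{isFigure})) = true;
textMask = false(size(binaryImage));
textMask(vertcat(cc.PixelIdxList{isText})) = true;
```

If only the large blobs are needed, bwareaopen(binaryImage, minArea) gives a similar result in one call by discarding every blob with fewer than minArea pixels.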
Tran Tuan Anh
on 26 May 2014