
Detect the vertical and horizontal lines and then crop the area that is handwritten
Raja Bilal Rsb
on 18 Jul 2017
Commented: Raja Bilal Rsb
on 22 Jul 2017
Hello, I have a complete data set of more than 1000 images like this one. What I want to do is detect the vertical and horizontal lines and then crop the area that is handwritten. After cropping, each handwritten character should be saved as an individual image. Any help will be appreciated. Thanks.

Accepted Answer
Image Analyst
on 19 Jul 2017
This code will do it. Try it and let me know. Then adapt it by making it a function and calling it inside a loop, as shown in the FAQ, to process the other thousand images. It saves each cropped image into the folder of the original image, with the row and column number it came from in the file name. Once you verify that it works, you can comment out the questdlg() call, together with the Quit check that follows it, so that it runs without showing you each small cropped image and prompting you to continue.
clc; % Clear the command window.
close all; % Close all figures (except those of imtool.)
clear; % Erase all existing variables. Or clearvars if you want.
workspace; % Make sure the workspace panel is showing.
format long g;
format compact;
fontSize = 15;
%===============================================================================
% Get the name of the image the user wants to use.
baseFileName = '013.jpg';
% Get the full filename, with path prepended.
folder = pwd;
fullFileName = fullfile(folder, baseFileName);
% Check if file exists.
if ~exist(fullFileName, 'file')
    % The file doesn't exist -- didn't find it there in that folder.
    % Check the entire search path (other folders) for the file by stripping off the folder.
    fullFileNameOnSearchPath = baseFileName; % No path this time.
    if ~exist(fullFileNameOnSearchPath, 'file')
        % Still didn't find it. Alert user.
        errorMessage = sprintf('Error: %s does not exist in the search path folders.', fullFileName);
        uiwait(warndlg(errorMessage));
        return;
    end
    % Found it on the search path, so read it from there.
    fullFileName = fullFileNameOnSearchPath;
end
%===============================================================================
% Read in demo image.
rgbImage = imread(fullFileName);
% Get the dimensions of the image.
[imageRows, imageColumns, numberOfColorChannels] = size(rgbImage);
% Display the original image.
subplot(2, 2, 1);
imshow(rgbImage, []);
axis on;
caption = sprintf('Original Color Image, %s', baseFileName);
title(caption, 'FontSize', fontSize, 'Interpreter', 'None');
hp = impixelinfo();
% Set up figure properties:
% Enlarge figure to full screen.
set(gcf, 'Units', 'Normalized', 'OuterPosition', [0 0 1 1]);
% Get rid of tool bar and pulldown menus that are along top of figure.
% set(gcf, 'Toolbar', 'none', 'Menu', 'none');
% Give a name to the title bar.
set(gcf, 'Name', 'Demo by ImageAnalyst', 'NumberTitle', 'Off')
drawnow;
hp = impixelinfo(); % Set up status line to see values when you mouse over the image.
% Extract the individual red, green, and blue color channels.
redChannel = rgbImage(:, :, 1);
% greenChannel = rgbImage(:, :, 2);
% blueChannel = rgbImage(:, :, 3);
% Threshold to get the mask.
mask = redChannel > 224;
% Get rid of the white surround.
mask = imclearborder(mask);
% Fill holes.
mask = imfill(mask, 'holes');
% Keep only blobs with an area of at least 150*150 pixels.
mask = bwareafilt(mask, [150*150, inf]);
% Erode with a 10-by-10 structuring element to get rid of any black lines
% in the bounding box due to it being rotated.
mask = imerode(mask, ones(10));
% Display the image.
subplot(2, 2, 2);
imshow(mask);
grid on;
axis on;
hold on;
title('Mask Image', 'FontSize', fontSize);
drawnow;
% Label the image
[labeledImage, numBlobs] = bwlabel(mask);
% Measure bounding boxes:
props = regionprops(labeledImage, 'BoundingBox', 'Centroid');
allCentroids = [props.Centroid];
xCentroids = allCentroids(1:2:end);
yCentroids = allCentroids(2:2:end);
% There are 14 rows and 10 columns.
% Estimate the x position of each column and the y position of each row
% by clustering the blob centroids with kmeans.
[indexes, xClusterCtr] = kmeans(xCentroids', 10);
[indexes, yClusterCtr] = kmeans(yCentroids', 14);
% Sort them in ascending order.
xClusterCtr = sort(xClusterCtr, 'ascend')
yClusterCtr = sort(yClusterCtr, 'ascend')
% Loop through them
for k = 1 : numBlobs
    % Get the bounding box of this blob.
    thisBB = props(k).BoundingBox;
    % Crop out the small box.
    thisCroppedImage = imcrop(rgbImage, thisBB);
    % Display the image.
    subplot(2, 2, 3);
    imshow(thisCroppedImage, []);
    axis on;
    caption = sprintf('Blob #%d', k);
    title(caption, 'FontSize', fontSize);
    % Find out which row and column this is in
    thisX = xCentroids(k);
    thisY = yCentroids(k);
    distancesX = sqrt((thisX - xClusterCtr) .^ 2);
    distancesY = sqrt((thisY - yClusterCtr) .^ 2);
    [minDistance, column] = min(distancesX);
    [minDistance, row] = min(distancesY);
    fprintf('Blob #%d is at (%.1f, %.1f) in row %d, column %d\n', ...
        k, thisX, thisY, row, column);
    % Plot a star there.
    subplot(2, 2, 2);
    % Plot actual centroid.
    plot(thisX, thisY, 'r*', 'MarkerSize', 10, 'LineWidth', 2);
    % Plot star at grid crossing lines.
    plot(xClusterCtr(column), yClusterCtr(row), 'b*', 'MarkerSize', 10, 'LineWidth', 2);
    % Prepare filename.
    baseFileName = sprintf('Row %d, Column %d.png', row, column);
    fullFileName = fullfile(folder, baseFileName);
    imwrite(thisCroppedImage, fullFileName);
    % Pause to prompt user.
    promptMessage = sprintf('Saved image as %s\nDo you want to Continue processing,\nor Quit processing?', fullFileName);
    titleBarCaption = 'Continue?';
    buttonText = questdlg(promptMessage, titleBarCaption, 'Continue', 'Quit', 'Continue');
    if strcmpi(buttonText, 'Quit')
        break;
    end
end
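The answer suggests turning the code into a function and calling it in a loop over all the images. A minimal sketch of such a loop is shown here. The names `inputFolder` and `processOneImage`, and the idea of one output folder per source image, are assumptions for illustration and are not part of the original code; the handle is only a placeholder for the per-image processing above.

```matlab
% Hypothetical folder holding the scanned forms; adjust to your data set.
inputFolder = pwd;
% Hypothetical placeholder for the per-image code wrapped as a function.
processOneImage = @(fullFileName, outputFolder) ...
    fprintf('Would process %s into %s\n', fullFileName, outputFolder);

% List every JPEG file in the input folder.
files = dir(fullfile(inputFolder, '*.jpg'));
numProcessed = 0;
for k = 1 : numel(files)
    fullFileName = fullfile(inputFolder, files(k).name);
    % Derive a separate output folder name per source image, so that the
    % 'Row r, Column c.png' names from different images cannot collide.
    [~, nameNoExt] = fileparts(files(k).name);
    outputFolder = fullfile(inputFolder, nameNoExt);
    processOneImage(fullFileName, outputFolder);
    numProcessed = numProcessed + 1;
end
fprintf('Processed %d image(s).\n', numProcessed);
```

Because every image produces files with the same 'Row r, Column c.png' names, writing them all into one folder would overwrite earlier results, which is why the sketch passes a distinct output folder to each call.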

18 Comments
Image Analyst
on 21 Jul 2017
Thanks for getting back. It's really strange though because I don't see any reason why an earlier version would give fewer files. If you had an old version, prior to bwareafilt(), it would have thrown an error, but to complete without any errors, and just not have the right number of files is really weird.
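For releases that predate bwareafilt(), one possible fallback is bwareaopen(), which removes connected components with fewer than a given number of pixels. The sketch below uses a synthetic mask as a stand-in for the thresholded image, and the variable `minArea` is a name introduced here for illustration.

```matlab
% Synthetic mask (assumption): one large blob and one small speck.
mask = false(400, 400);
mask(50:249, 50:249) = true;     % 200x200 blob, should be kept
mask(300:309, 300:309) = true;   % 10x10 speck, should be removed
minArea = 150 * 150;

if exist('bwareafilt', 'file')
    % Keep only blobs whose area lies in [minArea, inf].
    mask = bwareafilt(mask, [minArea, inf]);
else
    % Remove connected components with fewer than minArea pixels.
    mask = bwareaopen(mask, minArea);
end
```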
More Answers (1)
Kevin Xia
on 18 Jul 2017
Edited: Kevin Xia
on 18 Jul 2017
You can use bwconncomp and regionprops to find the centroids of each whitespace box. The centroids can be used to generate subimages, "cropping" the target image. Here is an example:
Read the image from file and convert it to a black and white image using imbinarize. Note that the threshold used here is 0.8. The right threshold depends on the image. You can use the graythresh function to generate the threshold dynamically, but it may still have to be tuned.
I=imread('013.jpg'); %insert image name in place of ‘013.jpg’
greyIm=rgb2gray(I);
bwIm=imbinarize(greyIm,0.8);
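As a rough illustration of the dynamic threshold mentioned above, the following sketch computes the level with graythresh (Otsu's method) and passes it to imbinarize. The small synthetic image is an assumption that stands in for the grayscale scan, so the resulting level will differ on real data.

```matlab
% Synthetic stand-in for the grayscale scan: dark grid lines on a light page.
greyIm = uint8(230 * ones(100, 100));
greyIm(50, :) = 20;   % one horizontal line
greyIm(:, 50) = 20;   % one vertical line

% graythresh returns a normalized threshold between 0 and 1.
level = graythresh(greyIm);
% Pixels brighter than the threshold become true (white).
bwIm = imbinarize(greyIm, level);
```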
Find the connected regions in the array using bwconncomp:
CC=bwconncomp(bwIm);
numPixels=cellfun(@numel,CC.PixelIdxList); %For each connected component, calculate the number of pixels.
boxIndices=find(numPixels>22500);
Calculate the centroids of all connected regions. Refer to the documentation of regionprops for more detail:
S=regionprops(CC,'Centroid');
centroids=cat(1,S.Centroid);
boxCentroids=centroids(boxIndices,:); %keep only the centroids of the large regions
Verify that all whitespace box centroids have been found:
figure;
imshow(bwIm);
hold on
plot(boxCentroids(:,1),boxCentroids(:,2),'b*')
hold off
Create subimages from the whitespace box centroids using matrix indexing on the black and white image. In this case, I created an approximately 200x200 pixel subimage around the second box centroid. Note that the centroids are floating point numbers and have to be rounded to integers using ceil. This can be automated with a loop:
%obtaining one box:
Xrange=ceil(boxCentroids(2,1))-100:ceil(boxCentroids(2,1))+100; %each box is approximately 200x200 pixels.
Yrange=ceil(boxCentroids(2,2))-100:ceil(boxCentroids(2,2))+100;
boxIm=bwIm(Yrange,Xrange);
figure;
imshow(boxIm)
One of the box centroids (the first one, for the sample image) is actually the centroid of the whole grid, and thus will produce a noncentered image. All other centroids should be the centroids of the whitespace boxes. imwrite can be used to save each image.
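The loop and the imwrite step described above could be sketched as follows. The synthetic `bwIm` and `boxCentroids` values, the `halfSize` and `outputFolder` names, and the 'box_%03d.png' file pattern are assumptions for illustration. The sketch rounds with round rather than ceil and clamps the index ranges so that boxes near the image border do not cause an indexing error.

```matlab
% Synthetic stand-ins (assumptions) for the variables computed earlier.
bwIm = true(600, 600);                                    % binarized page
boxCentroids = [150.3 150.7; 450.2 150.1; 590.0 300.0];   % one [x, y] per row
halfSize = 100;                                           % boxes are roughly 200x200
outputFolder = tempdir;                                   % hypothetical output location

[imRows, imCols] = size(bwIm);
savedFiles = cell(size(boxCentroids, 1), 1);
for k = 1 : size(boxCentroids, 1)
    xc = round(boxCentroids(k, 1));
    yc = round(boxCentroids(k, 2));
    % Clamp the ranges to the image so border boxes stay in bounds.
    xRange = max(1, xc - halfSize) : min(imCols, xc + halfSize - 1);
    yRange = max(1, yc - halfSize) : min(imRows, yc + halfSize - 1);
    % Rows are indexed by y and columns by x.
    boxIm = bwIm(yRange, xRange);
    savedFiles{k} = fullfile(outputFolder, sprintf('box_%03d.png', k));
    imwrite(boxIm, savedFiles{k});
end
```

With the sample image, the loop could start at the second centroid to skip the one that belongs to the whole grid.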
5 Comments
Image Analyst
on 19 Jul 2017
Edited: Image Analyst
on 19 Jul 2017
After I fixed the first few problems, there were more. And the more I fixed it, the more it started to approach my code, for example you'd need to call imclearborder(), imfill(), fix the output filename, get the cropped image size correct, etc. So might as well just use my code, which already works. There is a reason my code is usually longer than others - it's flexible, general, robust, and extensively commented.