Most efficient way to separate numerically labeled objects that share borders?

Question

0 votes

Consider the following image mask:

IM = [0 0 0 1 1 1 1
0 1 1 1 1 1
2 2 2 1 1 1
2 2 2 2 1 0
2 2 2 2 0 0];

Objects 1 and 2 share a diagonal border. If I converted this image as shown above into a logical and passed it to regionprops, it would detect only 1 (merged) object, not 2.

I'm looking for the most computationally efficient way to separate these objects. The end result would look something like this:

want = [0 0 0 1 1 1 1
        0 0 0 0 1 1 1
        0 2 2 0 0 1 1
        2 2 2 2 0 0 0
        2 2 2 2 2 0 0];

...where regionprops(logical(want)) would result in 2 objects being detected.

The reason I'm asking is that my actual masks are 25000 x 25000 in size, with about 100,000 objects in each. Separating them would be more efficient in terms of both storage and downstream processing. Also, I'm aware that separating these objects will erode them a bit, and a secondary goal is to minimize the amount of erosion and preserve their overall structures, as some of the actual objects can be as little as 10 pixels wide (they are narrow cells).

So far I've thought of some ways to do this, including brute-forcing it by systematically checking each pixel and converting it to a 0 if one of its neighbors is anything other than a 0 or itself, e.g:

IM2 = padarray(IM, [1 1], 0, 'both'); % make some space
for i = 2:size(IM2, 1)-1
    for j = 2:size(IM2, 2)-1
        % get that pixel and its neighbors
        w = IM2(i-1:i+1, j-1:j+1);
        % if bordering another object, turn it to 0
        if any(~ismember(w(:), [0 IM2(i, j)])) 
            IM2(i, j) = 0;
        end
    end
end
out = IM2(2:end-1, 2:end-1); % remove padding

While this method technically works, it's time-consuming for a large image (about 6 hours per 25000 x 25000) and I have quite a few of these images...

I've also tried various other methods, including isolating individual objects, doing morphological erosion on each, and then mapping the results back on the original image, but this is turned out to be far worse than the brute-force method in terms of both computation time and structural preservation.

Does anyone here have experience with efficiently separating these kinds of objects, which are numerically labeled and could be adjacent to each other, such that they can be converted and stored as logical images without losing too much of their morphological properties?

2 Comments
Show None Hide None

KSSV on 10 Apr 2023

Have a look on bwlabel

Ran Yang on 10 Apr 2023

bwlabel does not solve this issue because when it converts IM above to logical, the distinction between objects 1 and 2 is lost. They need to be separated before bwlabel can be called.

Sign in to comment.

Sign in to answer this question.

Follow Question

Answer 1

Ran Yang on 10 Apr 2023

Edited: Ran Yang on 11 Apr 2023

Open in MATLAB Online

0 votes

sample.mat

---- Update on 2023-04-10 ----

Mostly figured this out. I just had to isolate each object in its bounding box instead of running imdilate across the entire image N times (code updated below).

load('sample.mat') % IM = sample 1000x1000 image

figure, imshow(label2rgb(IM, 'parula', 'k')); title('Adjacent objects');

IM2 = padarray(IM, [1 1], 0, 'both');

% reorder objects 1,2,3... (can comment this out if they are already in order)

IM2 = uint32(categorical(IM2)); IM2 = IM2 - 1;

b = false(size(IM2)); % allocate master boundary mask

R = regionprops(IM2);

for i = 1:size(R, 1)

% isolate each object in its boundingbox

bb = R(i).BoundingBox;

obj = IM2(floor(bb(2)):ceil(bb(2)+bb(4)), floor(bb(1)):ceil(bb(1)+bb(3)));

% remove other objects in this boundingbox

obj(obj ~= i) = 0;

% determine object perim in its boundingbox

obj = imdilate(obj, strel('disk', 1)) - obj;

% locate corresponding bbox in b and append to it

b(floor(bb(2)):ceil(bb(2)+bb(4)), floor(bb(1)):ceil(bb(1)+bb(3))) = ...

b(floor(bb(2)):ceil(bb(2)+bb(4)), floor(bb(1)):ceil(bb(1)+bb(3))) | obj;

end

out = IM2; out(b) = 0; % remove all boundaries

out = out(2:end-1, 2:end-1); % remove padding

figure, imshow(label2rgb(out, 'parula', 'k')); title('Separated objects');

This method is really fast. On the original 25000 x 25000 image with 120,000 objects, it finishes in ~90 seconds (as opposed to my original method, which took 40+ hours).

The only issue now is what is the best algorithm to preserve as much of each object as possible (or as Image Analyst pointed out in a comment, how to determine boundaries of adjacent objects). The above code uses imdilate to define a border around each object as an efficient approximation. This will erode away some objects if they're too small, though I currently don't see a way around sacrificing some morphology.

---- Original idea ----

After giving this some more thought, I've come up with a faster solution, which is to just calculate and remove the perimeters of each object. Method 1 does this via bwperim, while method 2 uses a dilated object minus the original.

% u = nonzeros(unique(IM(:)));
% b = false(size(IM)); % allocate boundary mask
% for i = 1:size(u, 1)
%     % isolate each object
%     obj = IM == u(i);
%     % method 1: find object inner perimeter and append to b
%     b(bwperim(obj)) = true;
%     % method 2: find object outer perimeter and append to b
%     % b(logical(imdilate(obj, strel('disk', 1)) - obj)) = true;
% end
% out = IM;
% out(b) = 0; % split objects by setting their boundaries to 0

Method 1 takes ~6% of the time of my initial brute-force method, but indescriminately erodes every object by a pixel, which might be an issue for smaller objects.

Method 2 is a little slower (~14%), but is better at preserving small objects.

0 Comments
Show -2 older comments Hide -2 older comments

Sign in to comment.

Answer 2

Image Analyst on 10 Apr 2023

Open in MATLAB Online

0 votes

You don't need bwlabel, like KSSV suggested, because your matrix is already labeled. You can simply use

oneBlob = ismember(IM, numberYouWant);

I don't see why bwperim() would not preserve small objects. I guess I'd need to see an example of that to prove it, like

mask = false(4,4);
mask(2, 2:3) = true
mask = 4×4 logical array
   0   0   0   0
   0   1   1   0
   0   0   0   0
   0   0   0   0
perims = bwperim(mask) % See?  All preserved
perims = 4×4 logical array
   0   0   0   0
   0   1   1   0
   0   0   0   0
   0   0   0   0

8 Comments
Show 6 older comments Hide 6 older comments

Ran Yang on 10 Apr 2023

Open in MATLAB Online

sample.mat

Sorry, I think my problem statement wasn't clear: I'm trying to separate objects such that when the image is converted to logical, the "separation" between objects is preserved.

Yes, my objects are already labeled. The issue is that sometimes, there are no zeros separating objects 1 and 2, such as in this example:

IM = repmat([1 1 1 2 2 2 0 0 3 3 3], 4, 1) % 3 objects in this 'image'
IM = 4×11
     1     1     1     2     2     2     0     0     3     3     3
     1     1     1     2     2     2     0     0     3     3     3
     1     1     1     2     2     2     0     0     3     3     3
     1     1     1     2     2     2     0     0     3     3     3

If I save this as a logical, I lose the information that there are 3 distinct objects.

IMR = regionprops(logical(IM));
size(IMR) % detects only 2 objects because 1 and 2 are fused (bad!)
ans = 1×2
     2     1

What I want is to "physically" separate each object, even if it means that I'll lose a little bit of object integrity.

want = repmat([1 1 0 2 2 2 0 0 3 3 3], 4, 1)
want = 4×11
     1     1     0     2     2     2     0     0     3     3     3
     1     1     0     2     2     2     0     0     3     3     3
     1     1     0     2     2     2     0     0     3     3     3
     1     1     0     2     2     2     0     0     3     3     3
wantR = regionprops(logical(want));
size(wantR) % detects 3 objects (good!)
ans = 1×2
     3     1

The issue is how do I efficiently go from IM --> want. It doesn't have to be precisely what's shown, but I need at least a solid border of zeros between objects 1 and 2, kind of like watershedding, in a way.

You're correct that bwperim by itself doesn't erode objects. But what I'm doing is using the result of bwperim as a mask, and assigning that mask as 0 in IM in order to try to separate object 1 from 2. This by definition will erode every object by about a pixel around its perimeter, as shown below:

u = nonzeros(unique(IM(:)));
b = false(size(IM)); % allocate boundary mask
for i = 1:size(u, 1)
    % isolate each object
    obj = IM == u(i);
    % method 1: find object inner perimeter and append to b
    b(bwperim(obj)) = true;
end
out = IM;
out(b) = 0 % split objects by setting their boundaries to 0
out = 4×11
     0     0     0     0     0     0     0     0     0     0     0
     0     1     0     0     2     0     0     0     0     3     0
     0     1     0     0     2     0     0     0     0     3     0
     0     0     0     0     0     0     0     0     0     0     0

You'll notice that all 3 objects are a lot smaller in out than in want. In particular, object 3 didn't need to be eroded at all, because it was already separated from 1 and 2 in the original image. But as I mentioned, this isn't a huge deal if the objects are sufficiently large. For small objects like what is shown above though, this method could significantly alter their properties. It is, however, the fastest method I've figured so far.

Hope that clarifies things a little. If it would help, I've also attached a small sample of the image I'm working with. You'll find that there are 448 (already labeled) objects, but converting the image directly to binary returns only 108, due to objects being fused together. The above bwperim method returns 444 objects, which is not ideal, but probably good enough.

load('sample.mat'); % sample IM containing 448 distinctly labeled objects
length(nonzeros(unique(IM(:))))
ans = 448
size(regionprops(logical(IM))) % direct conversion to logical merges objects!
ans = 1×2
   108     1
u = nonzeros(unique(IM(:)));
b = false(size(IM)); % allocate boundary mask
for i = 1:size(u, 1)
    % isolate each object
    obj = IM == u(i);
    % method 1: find object inner perimeter and append to b
    b(bwperim(obj)) = true;
end
out = IM;
out(b) = 0; % split objects by setting their boundaries to 0
size(regionprops(logical(out)))
ans = 1×2
   444     1

Ran Yang on 11 Apr 2023

Edited: Ran Yang on 11 Apr 2023

Long story short: It doesn't matter as long as the objects are separated such that regionprops still recognizes they're distinct objects after image is binarized.

Ideally, there would be minimal disruption (i.e. minimize the difference between regionprops(IM) and regionprops(logical(IM))), but I accept that there will be somewhat arbitrary definitions of boundaries, particularly when they're not neat lines in cardinal directions, and that there will be some losses in object morphology.

As a side note: I will take another crack at this later, but my current guess is that my bwperim and imdilate methods are slow because they're unnecessarily passing over an entire image N times, even though each object in isolation only takes up a tiny fraction of the actual image. Maybe using the bounding box of each object (instead of obj = IM == u(i);) would speed it up....

Also, now that I think of it, maybe a better question title would have been: "How can I minimize the difference between regionprops(IM) and regionprops(logical(IM))?"

Image Analyst on 11 Apr 2023

Open in MATLAB Online

Like I said, if you cast your labeled matrix to logical, then all your labeled blobs coalesce into one giant blob and regionprops will only give you measurements on the giant blob, not individual blobs.

I really don't know why you think you need to split apart the blobs to call regionprops rather than just put in IM directly and get the measurements for all the blobs at once. Not really sure what you want to measure but if you want the area and perimeter, for all hundred or so blobs, you can simply do this:

props = regionprops(IM, 'table', 'Area', 'Perimeter')

No need to split them apart and measure each one at a time.

Ran Yang on 11 Apr 2023

As mentioned before, I want to save my labeled matrix as logical for storage and downstream processing reasons. Please trust me that my reasons for doing this are sound. I wouldn't go out of my way to do this if there were a simpler solution.

I'm fully aware of how regionprops works, and the issue has nothing to do with regionprops. I have been doing image processing and data analysis in Matlab for nearly 10 years, and I am 100% confident that regionprops alone will not give me the result I'm looking for.

I don't care about the perimeter length, I care about the perimeter pixels. You asked if there are adjacent regions with different labels, whose pixels do I set to zero? There is your answer. Now I need to do that 100,000+ times per image, and quickly.

If you're still unconvinced about my concepts or methods, I strongly recommend that you check out my updated official Answer, and compare the before and after images. I just processed one of my original images and reduced the size from 1.64 GB to 10.3 MB, while keeping more than 95% of objects intact. All done in 1.5 minutes.

Once again, thank you for your time. Attempting to explain my issue to you helped me to come up with a solution on my own.

Sign in to comment.

Most efficient way to separate numerically labeled objects that share borders?

2 Comments
Show None Hide None

Answers (2)

0 Comments
Show -2 older comments Hide -2 older comments

8 Comments
Show 6 older comments Hide 6 older comments

Categories

Tags

Community Treasure Hunt

Most efficient way to separate numerically labeled objects that share borders?

2 Comments Show None Hide None

Answers (2)

0 Comments Show -2 older comments Hide -2 older comments

8 Comments Show 6 older comments Hide 6 older comments

Categories

Tags

See Also

Community Treasure Hunt

2 Comments
Show None Hide None

0 Comments
Show -2 older comments Hide -2 older comments

8 Comments
Show 6 older comments Hide 6 older comments