
segmentObjects

Segment objects using SOLOv2 instance segmentation

Since R2023b

Description

masks = segmentObjects(detector,I) segments objects within a single image or array of images I using SOLOv2 instance segmentation, and returns the predicted object masks for the input image or images.

Note

This functionality requires Deep Learning Toolbox™ and the Computer Vision Toolbox™ Model for SOLOv2 Instance Segmentation. You can install the Computer Vision Toolbox Model for SOLOv2 Instance Segmentation from Add-On Explorer. For more information about installing add-ons, see Get and Manage Add-Ons.

[masks,labels] = segmentObjects(detector,I) also returns the labels assigned to the predicted object instance masks.

[masks,labels,scores] = segmentObjects(detector,I) also returns the prediction score for each predicted object instance mask.

dsResults = segmentObjects(detector,imds) segments objects within images in a datastore using SOLOv2 instance segmentation. The function returns a datastore with the instance segmentation results, including the instance masks, labels, and prediction scores.

[___] = segmentObjects(___,Name=Value) specifies options using one or more name-value arguments, in addition to any combination of arguments from previous syntaxes. For example, Threshold=0.9 specifies the confidence threshold as 0.9.

Examples

Segment Instances of Objects in Image

Create a pretrained SOLOv2 instance segmentation network.

model = solov2("light-resnet18-coco");

Read a test image that includes objects that the network can detect, such as dogs, into the workspace.

I = imread("kobi.png");

Segment instances of objects in the image using the SOLOv2 instance segmentation model.

[masks,labels,scores] = segmentObjects(model,I);

Display the instance segmentation results. Overlay the detected object instance mask on the test image.

overlayedImage = insertObjectMask(I,masks);
imshow(overlayedImage)

Segment Instances of Objects in Datastore of Images

Load a pretrained SOLOv2 instance segmentation network.

model = solov2("resnet50-coco");

Create a datastore of test images.

imageFiles = fullfile(toolboxdir("vision"),"visiondata","visionteam*.jpg");
dsTest = imageDatastore(imageFiles);

Segment instances of objects using the SOLOv2 instance segmentation model.

dsResults = segmentObjects(model,dsTest,Threshold=0.55);
Running SoloV2 network
--------------------------
* Processed 2 images.

For each test image, display the instance segmentation results. Overlay the detected object masks on the test image.

while hasdata(dsResults)
    testImage = read(dsTest);
    results = read(dsResults);
    maskColors = lines(numel(results{2}));
    figure
    overlayedImage = insertObjectMask(testImage,results{1},Color=maskColors);
    imshow(overlayedImage)
end

Input Arguments

SOLOv2 instance segmentation model, specified as a solov2 object.

Image or batch of images on which to perform instance segmentation, specified as one of these values.

Image Type | Data Format
Single grayscale image | 2-D matrix of size H-by-W
Single color image | 3-D array of size H-by-W-by-3
Batch of B grayscale or color images | 4-D array of size H-by-W-by-C-by-B, where the number of color channels C is 1 for grayscale images and 3 for color images

The height H and width W of each image must be greater than or equal to the input height h and width w of the network.
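For example, this sketch (a minimal illustration using two standard MATLAB example images) resizes two color images to a common size that is no smaller than the network input size, and then stacks them along the fourth dimension to form a batch:

detector = solov2("light-resnet18-coco");

% Resize two color test images to a common size so they can be stacked
% into a single H-by-W-by-3-by-2 batch. Choose a size at least as large
% as the network input size.
I1 = imresize(imread("peppers.png"),[800 1200]);
I2 = imresize(imread("kobi.png"),[800 1200]);
batch = cat(4,I1,I2);

% For a batch input, masks, labels, and scores are 2-by-1 cell arrays.
[masks,labels,scores] = segmentObjects(detector,batch);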

Datastore of images, specified as a datastore such as an ImageDatastore or CombinedDatastore object. If calling the datastore with the read function returns a cell array, then the image data must be in the first cell.
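If a datastore returns extra data alongside each image, you can wrap it in a transform so that the image lands in the first cell. A minimal sketch, assuming a transformed datastore is acceptable here because its read output places the image in the first cell:

detector = solov2("light-resnet18-coco");
imds = imageDatastore(fullfile(toolboxdir("vision"),"visiondata","visionteam*.jpg"));

% Wrap each image in a cell array; segmentObjects reads the image from
% the first cell.
dsTest = transform(imds,@(img) {img});
dsResults = segmentObjects(detector,dsTest);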

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Example: segmentObjects(detector,I,Threshold=0.9) specifies the confidence threshold as 0.9.

Options for All Image Formats

Confidence threshold, specified as a numeric scalar in the range [0, 1]. The segmentObjects function filters out predictions with confidence scores less than the threshold value. Increase this value to reduce the number of false positives, at the possible expense of missing some true positives.

Mask probability threshold, specified as a numeric scalar in the range [0, 1]. This threshold is applied to the mask probabilities, which the output activation function computes, to separate object mask pixels from background pixels. If the threshold is too high, the function can incorrectly classify some foreground object pixels as background pixels, reducing the accuracy of the segmentation.
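A hedged sketch of adjusting this threshold; the argument name MaskThreshold is an assumption based on related segmentObjects methods, so verify the name in the documentation for your release:

detector = solov2("light-resnet18-coco");
I = imread("kobi.png");

% Lower the mask probability threshold to recover more foreground pixels
% per instance. MaskThreshold is an assumed argument name.
masks = segmentObjects(detector,I,MaskThreshold=0.4);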

Select the strongest mask prediction for each segmented object instance using non-maximum suppression, specified as a numeric or logical 1 (true) or 0 (false).

  • true — Return the strongest object mask prediction per object. The segmentObjects function selects these predictions by using non-maximum suppression to eliminate overlapping bounding boxes based on their confidence scores.

  • false — Return all predictions. You can then apply a custom operation to eliminate overlapping object masks, as in the sketch after this list.
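This sketch (assuming the argument is named SelectStrongest, as in related detect and segmentObjects methods) returns all predictions and then applies a simple score-based filter as the custom operation:

detector = solov2("light-resnet18-coco");
I = imread("kobi.png");

% SelectStrongest is an assumed argument name; false returns every raw
% prediction, including overlapping masks.
[masks,labels,scores] = segmentObjects(detector,I,SelectStrongest=false);

% Example custom rule: keep only predictions scoring above 0.8.
keep = scores > 0.8;
masks = masks(:,:,keep);
labels = labels(keep);
scores = scores(keep);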

Network acceleration type to use for performance optimization, specified as one of these options:

  • "auto" — Automatically select optimizations suitable for the input network and environment.

  • "mex" — Compile and execute a MEX function. This option is available when using a GPU only. Using a GPU requires Parallel Computing Toolbox™ and a CUDA® enabled NVIDIA® GPU. If Parallel Computing Toolbox or a suitable GPU is not available, then the function returns an error. For information about the supported compute capabilities, see GPU Computing Requirements (Parallel Computing Toolbox).

  • "none" — Disable all acceleration.

Use network acceleration to improve performance when using the same instance segmentation network and segmentation parameters across multiple image inputs, at the expense of additional overhead on the initial function call, and a possible increase in memory usage.
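A hedged sketch of reusing one accelerated network across many images; the argument name Acceleration is an assumption based on related deep learning inference functions. The first call pays the MEX compilation overhead, and later calls reuse the compiled function:

detector = solov2("light-resnet18-coco");
files = ["image1.png" "image2.png"];   % hypothetical image files

for k = 1:numel(files)
    I = imread(files(k));
    % Acceleration is an assumed argument name; "mex" requires a
    % supported GPU and Parallel Computing Toolbox.
    masks = segmentObjects(detector,I,Acceleration="mex");
end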

Hardware resource on which to process images with the network, specified as one of the execution environment options in this table.

ExecutionEnvironment | Description
"auto" | Use a GPU if one is available. Otherwise, use the CPU. Using a GPU requires Parallel Computing Toolbox and a CUDA enabled NVIDIA GPU. For information about the supported compute capabilities, see GPU Computing Requirements (Parallel Computing Toolbox).
"gpu" | Use the GPU. If Parallel Computing Toolbox or a suitable GPU is not available, then the function returns an error. For information about the supported compute capabilities, see GPU Computing Requirements (Parallel Computing Toolbox).
"cpu" | Use the CPU.

Options for Datastore Inputs

Number of observations returned in each batch, specified as a positive integer. If you set a higher MiniBatchSize, segmentation requires more memory, which can cause errors if your system does not have sufficient memory.

You can specify this argument only when you specify a datastore of images, imds, as an input to the segmentObjects function.
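For example, this sketch processes the datastore two images at a time; larger values can run faster at the cost of more memory:

detector = solov2("light-resnet18-coco");
imds = imageDatastore(fullfile(toolboxdir("vision"),"visiondata","visionteam*.jpg"));
dsResults = segmentObjects(detector,imds,MiniBatchSize=2);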

Location to store writable data, specified as a string scalar or character vector. The specified folder must have write permissions. If the folder already exists, the segmentObjects function creates a new folder by appending the next available number to the folder name as a suffix. The default write location is fullfile(pwd,"SegmentObjectResults"), where pwd is the current working directory.

You can specify this argument only when you specify a datastore of images, imds, as an input to the segmentObjects function.

Data Types: char | string

Prefix added to written filenames, specified as a string scalar or character vector. The function names the output files NamePrefix_imageName.mat, where imageName is the name of the input image without its file extension.

You can specify this argument only when you specify a datastore of images, imds.

Data Types: char | string
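A minimal sketch combining the write options; the NamePrefix name appears above, while WriteLocation is an assumed name for the write location argument, based on related segmentObjects methods:

detector = solov2("light-resnet18-coco");
imds = imageDatastore(fullfile(toolboxdir("vision"),"visiondata","visionteam*.jpg"));

% WriteLocation is an assumed argument name. With these options, the
% results are written as solov2_imageName.mat files in solov2Results.
dsResults = segmentObjects(detector,imds, ...
    WriteLocation=fullfile(pwd,"solov2Results"), ...
    NamePrefix="solov2");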

Visible progress display, specified as a numeric or logical 1 (true) or 0 (false).

You can specify this argument only when you specify a datastore of images, imds.

Output Arguments

Object masks, returned as an H-by-W-by-M logical array for a single image or a B-by-1 cell array for a batch of B images. H and W are the height and width, respectively, of the input image I, and M is the number of object masks predicted in the image. Each of the M channels contains the mask for a single predicted object instance.

For a batch of B images, each cell of the B-by-1 cell array contains an H-by-W-by-M array of object masks for the corresponding image from the batch.

Object labels, returned as an M-by-1 categorical vector for a single image or a B-by-1 cell array for a batch of B images. M is the number of predicted object instances in the input image I.

For a batch of B images, each cell of the B-by-1 cell array contains an M-by-1 categorical vector with the labels of the objects in the corresponding image from the batch.

Prediction confidence scores, returned as an M-by-1 numeric vector for a single image or a B-by-1 cell array for a batch of B images. M is the number of predicted object instances in the input image I. A higher score indicates higher confidence in the object instance segmentation.

For a batch of B images, each cell of the B-by-1 cell array contains an M-by-1 numeric vector with the confidence scores for the object segmentation predictions in the corresponding image from the batch.
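For example, this sketch applies a score cutoff and tallies the surviving instances per class; the 0.7 cutoff is an arbitrary illustration:

detector = solov2("light-resnet18-coco");
I = imread(fullfile(toolboxdir("vision"),"visiondata","visionteam1.jpg"));
[masks,labels,scores] = segmentObjects(detector,I);

% Keep confident predictions only, then count instances per category.
keep = scores >= 0.7;
counts = countcats(labels(keep));
disp(table(categories(labels(keep)),counts))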

Predicted instance segmentation results, returned as a FileDatastore object. The function organizes the datastore so that calling the read and readall functions on it returns a cell array with three columns, which contain these results for each image:

masks — Binary masks, returned as a logical array of size H-by-W-by-M, where M is the number of predicted object instances in the corresponding image. Each mask is the segmentation of one object instance in the image.

labels — Object class names, returned as an M-by-1 categorical vector, where M is the number of predicted object instances in the corresponding image. All categorical data returned by the datastore contains the same categories.

scores — Prediction scores, returned as an M-by-1 numeric vector, where M is the number of predicted object instances in the corresponding image.
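For example, this sketch unpacks the results for the first image after running the datastore syntax (dsResults as returned above):

% Read all results at once. Row k holds the results for image k, with
% masks, labels, and scores in columns 1, 2, and 3.
results = readall(dsResults);
firstMasks  = results{1,1};
firstLabels = results{1,2};
firstScores = results{1,3};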

Version History

Introduced in R2023b