segmentObjectsFromEmbeddings

Segment objects in image using Segment Anything Model (SAM) feature embeddings

Since R2024a

Syntax

masks = segmentObjectsFromEmbeddings(sam,embeddings,imageSize,ForegroundPoints=pointPrompt)

masks = segmentObjectsFromEmbeddings(sam,embeddings,imageSize,BoundingBox=boxPrompt)

[masks,scores,maskLogits] = segmentObjectsFromEmbeddings(___)

[___] = segmentObjectsFromEmbeddings(___,Name=Value)

Description

masks = segmentObjectsFromEmbeddings(sam,embeddings,imageSize,ForegroundPoints=pointPrompt) segments objects from an image of size imageSize using the SAM feature embeddings embeddings and the foreground point coordinates pointPrompt as a visual prompt.

Note

This functionality requires Deep Learning Toolbox™, Computer Vision Toolbox™, and the Image Processing Toolbox™ Model for Segment Anything Model. You can install the Image Processing Toolbox Model for Segment Anything Model from Add-On Explorer. For more information about installing add-ons, see Get and Manage Add-Ons.

masks = segmentObjectsFromEmbeddings(sam,embeddings,imageSize,BoundingBox=boxPrompt) segments objects from an image using the bounding box coordinates boxPrompt as a visual prompt.

[masks,scores,maskLogits] = segmentObjectsFromEmbeddings(___) also returns the prediction scores, scores, for each predicted object mask and the mask prediction logits, maskLogits, using any combination of input arguments from previous syntaxes.

[___] = segmentObjectsFromEmbeddings(___,Name=Value) specifies options using one or more name-value arguments in addition to any combination of arguments from previous syntaxes. For example, ReturnMultiMask=true returns three masks for a segmented object.

Examples

Interactively Segment Image Using Segment Anything Model

Create a Segment Anything Model (SAM) object for image segmentation.

sam = segmentAnythingModel;

Load an image that contains the object to segment into the workspace.

I = imread("pears.png");

Define the image size.

imageSize = size(I);

Extract the feature embeddings from the image.

embeddings = extractEmbeddings(sam,I);

Specify the visual prompts for segmenting a single object from the image. Foreground points are points inside the object to segment, and background points are points outside the object. Each row specifies one point in the form [x y].

foregroundPoints = [512 400; 480 420];
backgroundPoints = [340 300];

Segment an object in the image using SAM segmentation, and return the object mask.

masks = segmentObjectsFromEmbeddings(sam,embeddings,imageSize, ...
    ForegroundPoints=foregroundPoints,BackgroundPoints=backgroundPoints);

Overlay the detected object mask on the test image.

imMask = insertObjectMask(I,masks);
imshow(imMask)

Display the foreground (green) and background (red) points used as visual prompts.

fx = foregroundPoints(:,1);
fy = foregroundPoints(:,2);
bx = backgroundPoints(:,1);
by = backgroundPoints(:,2);
hold on
plot(fx,fy,'g*',bx,by,'r*')
hold off

[Figure: the image with the object mask overlay and the prompt points plotted as markers.]
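
The example above uses point prompts only. As a complementary sketch, the same workflow can use a bounding box prompt through the BoundingBox argument. The box values here are hypothetical and chosen only so that the box lies inside pears.png; replace them with a box around the object you want to segment.

```matlab
% Create the model and compute the embeddings, as in the example above.
sam = segmentAnythingModel;
I = imread("pears.png");
imageSize = size(I);
embeddings = extractEmbeddings(sam,I);

% Hypothetical bounding box prompt in the form [x y width height].
boxPrompt = [420 330 180 140];

% Segment the object enclosed by the box.
masks = segmentObjectsFromEmbeddings(sam,embeddings,imageSize, ...
    BoundingBox=boxPrompt);

% Overlay the mask on the image.
imshow(insertObjectMask(I,masks))
```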

Input Arguments

`sam` — Segment Anything Model
`segmentAnythingModel` object

Segment Anything Model for semantic segmentation, specified as a segmentAnythingModel object.

`embeddings` — Image embeddings
64-by-64-by-256 array

Image embeddings, specified as a 64-by-64-by-256 array. Generate the embeddings for an image or a batch of images using the extractEmbeddings object function.
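
Because the embeddings are an input rather than something this function computes, you can extract them once and reuse them for several prompts on the same image. The following is a minimal sketch of that pattern. The point coordinates are hypothetical, and the loop variable names are illustrative only.

```matlab
sam = segmentAnythingModel;
I = imread("pears.png");
imageSize = size(I);

% Compute the embeddings once for the image.
embeddings = extractEmbeddings(sam,I);

% Hypothetical foreground points, one per object of interest, as [x y].
prompts = [512 400; 200 250; 620 150];

% Predict one mask per prompt from the same embeddings.
numPrompts = size(prompts,1);
masks = false([imageSize(1:2) numPrompts]);
for k = 1:numPrompts
    masks(:,:,k) = segmentObjectsFromEmbeddings(sam,embeddings,imageSize, ...
        ForegroundPoints=prompts(k,:));
end
```

Each page of masks then holds the mask for the corresponding row of prompts.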

`imageSize` — Size of image
1-by-3 vector | 1-by-2 vector

Size of the input image used to generate the embeddings, specified as a 1-by-3 vector of positive integers of the form [height width channels] or a 1-by-2 vector of positive integers of the form [height width], in pixels.

`pointPrompt` — Points of object to be segmented
`[]` (default) | P-by-2 matrix

Points of the object to be segmented, or foreground points, specified as a P-by-2 matrix. Each row specifies the xy-coordinates of a point in the form [x y]. P is the number of points.

Note

Specify at least one of these visual prompts for interactive segmentation: the foreground point coordinates pointPrompt or the object bounding box coordinates boxPrompt. You can combine either prompt with the optional name-value arguments.

`boxPrompt` — Rectangular bounding box
`[]` (default) | 1-by-4 vector

Rectangular bounding box that contains the object to be segmented, specified as a 1-by-4 vector of the form [x y width height]. The coordinates x and y specify the upper-left corner of the box, and width and height are the width and height of the box, respectively.

Note

Specify at least one of these visual prompts for interactive segmentation: the object bounding box coordinates boxPrompt or the foreground point coordinates pointPrompt. You can combine either prompt with the optional name-value arguments.

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Example: segmentObjectsFromEmbeddings(sam,embeddings,imageSize,ForegroundPoints=pointPrompt,BoundingBox=boxPrompt,BackgroundPoints=MyPoints) specifies the background point coordinates visual prompt as the array MyPoints.

`BackgroundPoints` — Background points
`[]` (default) | P-by-2 matrix

Background points, specified as a P-by-2 matrix. Each row specifies the xy-coordinates of a point in the form [x y]. P is the number of points. Use this argument to specify points in the image that are not part of the object to be segmented, as an additional visual prompt alongside foreground points or a bounding box.

`MaskLogits` — Mask prediction logits
`[]` (default) | 256-by-256-by-1 numeric array

Mask prediction logits, specified as a 256-by-256-by-1 numeric array. Use the maskLogits output of a previous call to the segmentObjectsFromEmbeddings function. Specify the MaskLogits argument to refine an existing mask.
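
As a minimal sketch of mask refinement, the first call below segments from a single foreground point and keeps the logits, and the second call adds a background point and passes those logits back through MaskLogits. The point coordinates are hypothetical, and the names refinedMasks and refinedScores are illustrative only.

```matlab
sam = segmentAnythingModel;
I = imread("pears.png");
imageSize = size(I);
embeddings = extractEmbeddings(sam,I);

% First pass: a single foreground point (hypothetical coordinates).
[masks,scores,maskLogits] = segmentObjectsFromEmbeddings(sam,embeddings,imageSize, ...
    ForegroundPoints=[512 400]);

% Second pass: add a background point and supply the previous logits.
[refinedMasks,refinedScores] = segmentObjectsFromEmbeddings(sam,embeddings,imageSize, ...
    ForegroundPoints=[512 400],BackgroundPoints=[340 300], ...
    MaskLogits=maskLogits);
```

Whether the refined mask improves on the first one depends on the prompts, so compare scores and refinedScores, or inspect both masks, before choosing one.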

`ReturnMultiMask` — Multiple segmentation masks
`false` or `0` (default) | `true` or `1`

Multiple segmentation masks, specified as a numeric or logical 0 (false) or 1 (true). Specify ReturnMultiMask as true to return three masks in place of the default single mask, where each mask is a page of an H-by-W-by-3 logical array. H and W are the height and width, respectively, of the input image, as specified by imageSize.

Use this argument to return three masks when you use ambiguous visual prompts, such as single points. You can choose one or a combination of the resulting masks to capture different sub-regions of the object.
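
One way to choose among the three masks is to keep the one with the highest prediction score. The sketch below assumes a single, hypothetical foreground point as the ambiguous prompt. The names idx and bestMask are illustrative only.

```matlab
sam = segmentAnythingModel;
I = imread("pears.png");
imageSize = size(I);
embeddings = extractEmbeddings(sam,I);

% A single point is an ambiguous prompt, so request three candidate masks.
[masks,scores] = segmentObjectsFromEmbeddings(sam,embeddings,imageSize, ...
    ForegroundPoints=[512 400],ReturnMultiMask=true);

% Keep the candidate with the highest prediction score.
[~,idx] = max(scores);
bestMask = masks(:,:,idx);

imshow(insertObjectMask(I,bestMask))
```

To combine candidates instead, you can take a logical OR of the selected pages, for example masks(:,:,1) | masks(:,:,2).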

Output Arguments

`masks` — Object masks
H-by-W logical matrix | H-by-W-by-3 logical array

Object masks, returned as one of these options:

H-by-W logical matrix – ReturnMultiMask is 0 (false).
H-by-W-by-3 logical array – ReturnMultiMask is 1 (true).

H and W are the height and width, respectively, of the input image, as specified by imageSize.

`scores` — Prediction scores
numeric scalar | 1-by-3 numeric vector

Prediction confidence scores for the segmentation, returned as one of these options:

Numeric scalar – ReturnMultiMask is 0 (false).
1-by-3 numeric vector – ReturnMultiMask is 1 (true).

`maskLogits` — Mask prediction logits
256-by-256 numeric matrix | 256-by-256-by-3 numeric array

Mask prediction logits, returned as one of these options:

256-by-256 numeric matrix – ReturnMultiMask value is 0 (false).
256-by-256-by-3 numeric array – ReturnMultiMask value is 1 (true).

Mask logits are raw, unnormalized predictions generated by the model. They are not probabilities themselves: a larger logit indicates a higher likelihood that the corresponding pixel belongs to the segmented object.

You can specify this value to the MaskLogits name-value argument on subsequent segmentObjectsFromEmbeddings function calls to refine the output mask.
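
To make the relationship between logits and likelihoods concrete, here is a minimal sketch. It assumes the logits follow the common binary-classification convention, in which a logistic sigmoid maps a logit to a probability and a logit of 0 maps to 0.5. This page does not state the normalization that the model uses, so treat the mapping as an assumption. The logits below are stand-in values for illustration, not model output.

```matlab
% Stand-in logits for illustration only. In practice, use the maskLogits
% output of segmentObjectsFromEmbeddings.
maskLogits = [-4 0 4];

% Logistic sigmoid (assumed convention): maps each logit into (0,1).
prob = 1 ./ (1 + exp(-maskLogits));

% Under this assumption, thresholding the probability at 0.5 is
% equivalent to thresholding the logits at 0.
isObject = prob > 0.5;
```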

References

[1] Kirillov, Alexander, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, et al. "Segment Anything," April 5, 2023. https://doi.org/10.48550/arXiv.2304.02643.

Version History

Introduced in R2024a
