deeplabv3plus

Create DeepLab v3+ convolutional neural network for semantic image segmentation

Since R2024a

Syntax

deepLabNetwork = deeplabv3plus(imageSize,numClasses,network)

deepLabNetwork = deeplabv3plus(___,DownsamplingFactor=value)

Description

deepLabNetwork = deeplabv3plus(imageSize,numClasses,network) returns a DeepLab v3+ network with the specified base network, number of classes, and image size.

deepLabNetwork = deeplabv3plus(___,DownsamplingFactor=value) additionally sets the downsampling factor (output stride) [1] to either 8 or 16. The downsampling factor determines how much the encoder section of DeepLab v3+ downsamples the input image.
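
As a rough illustration of what the two allowed values imply, the following sketch computes the approximate spatial size of the encoder output for a 480-by-640 input. It assumes that the encoder output is about the input size divided by the downsampling factor, which is how output stride is defined in [1]. The exact layer sizes in the network that deeplabv3plus creates may differ slightly because of padding and rounding, and the variable names are illustrative only.

```matlab
% Illustrative arithmetic only; this does not create or inspect a network.
inputSize = [480 640];           % [height width] of the input image
factors = [8 16];                % the two allowed DownsamplingFactor values

for f = factors
    % Assumption: encoder output size is roughly inputSize/f, rounded up.
    encoderSize = ceil(inputSize ./ f);
    fprintf("DownsamplingFactor=%d -> about %d-by-%d encoder output\n", ...
        f,encoderSize(1),encoderSize(2));
end
```

Under that assumption, a factor of 8 gives about 60-by-80 and a factor of 16 gives about 30-by-40. A smaller factor keeps more spatial detail in the encoder output at the cost of more computation and memory.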

Examples

Create DeepLab v3+ Network Based on ResNet-18

Create a DeepLab v3+ network based on ResNet-18.

imageSize = [480 640 3];
numClasses = 5;
network = "resnet18";
net = deeplabv3plus(imageSize,numClasses,network, ...
             DownsamplingFactor=16);

Display the network.

analyzeNetwork(net)

Train DeepLab v3+ Network

Load the triangle data set images using an image datastore. The datastore contains 200 grayscale images of random triangles. Each image is 32-by-32.

dataSetDir = fullfile(toolboxdir("vision"),"visiondata","triangleImages");
imageDir = fullfile(dataSetDir,"trainingImages");
imds = imageDatastore(imageDir);

Load the triangle data set pixel labels using a pixel label datastore.

labelDir = fullfile(dataSetDir, "trainingLabels");
classNames = ["triangle","background"];
labelIDs   = [255 0];
pxds = pixelLabelDatastore(labelDir,classNames,labelIDs);

Create a DeepLab v3+ network.

imageSize = [256 256];
numClasses = numel(classNames);
net = deeplabv3plus(imageSize,numClasses,"resnet18");

Combine image and pixel label data for training and apply a preprocessing transform to resize the training images.

cds = combine(imds,pxds);
tds = transform(cds, @(data)preprocessTrainingData(data,imageSize));

Specify training options. Lower the mini-batch size to reduce memory usage.

opts = trainingOptions("sgdm",...
    MiniBatchSize=8,...
    MaxEpochs=3);

Train the network.

net = trainnet(tds,net,"crossentropy",opts);

    Iteration    Epoch    TimeElapsed    LearnRate    TrainingLoss
    _________    _____    ___________    _________    ____________
            1        1       00:00:04         0.01         0.93844
           50        2       00:04:09         0.01        0.033749
           75        3       00:05:35         0.01        0.026353
Training stopped: Max epochs completed

Read a test image.

I = imread("triangleTest.jpg");

Resize the test image by a factor equal to the input image size divided by 32 so that the triangles in the test image are roughly equal to the size of the triangles during training.

I = imresize(I,Scale=imageSize./32);

Segment the image.

C = semanticseg(I,net);

Display the results.

B = labeloverlay(I,C);
figure
imshow(B)

Supporting Functions

function data = preprocessTrainingData(data, imageSize)
% Resize the training image and associated pixel label image.
data{1} = imresize(data{1},imageSize);
data{2} = imresize(data{2},imageSize);

% Convert grayscale input image into RGB for use with ResNet-18, which
% requires RGB image input.
data{1} = repmat(data{1},1,1,3);
end

Input Arguments

`imageSize` — Network input image size
2-element vector | 3-element vector

Network input image size, specified as a:

2-element vector in the format [height, width].
3-element vector in the format [height, width, 3]. The third element, 3, corresponds to RGB.
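
The two formats can be sketched as follows. This is a minimal example that assumes the ResNet-18 add-on is installed, as in the examples on this page. The names netA and netB are illustrative, and the sketch does not verify that the two calls produce identical networks.

```matlab
% Requires the ResNet-18 add-on, as in the examples on this page.
numClasses = 2;

% 2-element form: [height width].
netA = deeplabv3plus([256 256],numClasses,"resnet18");

% 3-element form: [height width 3], where 3 corresponds to RGB.
netB = deeplabv3plus([256 256 3],numClasses,"resnet18");
```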

`numClasses` — Number of classes
integer greater than 1

Number of classes for the network to classify, specified as an integer greater than 1.

`network` — Base network
`'resnet18'` | `'resnet50'` | `'mobilenetv2'` | `'xception'` | `'inceptionresnetv2'`

Base network, specified as resnet18 (Deep Learning Toolbox), resnet50 (Deep Learning Toolbox), mobilenetv2 (Deep Learning Toolbox), xception (Deep Learning Toolbox), or inceptionresnetv2 (Deep Learning Toolbox). You must install the corresponding network add-on.

Output Arguments

`deepLabNetwork` — DeepLab v3+ network
`dlnetwork` object

DeepLab v3+ network, returned as a dlnetwork (Deep Learning Toolbox) object for semantic image segmentation. The network uses an encoder-decoder architecture, dilated convolutions, and skip connections to segment images. You must train the network by using the trainnet (Deep Learning Toolbox) function before you can use it for semantic segmentation.
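
Because the output is a dlnetwork object, you can pass data through it with the standard dlnetwork functions. The following is a minimal sketch, assuming the ResNet-18 add-on is installed and that the network accepts a dlarray input in "SSCB" format. The random input and the variable names X and Y are illustrative. The network has not been trained for segmentation at this point, so the output values are not meaningful; the sketch only checks that data flows through the network.

```matlab
% Assumes the ResNet-18 add-on is installed.
imageSize = [256 256 3];
numClasses = 2;
net = deeplabv3plus(imageSize,numClasses,"resnet18");

% Make sure the learnable parameters are initialized before calling predict.
if ~net.Initialized
    net = initialize(net);
end

% Random single-precision input, labeled as
% spatial-spatial-channel-batch ("SSCB") with a batch size of one.
X = dlarray(rand(imageSize,"single"),"SSCB");

% Forward pass through the network. Train the network with trainnet
% before relying on these values for segmentation.
Y = predict(net,X);
size(Y)
```

For a segmentation network, the first two dimensions of Y are expected to match the input height and width, with one channel per class.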

Algorithms

When you use either the xception (Deep Learning Toolbox) or mobilenetv2 (Deep Learning Toolbox) base network to create a DeepLab v3+ network, the atrous spatial pyramid pooling (ASPP) and decoder subnetworks use depthwise separable convolutions. For all other base networks, these subnetworks use standard convolution layers.
This implementation of DeepLab v3+ does not include a global average pooling layer in the ASPP.
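
To give a sense of why the choice between separable and standard convolutions matters, the following sketch compares the number of weights (excluding biases) in a standard convolution with the number in a depthwise separable convolution. The filter size and channel counts are illustrative assumptions and are not taken from this implementation.

```matlab
% Illustrative layer dimensions; not taken from this implementation.
k = 3;        % filter height and width
cIn = 256;    % number of input channels
cOut = 256;   % number of output channels

% Standard convolution: one k-by-k-by-cIn filter per output channel.
standardWeights = k*k*cIn*cOut;

% Depthwise separable convolution: one k-by-k filter per input channel
% (depthwise step), followed by a 1-by-1 convolution that mixes the
% channels (pointwise step).
separableWeights = k*k*cIn + cIn*cOut;

fprintf("standard: %d, separable: %d, ratio: %.1f\n", ...
    standardWeights,separableWeights,standardWeights/separableWeights);
```

For these values, the standard convolution has 589824 weights and the separable version has 67840, which is about 8.7 times fewer.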

References

[1] Chen, L., Y. Zhu, G. Papandreou, F. Schroff, and H. Adam. "Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation." Computer Vision – ECCV 2018, 833-851. Munich, Germany: ECCV, 2018.

Extended Capabilities

GPU Code Generation
Generate CUDA® code for NVIDIA® GPUs using GPU Coder™.

Usage notes and limitations:

For code generation, you must first create a DeepLab v3+ network by using the deeplabv3plus function. Then, use the trainnet (Deep Learning Toolbox) function on the resulting dlnetwork object to train the network for segmentation. Once the network is trained and evaluated, you can generate code for the deep learning network object using GPU Coder™.

Version History

Introduced in R2024a
