yolov4ObjectDetector

Detect objects using YOLO v4 object detector

Since R2022a

Description

The yolov4ObjectDetector object creates a you only look once version 4 (YOLO v4) one-stage object detector for detecting objects in an image. Using this object, you can:

Create a pretrained YOLO v4 object detector by using YOLO v4 deep learning networks trained on the COCO dataset.
Create a custom YOLO v4 object detector by using any pretrained or untrained YOLO v4 deep learning network.

Creation

Syntax

detector = yolov4ObjectDetector(name)

detector = yolov4ObjectDetector(name,classes,aboxes)

detector = yolov4ObjectDetector(net,classes,aboxes)

detector = yolov4ObjectDetector(baseNet,classes,aboxes,DetectionNetworkSource=layer)

detector = yolov4ObjectDetector(___,Name=Value)

Description

Pretrained YOLO v4 Object Detector

detector = yolov4ObjectDetector(name) creates a pretrained YOLO v4 object detector by using YOLO v4 deep learning networks trained on the COCO dataset.

Custom YOLO v4 Object Detector

detector = yolov4ObjectDetector(name,classes,aboxes) creates a pretrained YOLO v4 object detector and configures it to perform transfer learning using a specified set of object classes and anchor boxes. For optimal results, you must train the detector on new training images before performing detection. Use the trainYOLOv4ObjectDetector function for training the detector.

detector = yolov4ObjectDetector(net,classes,aboxes) creates an object detector by using the deep learning network net.

If net is a pretrained YOLO v4 deep learning network, the function creates a pretrained YOLO v4 object detector. In this case, classes and aboxes are the values that were used to train the network.

If net is an untrained YOLO v4 deep learning network, the function creates a YOLO v4 object detector to use for training and inference. classes and aboxes specify the object classes and the anchor boxes, respectively, for training the YOLO v4 network.

Use the trainYOLOv4ObjectDetector function to train the network before performing object detection.

detector = yolov4ObjectDetector(baseNet,classes,aboxes,DetectionNetworkSource=layer) creates a YOLO v4 object detector by adding detection heads to a base network, baseNet.

The function adds detection heads to the specified feature extraction layers layer in the base network. To specify the names of the feature extraction layers, use the name-value argument DetectionNetworkSource=layer.

If baseNet is a pretrained deep learning network, the function creates a YOLO v4 object detector and configures it to perform transfer learning with the specified object classes and anchor boxes.

If baseNet is an untrained deep learning network, the function creates a YOLO v4 object detector and configures it for object detection. classes and aboxes specify the object classes and the anchor boxes, respectively, for training the YOLO v4 network.

You must train the detector on a training dataset before performing object detection. Use the trainYOLOv4ObjectDetector function for training the detector.

detector = yolov4ObjectDetector(___,Name=Value) specifies one or more options using name-value arguments in addition to any combination of input arguments from previous syntaxes. Use this syntax to:

Modify the detection network sources in a YOLO v4 object detection network and train the network with different numbers of object classes, anchor boxes, or both. Specify the new detection network sources using the DetectionNetworkSource=layer name-value argument.
Set the InputSize and ModelName properties of the object detector. For example, InputSize=[224 224 3] sets the size of the images used for training to [224 224 3].
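As a minimal sketch of the name-value syntax, the following code configures the pretrained tiny-yolov4-coco network for transfer learning and sets InputSize and ModelName at creation. The class names, anchor box values, and the model name "vehicleDetector" are illustrative assumptions, and the code assumes that the Computer Vision Toolbox Model for YOLO v4 Object Detection add-on is installed.

```matlab
% Illustrative class names and anchor boxes (assumed values).
% tiny-yolov4-coco has two detection heads, so aboxes has two cells.
classes = ["car","person"];
aboxes = {[122 177; 223 84; 80 94]; ...   % larger boxes, first output layer
          [111 38; 33 47; 37 18]};        % smaller boxes, second output layer

% Create the detector and set InputSize and ModelName at creation.
detector = yolov4ObjectDetector("tiny-yolov4-coco",classes,aboxes, ...
    InputSize=[224 224 3],ModelName="vehicleDetector");

% Inspect the properties that the name-value arguments set.
disp(detector.InputSize)
disp(detector.ModelName)
```

A detector created this way still needs training on new images before it is useful for detection.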

Note

To use the pretrained YOLO v4 object detection networks trained on the COCO dataset, you must install the Computer Vision Toolbox™ Model for YOLO v4 Object Detection. You can download and install the Computer Vision Toolbox Model for YOLO v4 Object Detection from Add-On Explorer. For more information about installing add-ons, see Get and Manage Add-Ons. This function also requires Deep Learning Toolbox™.

Input Arguments

`name` — Name of pretrained YOLO v4 deep learning network
`"csp-darknet53-coco"` | `"tiny-yolov4-coco"`

Name of the pretrained YOLO v4 deep learning network, specified as one of these:

"csp-darknet53-coco" — A pretrained YOLO v4 deep learning network created using CSP-DarkNet-53 as the base network and trained on COCO dataset.
"tiny-yolov4-coco" — A pretrained YOLO v4 deep learning network created using a small base network and trained on COCO dataset.

Data Types: char | string

`classes` — Names of object classes
string vector | cell array of character vectors | categorical vector

Names of object classes for training the detector, specified as a string vector, cell array of character vectors, or categorical vector. This argument sets the ClassNames property of the yolov4ObjectDetector object.

Data Types: char | string | categorical

`aboxes` — Anchor boxes
cell array

Anchor boxes for training the detector, specified as an N-by-1 column vector cell array or a 1-by-N row vector cell array. N is the number of output layers in the YOLO v4 deep learning network. Each cell contains an M-by-2 matrix, where M is the number of anchor boxes in that layer. Each cell can contain a different number of anchor boxes. Each row in the M-by-2 matrix denotes the size of an anchor box in the form [height width].

The first element in the cell array specifies the anchor boxes to associate with the first output layer, the second element in the cell array specifies the anchor boxes to associate with the second output layer, and so on. For accurate detection results, specify large anchor boxes for the first output layer and small anchor boxes for the last output layer. That is, the anchor box sizes must decrease for each output layer in the order in which the layers appear in the YOLO v4 deep learning network.

This argument sets the AnchorBoxes property of the yolov4ObjectDetector object.

Data Types: cell
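As a rough sketch of the ordering rule described above, the following code sorts a set of anchor boxes by area and splits them into one cell per output layer, largest boxes first. The box values and the two-layer split are illustrative assumptions; in practice, you would typically estimate the anchor boxes from your own training data.

```matlab
% Hypothetical anchor boxes, one per row, in [height width] form.
anchors = [37 18; 122 177; 33 47; 223 84; 111 38; 80 94];

% Sort by area so that the largest boxes come first.
[~,idx] = sort(anchors(:,1).*anchors(:,2),"descend");
anchors = anchors(idx,:);

% Split into one cell per output layer (assumed: two layers, three boxes each).
numLayers = 2;
boxesPerLayer = size(anchors,1)/numLayers;
aboxes = mat2cell(anchors,repmat(boxesPerLayer,numLayers,1),2);
```

The result is a 2-by-1 cell array whose first cell holds the three largest boxes, which is the layout the aboxes argument expects for a network with two output layers.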

`net` — YOLO v4 deep learning network
`dlnetwork` object

YOLO v4 deep learning network, specified as a dlnetwork (Deep Learning Toolbox) object. The input network can be either an untrained or a pretrained deep learning network. The input network must have an image input layer.

`baseNet` — Base network
`dlnetwork` object

Base network for creating the YOLO v4 deep learning network, specified as a dlnetwork (Deep Learning Toolbox) object. The network can be either an untrained or a pretrained deep learning network. The input network must have an image input layer.

`layer` — Names of feature extraction layers
cell array of character vectors | string array

Names of the feature extraction layers in the base network, specified as a cell array of character vectors, or a string array. The function creates a YOLO v4 network by adding detection head layers to the output of the feature extraction layers in the base network.

In a pretrained YOLO v4 base network, you must connect the detection head layers to feature extraction layers where the feature map of the input image changes spatial dimension by a downsampling factor of 4, 8, 16, or 32.

Use the analyzeNetwork (Deep Learning Toolbox) function to display the pretrained YOLO v4 network architecture and obtain the spatial dimensions of the feature maps.

Use this table to learn about the default feature extraction layers and to choose alternate feature extraction layers supported by the pretrained YOLO v4 networks. Choose feature extraction layers by empirical evaluation, based on the detection accuracy, training speed, and size of the objects to detect.

To achieve higher detection accuracy, choose feature extraction layers with both higher and lower downsampling factors. Layers with a lower downsampling factor capture small objects, and layers with a higher downsampling factor ensure thorough feature refinement.
To achieve higher training and inference speeds, choose feature extraction layers with a lower downsampling factor.
The number of detection network sources must equal the number of elements in the anchor box cell array, aboxes.

| Backbone | Number of Detection Heads | Default Feature Extraction Layers (Downsampling Factor) | Examples of Alternate Feature Extraction Layers (Downsampling Factor) | Remarks |
| --- | --- | --- | --- | --- |
| `tiny-yolov4-coco` | 2 | `leaky_28` (32), `leaky_25` (16) | `leaky_17` (8), `leaky_9` (4) | Select any two feature extraction layers with different downsampling factors preceding the `leaky_28` layer to use as detection network sources. For example, specify the new detection network sources as `layer = {'leaky_17','leaky_9'}`. |
| `csp-darknet53-coco` | 3 | `mish_106` (32), `mish_87` (16), `mish_56` (8) | `mish_88`, `mish_102` (32); `mish_68`, `mish_57` (16); `mish_37`, `mish_26` (8); `mish_25`, `mish_18`, `mish_13` (4) | Select any three feature extraction layers with different downsampling factors preceding the `mish_106` layer to use as detection network sources. For example, specify the new detection network sources as `layer = {'mish_57','mish_26','mish_13'}`. |

Data Types: char | string | cell
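The downsampling factor of a candidate layer is the ratio of the network input size to the spatial size of that layer's feature map. The following sketch computes the factors and checks that there is one cell of anchor boxes per detection network source. The feature map sizes are assumed values of the kind you would read from the analyzeNetwork display, and the anchor boxes are illustrative.

```matlab
% Network input size as [height width].
inputSize = [416 416];

% Candidate feature extraction layers and their feature map sizes.
% The sizes are assumed values, as read from the analyzeNetwork display.
layer = ["leaky_28","leaky_25"];
featureMapSize = [13 13; 26 26];

% Downsampling factor = input height / feature map height.
factor = inputSize(1) ./ featureMapSize(:,1);

% Illustrative anchor boxes: one cell per detection network source.
aboxes = {[122 177; 223 84; 80 94]; [111 38; 33 47; 37 18]};
assert(numel(layer) == numel(aboxes), ...
    'Specify one cell of anchor boxes per detection network source.')
```

Here factor evaluates to [32; 16], so the two layers have different downsampling factors, as the table requires.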

Properties

`Network` — YOLO v4 deep learning network
`dlnetwork` object

YOLO v4 deep learning network to use for object detection, stored as a dlnetwork (Deep Learning Toolbox) object.

`ClassNames` — Names of object classes
categorical vector

Names of object classes to detect, stored as a categorical vector. You can set this property by using the input argument classes.

`AnchorBoxes` — Set of anchor boxes
Read-only: N-by-1 cell array

This property is read-only.

Set of anchor boxes, stored as an N-by-1 cell array. N is the number of output layers in the YOLO v4 deep learning network for which the anchor boxes are defined. Each element in the cell array is an M-by-2 matrix, where M is the number of anchor boxes. Each cell can contain a different number of anchor boxes. Each row in the M-by-2 matrix denotes the size of an anchor box in the form [height width]. The first element in the cell array specifies the anchor boxes for the first output layer, the second element specifies the anchor boxes for the second output layer, and so on.

You can set this property by using the input argument aboxes.

`InputSize` — Image size used for training
Read-only: vector

This property is read-only.

Image size used for training, stored as a vector of form [height width] or [height width channels]. To set this property, specify it at object creation. The size of the training images must be a multiple of 32.

For example, detector = yolov4ObjectDetector(net,classes,aboxes,InputSize=[224 224 3]).
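If your images have an arbitrary size, one way to pick a valid InputSize is to round the height and width to the nearest multiple of 32. This is only a sketch; the image size is an assumed value, and rounding to the nearest multiple is one possible policy among several.

```matlab
% Hypothetical image size as [height width channels].
imageSize = [500 375 3];

% Round height and width to the nearest multiple of 32, with 32 as the minimum.
inputSize = [max(32,32*round(imageSize(1:2)/32)) imageSize(3)];

% Confirm that both spatial dimensions are multiples of 32.
isValid = all(mod(inputSize(1:2),32) == 0);
```

For the size above, inputSize is [512 384 3], which you can then pass as the InputSize name-value argument when you create the detector.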

`ModelName` — Name for object detector
`' '` (default) | character vector | string scalar

Name for the object detector, stored as a character vector or string scalar. To set this property, specify it at object creation.

For example, yolov4ObjectDetector(net,classes,aboxes,ModelName="customDetector") sets the name for the object detector to "customDetector".

`PredictedBoxType` — Bounding box format for object detector
Read-only: `"axis-aligned"` (default) | `"rotated"`

This property is read-only.

Bounding box format for an object detector, stored as "axis-aligned" or "rotated". When the PredictedBoxType is "axis-aligned", the object detector will train and perform inference on only axis-aligned bounding boxes. If it is set to "rotated", the object detector will train and perform inference on only rotated bounding boxes. Set this property when you create the object.

Object Functions

detect Detect objects using YOLO v4 object detector

Examples

Create Pretrained YOLO v4 Object Detector

Specify the name of a pretrained YOLO v4 deep learning network.

name = "tiny-yolov4-coco";

Create a YOLO v4 object detector by using the pretrained YOLO v4 network.

detector = yolov4ObjectDetector(name);

Display and inspect the properties of the YOLO v4 object detector.

disp(detector)

  yolov4ObjectDetector with properties:

        Network: [1×1 dlnetwork]
    AnchorBoxes: {2×1 cell}
     ClassNames: {80×1 cell}
      InputSize: [416 416 3]
      ModelName: 'tiny-yolov4-coco'

Use analyzeNetwork to display the YOLO v4 network architecture and get information about the network layers.

analyzeNetwork(detector.Network)

Detect objects in an unknown image by using the pretrained YOLO v4 object detector.

img = imread("highway.png");
[bboxes,scores,labels] = detect(detector,img);

Display the detection results.

detectedImg = insertObjectAnnotation(img,"Rectangle",bboxes,labels);
figure
imshow(detectedImg)
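The detect function also returns a confidence score for each bounding box. As a sketch of one way to use the scores, the following code keeps only the confident detections and builds an annotation string that combines each label with its score. The bboxes, scores, and labels values are hypothetical stand-ins for the outputs of detect so that the snippet runs on its own, and the 0.5 cutoff is an arbitrary choice.

```matlab
% Hypothetical stand-ins for the outputs of detect.
bboxes = [10 20 50 80; 100 40 60 120; 200 60 40 40];   % [x y width height]
scores = [0.92; 0.45; 0.71];
labels = categorical(["car"; "car"; "truck"]);

% Keep only the detections at or above an arbitrary score cutoff.
keep = scores >= 0.5;
keptBoxes = bboxes(keep,:);

% Build one annotation string per kept box, for example "car: 0.92".
annotations = string(labels(keep)) + ": " + compose("%.2f",scores(keep));
```

You can then use keptBoxes and annotations in place of bboxes and labels when you annotate the image.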

Create Custom YOLO v4 Object Detector

This example shows how to create a YOLO v4 object detection network based on a pretrained ResNet-50 convolutional neural network.

Load a pretrained deep learning network to use as the base network. This example uses the pretrained ResNet-50 network as the base network. For information about other available pretrained networks, see Pretrained Deep Neural Networks (Deep Learning Toolbox).

basenet = imagePretrainedNetwork("resnet50");

Use analyzeNetwork to display the architecture of the base network.

analyzeNetwork(basenet)

The first layer in the base network is the image input layer. Inspect the properties of the image input layer in the base network.

basenet.Layers(1)

ans = 
  ImageInputLayer with properties:

                      Name: 'input_1'
                 InputSize: [224 224 3]
        SplitComplexInputs: 0

   Hyperparameters
          DataAugmentation: 'none'
             Normalization: 'zerocenter'
    NormalizationDimension: 'auto'
                      Mean: [224×224×3 single]

To create a YOLO v4 deep learning network you must set the Normalization property of the ImageInputLayer in the base network to "none". Define an image input layer with the Normalization property set as "none" and other property values the same as those of the base network.

imageSize = basenet.Layers(1).InputSize;
layerName = basenet.Layers(1).Name;
newInputLayer = imageInputLayer(imageSize,Normalization="none",Name=layerName);

Replace the image input layer in the base network with the new input layer.

dlnet = replaceLayer(basenet,layerName,newInputLayer);

Specify the names of the feature extraction layers in the base network to use as the detection network sources. The function adds the detection heads to these layers.

featureExtractionLayers = ["activation_22_relu","activation_40_relu"];

Specify the class names and anchor boxes to use for training the YOLO v4 deep learning network created using ResNet-50 as the base network.

classes = ["car","person"];
anchorBoxes = {[122,177;223,84;80,94];...
               [111,38;33,47;37,18]};

Create a YOLO v4 object detector by using the specified base network and the detection heads.

detector = yolov4ObjectDetector(dlnet,classes,anchorBoxes, ...
    DetectionNetworkSource=featureExtractionLayers);

Display and inspect the properties of the YOLO v4 object detector.

disp(detector)

  yolov4ObjectDetector with properties:

             Network: [1×1 dlnetwork]
         AnchorBoxes: {2×1 cell}
          ClassNames: {2×1 cell}
           InputSize: [224 224 3]
    PredictedBoxType: 'axis-aligned'
           ModelName: ''

Use analyzeNetwork to display the YOLO v4 network architecture and get information about the network layers.

analyzeNetwork(detector.Network)

Detect People Using YOLO v4 Object Detector

This example shows how to detect people using a pretrained YOLO v4 object detector.

Load the pretrained YOLO v4 detector that can detect 80 common object classes.

detector = yolov4ObjectDetector();

Read image to process.

I = imread("visionteam.jpg");

Run object detector.

[bboxes, scores, labels] = detect(detector, I, Threshold=0.4);

Select detections for the person class.

personBoxes = bboxes(labels=="person", :);

Display results.

detectedImg = insertObjectAnnotation(I, "Rectangle", personBoxes, "person");
figure
imshow(detectedImg)

Extended Capabilities

C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.

The roi argument to the detect method must be a code generation constant (coder.const()) and a 1x4 vector.
Only the Threshold, SelectStrongest, MinSize, MaxSize, and MiniBatchSize name-value pairs for detect are supported.

GPU Code Generation
Generate CUDA® code for NVIDIA® GPUs using GPU Coder™.

The roi argument to the detect method must be a code generation constant (coder.const()) and a 1x4 vector.
Only the Threshold, SelectStrongest, MinSize, MaxSize, and MiniBatchSize name-value pairs for detect are supported.

For information about how to create a yolov4ObjectDetector object for code generation, see Load Pretrained Networks for Code Generation (MATLAB Coder).

GPU Arrays
Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™.

Usage notes and limitations:

GPU arrays are not supported for rotated rectangle bounding box inputs.

Version History

Introduced in R2022a

R2024b: Modify detection network source

Starting in R2024b, you can modify the detection network sources of a YOLO v4 object detection network when you perform transfer learning. You can specify the new detection network sources using the name-value argument DetectionNetworkSource=layer.

Specify new detection network sources for the tiny-yolov4-coco and csp-darknet53-coco pretrained YOLO v4 object detectors by using the syntax
```
detector = yolov4ObjectDetector(name,classes,aboxes,DetectionNetworkSource=layer);
```
Specify new detection network sources for a YOLO v4 object detection network by using the syntax
```
detector = yolov4ObjectDetector(net,classes,aboxes,DetectionNetworkSource=layer);
```

For example, this code modifies the detection network sources of the pretrained tiny-yolov4-coco network to leaky_23 and leaky_15. The code assumes that classes and aboxes are already defined in the workspace.

name = "tiny-yolov4-coco";
layer = ["leaky_23","leaky_15"];
detector = yolov4ObjectDetector(name,classes,aboxes,DetectionNetworkSource=layer);
