vision.CascadeObjectDetector

Detect objects using the Viola-Jones algorithm

Description

The cascade object detector uses the Viola-Jones algorithm to detect people’s faces, noses, eyes, mouth, or upper body. You can also use the Image Labeler to train a custom classifier to use with this System object. For details on how the function works, see Get Started with Cascade Object Detector.

To detect facial features or upper body in an image:

Create the vision.CascadeObjectDetector object and set its properties.
Call the object with arguments, as if it were a function.

To learn more about how System objects work, see What Are System Objects?

Creation

Syntax

detector = vision.CascadeObjectDetector

detector = vision.CascadeObjectDetector(model)

detector = vision.CascadeObjectDetector(XMLFILE)

detector = vision.CascadeObjectDetector(Name,Value)

Description

detector = vision.CascadeObjectDetector creates a detector to detect objects using the Viola-Jones algorithm.

example

detector = vision.CascadeObjectDetector(model) creates a detector configured to detect objects defined by the input character vector, model.

detector = vision.CascadeObjectDetector(XMLFILE) creates a detector and configures it to use the custom classification model specified with the XMLFILE input.

detector = vision.CascadeObjectDetector(Name,Value) sets properties using one or more name-value pairs. Enclose each property name in quotes. For example, detector = vision.CascadeObjectDetector('ClassificationModel','UpperBody')

Properties

expand all

Unless otherwise indicated, properties are nontunable, which means you cannot change their values after calling the object. Objects lock when you call them, and the release function unlocks them.

If a property is tunable, you can change its value at any time.

For more information on changing property values, see System Design in MATLAB Using System Objects.

`ClassificationModel` — Trained cascade classification model
`'FrontalFaceCART'` (default) | `character string`

Trained cascade classification model, specified as a character vector. The ClassificationModel property controls the type of object to detect. By default, the detector is configured to detect faces.

You can set this character vector to an XML file containing a custom classification model, or to one of the valid model character vectors listed below. You can train a custom classification model using the trainCascadeObjectDetector function. The function can train the model using Haar-like features, histograms of oriented gradients (HOG), or local binary patterns (LBP). For details on how to use the function, see Get Started with Cascade Object Detector.

Classification Model	Image Size Used to Train Model	Model Description
`'FrontalFaceCART'`(Default)	[20 20]	Detects faces that are upright and forward facing. This model is composed of weak classifiers, based on the classification and regression tree analysis (CART). These classifiers use Haar features to encode facial features. CART-based classifiers provide the ability to model higher-order dependencies between facial features. [1]
`'FrontalFaceLBP'`	[24 24]	Detects faces that are upright and forward facing. This model is composed of weak classifiers, based on a decision stump. These classifiers use local binary patterns (LBP) to encode facial features. LBP features can provide robustness against variation in illumination. [2]
`'UpperBody'`	[18 22]	Detects the upper-body region, which is defined as the head and shoulders area. This model uses Haar features to encode the details of the head and shoulder region. Because it uses more features around the head, this model is more robust against pose changes, e.g. head rotations/tilts. [3]
`'EyePairBig'` `'EyePairSmall'`	[11 45] [5 22]	Detects a pair of eyes. The `'EyePairSmall'` model is trained using a smaller image. This enables the model to detect smaller eyes than the `'EyePairBig'` model can detect.[4]
`'LeftEye'` `'RightEye'`	[12 18]	Detects the left and right eye separately. These models are composed of weak classifiers, based on a decision stump. These classifiers use Haar features to encode details.[4]
`'LeftEyeCART'` `'RightEyeCART'`	[20 20]	Detects the left and right eye separately. The weak classifiers that make up these models are CART-trees. Compared to decision stumps, CART-tree-based classifiers are better able to model higher-order dependencies. [5]
`'ProfileFace'`	[20 20]	Detects upright face profiles. This model is composed of weak classifiers, based on a decision stump. These classifiers use Haar features to encode face details.
`'Mouth'`	[15 25]	Detects the mouth. This model is composed of weak classifiers, based on a decision stump, which use Haar features to encode mouth details.[4]
`'Nose'`	[15 18]	This model is composed of weak classifiers, based on a decision stump, which use Haar features to encode nose details.[4]

`MinSize` — Size of smallest detectable object
`[]` (default) | two-element vector

Size of smallest detectable object, specified as a two-element vector [height width]. Set this property in pixels for the minimum size region containing an object. The value must be greater than or equal to the image size used to train the model. Use this property to reduce computation time when you know the minimum object size prior to processing the image. When you do not specify a value for this property, the detector sets it to the size of the image used to train the classification model.

For details explaining the relationship between setting the size of the detectable object and the ScaleFactor property, see Algorithms section.

Tunable: Yes

`MaxSize` — Size of largest detectable object
`[]` (default) | two-element vector

Size of largest detectable object, specified as a two-element vector [height width]. Specify the size in pixels of the largest object to detect. Use this property to reduce computation time when you know the maximum object size prior to processing the image. When you do not specify a value for this property, the detector sets it to size(I).

For details explaining the relationship between setting the size of the detectable object and the ScaleFactor property, see the Algorithms section.

`ScaleFactor` — Scaling for multiscale object detection
`1.1` (default) | scalar

Scaling for multiscale object detection, specified as a value greater than 1.0001. The scale factor incrementally scales the detection resolution between MinSize and MaxSize. You can set the scale factor to an ideal value using:

size(I)/(size(I)-0.5)

The detector scales the search region at increments between MinSize and MaxSize using the following relationship:

search region = round((Training Size)*(ScaleFactor^N))

N is the current increment, an integer greater than zero, and Training Size is the image size used to train the classification model.

Tunable: Yes

`MergeThreshold` — Detection threshold
`4` (default) | integer

Detection threshold, specified as an integer. The threshold defines the criteria needed to declare a final detection in an area where there are multiple detections around an object. Groups of colocated detections that meet the threshold are merged to produce one bounding box around the target object. Increasing this threshold may help suppress false detections by requiring that the target object be detected multiple times during the multiscale detection phase. When you set this property to 0, all detections are returned without performing thresholding or merging operation. This property is tunable.

`UseROI` — Use region of interest
`false` (default) | `true`

Use region of interest, specified as false or true. Set this property to true to detect objects within a rectangular region of interest within the input image.

Usage

Syntax

bbox = detector(I)

bbox = detector(I,roi)

detectionResults = detector(ds)

Description

bbox = detector(I) returns an M-by-4 matrix, bbox, that defines M bounding boxes containing the detected objects. The detector performs multiscale object detection on the input image, I.

bbox = detector(I,roi) detects objects within the rectangular search region specified by roi. Set the 'UseROI' property to true to use this syntax.I is a grayscale or truecolor (RGB) image.

detectionResults = detector(ds) detects objects within all the images returned by the read function of the input datastore.

Input Arguments

expand all

`I` — Input image
grayscale | truecolor (RGB)

Input image, specified as grayscale or truecolor (RGB).

`ds` — Datastore
`datastore` object

Datastore, specified as a datastore object containing a collection of images. Each image must be grayscale or RGB. The function processes only the first column of the datastore, which must contain images and must be cell arrays or tables with multiple columns. Therefore, datastore read function must return image data in the first column.

`model` — Classification model
`'FrontalFaceCART'` (default) | `character string`

Classification model, specified as a character vector. The model input describes the type of object to detect. There are several valid model character vectors, such as 'FrontalFaceCART', 'UpperBody', and 'ProfileFace'. See the ClassificationModel property description for a full list of available models.

`XMLFILE` — Custom classification model
XML file

Custom classification model, specified as an XML file. The XMLFILE can be created using the trainCascadeObjectDetector function or OpenCV (Open Source Computer Vision) training functionality. You must specify a full or relative path to the XMLFILE, if it is not on the MATLAB^® path.

`roi` — Rectangular region of interest
four-element vector (default)

Rectangular region of interest within image I, specified as a four-element vector, [x y width height].

Output Arguments

expand all

`bbox` — Detections
M-by-4 matrix (default)

Detections, returned as an M-by-4 element matrix. Each row of the output matrix contains a four-element vector, [x y width height], that specifies in pixels, the upper-left corner and size of a bounding box.

`detectionResults` — Detection results
3-column table

Detection results, returned as a 3-column table with variable names, Boxes, Scores, and Labels. The Boxes column contains M-by-4 matrices, of M bounding boxes for the objects found in the image. Each row contains a bounding box as a 4-element vector in the format [x,y,width,height]. The format specifies the upper-left corner location and size in pixels of the bounding box in the corresponding image.

Object Functions

To use an object function, specify the System object™ as the first input argument. For example, to release system resources of a System object named obj, use this syntax:

release(obj)

expand all

Common to All System Objects

`step`	Run System object algorithm
`release`	Release resources and allow changes to System object property values and input characteristics
`reset`	Reset internal states of System object

Examples

collapse all

Detect Faces in an Image Using the Frontal Face Classification Model

Open Live Script

Create a face detector object.

faceDetector = vision.CascadeObjectDetector;

Read the input image.

I = imread('visionteam.jpg');

Detect faces.

bboxes = faceDetector(I);

Annotate detected faces.

IFaces = insertObjectAnnotation(I,'rectangle',bboxes,'Face');   
figure
imshow(IFaces)
title('Detected faces');

Figure contains an axes object. The hidden axes object with title Detected faces contains an object of type image.

Detect Upper Body in Image Using Upper Body Classification Model

Open Live Script

Create a body detector object and set properties.

bodyDetector = vision.CascadeObjectDetector('UpperBody'); 
bodyDetector.MinSize = [60 60];
bodyDetector.MergeThreshold = 10;

Read input image and detect upper body.

I2 = imread('visionteam.jpg');
bboxBody = bodyDetector(I2);

Annotate detected upper bodies.

IBody = insertObjectAnnotation(I2,'rectangle',bboxBody,'Upper Body');
figure
imshow(IBody)
title('Detected upper bodies');

Figure contains an axes object. The hidden axes object with title Detected upper bodies contains an object of type image.

Algorithms

expand all

Classification Model Training

Each model is trained to detect a specific type of object. The classification models are trained by extracting features from a set of known images. These extracted features are then fed into a learning algorithm to train the classification model. Computer Vision Toolbox™ software uses the Viola-Jones cascade object detector. This detector uses HOG[7], LBP[8], and Haar-like [6] features and a cascade of classifiers trained using boosting.

The image size used to train the classifiers defines the smallest region containing the object. Training image sizes vary according to the application, type of target object, and available positive images. You must set the MinSize property to a value greater than or equal to the image size used to train the model.

Cascade of Classifiers

This object uses a cascade of classifiers to efficiently process image regions for the presence of a target object. Each stage in the cascade applies increasingly more complex binary classifiers, which allows the algorithm to rapidly reject regions that do not contain the target. If the desired object is not found at any stage in the cascade, the detector immediately rejects the region and processing is terminated. By terminating, the object avoids invoking computation-intensive classifiers further down the cascade.

Multiscale Object Detection

The detector incrementally scales the input image to locate target objects. At each scale increment, a sliding window, whose size is the same as the training image size, scans the scaled image to locate objects. The ScaleFactor property determines the amount of scaling between successive increments.

The search region size is related to the ScaleFactor in the following way:

search region = round((ObjectTrainingSize)*(ScaleFactor^N))

N is the current increment, an integer greater than zero, and ObjectTrainingSize is the image size used to train the classification model.

The search window traverses the image for each scaled increment.

Relationship Between MinSize, MaxSize, and ScaleFactor

Understanding the relationship between the size of the object to detect and the scale factor will help you set the properties accordingly. The MinSize and MaxSize properties limit the size range of the object to detect. Ideally, these properties are modified to reduce computation time when you know the approximate object size prior to processing the image. They are not designed to provide precise filtering of results, based on object size. The behavior of these properties is affected by the ScaleFactor. The scale factor determines the quantization of the search window sizes.

search region = round((Training Size)*(ScaleFactor^N))

The actual range of returned object sizes may not be exactly what you select for the MinSize and MaxSize properties. For example,

For a ScaleFactor value of 1.1 with a 24x24 training size, for 5 increments, the search region calculation would be:

>> search region = round(24*1.1.^(1:5))

>> 26 29 32 35 39

If you were to set MaxSize to 34, due to the search region quantization, the actual maximum object size used by the algorithm would be 32.

Merge Detection Threshold

For each increment in scale, the search window traverses over the image producing multiple detections around the target object. The multiple detections are merged into one bounding box per target object. You can use the MergeThreshold property to control the number of detections required before combining or rejecting the detections. The size of the final bounding box is an average of the sizes of the bounding boxes for the individual detections and lies between MinSize and MaxSize.

References

[1] Lienhart R., Kuranov A., and V. Pisarevsky "Empirical Analysis of Detection Cascades of Boosted Classifiers for Rapid Object Detection." Proceedings of the 25th DAGM Symposium on Pattern Recognition. Magdeburg, Germany, 2003.

[2] Ojala Timo, Pietikäinen Matti, and Mäenpää Topi, "Multiresolution Gray-Scale and Rotation Invariant Texture Classification with Local Binary Patterns". In IEEE Transactions on Pattern Analysis and Machine Intelligence, 2002. Volume 24, Issue 7, pp. 971-987.

[3] Kruppa H., Castrillon-Santana M., and B. Schiele. "Fast and Robust Face Finding via Local Context". Proceedings of the Joint IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance, 2003, pp. 157–164.

[4] Castrillón Marco, Déniz Oscar, Guerra Cayetano, and Hernández Mario, "ENCARA2: Real-time detection of multiple faces at different resolutions in video streams". In Journal of Visual Communication and Image Representation, 2007 (18) 2: pp. 130-140.

[5] Yu Shiqi "Eye Detection." Shiqi Yu’s Homepage. http://yushiqi.cn/research/eyedetection.

[6] Viola, Paul and Michael J. Jones, "Rapid Object Detection using a Boosted Cascade of Simple Features" , Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2001. Volume: 1, pp.511–518.

[7] Dalal, N., and B. Triggs, "Histograms of Oriented Gradients for Human Detection". IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Volume 1, (2005), pp. 886–893.

[8] Ojala, T., M. Pietikainen, and T. Maenpaa, "Multiresolution Gray-scale and Rotation Invariant Texture Classification With Local Binary Patterns". IEEE Transactions on Pattern Analysis and Machine Intelligence. Volume 24, No. 7 July 2002, pp. 971–987.

Extended Capabilities

C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.

Usage notes and limitations:

Generates portable C code using a C++ compiler that links to OpenCV (Version 3.4.0) libraries. See Portable C Code Generation for Functions That Use OpenCV Library.
See System Objects in MATLAB Code Generation (MATLAB Coder).
XMLFILE must be a compile-time constant.

Version History

Introduced in R2012a

vision.CascadeObjectDetector

Description

Creation

Syntax

Description

Properties

ClassificationModel — Trained cascade classification model 'FrontalFaceCART' (default) | character string

MinSize — Size of smallest detectable object [] (default) | two-element vector

MaxSize — Size of largest detectable object [] (default) | two-element vector

ScaleFactor — Scaling for multiscale object detection 1.1 (default) | scalar

MergeThreshold — Detection threshold 4 (default) | integer

UseROI — Use region of interest false (default) | true

Usage

Syntax

Description

Input Arguments

I — Input image grayscale | truecolor (RGB)

ds — Datastore datastore object

model — Classification model 'FrontalFaceCART' (default) | character string

XMLFILE — Custom classification model XML file

roi — Rectangular region of interest four-element vector (default)

Output Arguments

bbox — Detections M-by-4 matrix (default)

detectionResults — Detection results 3-column table

Object Functions

Common to All System Objects

Examples

Detect Faces in an Image Using the Frontal Face Classification Model

Detect Upper Body in Image Using Upper Body Classification Model

Algorithms

Classification Model Training

Cascade of Classifiers

Multiscale Object Detection

Relationship Between MinSize, MaxSize, and ScaleFactor

Merge Detection Threshold

References

Extended Capabilities

C/C++ Code Generation Generate C and C++ code using MATLAB® Coder™.

Version History

See Also

Topics

External Websites

`ClassificationModel` — Trained cascade classification model
`'FrontalFaceCART'` (default) | `character string`

`MinSize` — Size of smallest detectable object
`[]` (default) | two-element vector

`MaxSize` — Size of largest detectable object
`[]` (default) | two-element vector

`ScaleFactor` — Scaling for multiscale object detection
`1.1` (default) | scalar

`MergeThreshold` — Detection threshold
`4` (default) | integer

`UseROI` — Use region of interest
`false` (default) | `true`

`I` — Input image
grayscale | truecolor (RGB)

`ds` — Datastore
`datastore` object

`model` — Classification model
`'FrontalFaceCART'` (default) | `character string`

`XMLFILE` — Custom classification model
XML file

`roi` — Rectangular region of interest
four-element vector (default)

`bbox` — Detections
M-by-4 matrix (default)

`detectionResults` — Detection results
3-column table

C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.