incrementalOneClassSVM
One-class support vector machine (SVM) model for incremental anomaly detection
Since R2023b
Description
The incrementalOneClassSVM function creates an incrementalOneClassSVM model object, which represents a one-class SVM model for incremental anomaly detection.
Unlike other Statistics and Machine Learning Toolbox™ model objects, incrementalOneClassSVM can be called directly. Also, you can specify learning options, such as the mini-batch size for each learning cycle, the learning rate, and whether to standardize the predictor data before fitting the model to data. After you create an incrementalOneClassSVM object, it is prepared for incremental learning (see Incremental Learning for Anomaly Detection).
incrementalOneClassSVM is best suited for incremental learning. For a traditional approach to anomaly detection when all the data is provided in advance, see ocsvm.
Note
Incremental learning functions support only numeric input predictor data. You must prepare an encoded version of categorical data to use incremental learning functions. Use dummyvar to convert each categorical variable to a dummy variable. For more details, see Dummy Variables.
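The encoding step described in this note can be sketched as follows. This is a minimal illustration, not a prescribed workflow: the variable names color and width and their values are hypothetical, and the sketch assumes a single categorical predictor alongside one numeric predictor.

```matlab
% Hypothetical categorical predictor with three levels
color = categorical(["red";"green";"blue";"green"]);

% Hypothetical numeric predictor for the same four observations
width = [1.2; 0.7; 3.1; 2.4];

% dummyvar returns one indicator column per category, in the order
% given by categories(color)
D = dummyvar(color);

% Concatenate the numeric predictor with the encoded columns to obtain
% a purely numeric predictor matrix
X = [width D];
```

Each row of D contains exactly one 1, so X has one numeric column followed by one indicator column per category.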
Creation
You can create an incrementalOneClassSVM model object in several ways:
Call the function directly: Configure incremental learning options, or specify learner-specific options, by calling incrementalOneClassSVM directly. This approach is best when you do not have data yet or you want to start incremental learning immediately.
Convert a traditionally trained model: To initialize a one-class SVM model for incremental learning using the model parameters of a trained OneClassSVM model object, you can convert the traditionally trained model to an incrementalOneClassSVM model object by passing it to the incrementalLearner function.
Call an incremental learning function: fit accepts a configured incrementalOneClassSVM model object and data as input, and returns an incrementalOneClassSVM model object updated with information learned from the input model and data.
Description
Mdl = incrementalOneClassSVM returns a default incremental one-class SVM model object Mdl for anomaly detection. Properties of a default model contain placeholders for unknown model parameters. You must train a default model before you can use it to detect anomalies.
Mdl = incrementalOneClassSVM(Name=Value) sets properties and additional options using name-value arguments. For example, incrementalOneClassSVM(ContaminationFraction=0.1,ScoreWarmupPeriod=1000) sets the anomaly contamination fraction to 0.1 and the score warm-up period to 1000.
Input Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Example: Shuffle=true,StandardizeData=true specifies to shuffle the observations at each iteration, and to standardize the predictor data.
RandomStream
— Random number stream
global stream (default) | random stream object
Random number stream for reproducibility of data transformation, specified as a random stream object. For details, see Random Feature Expansion.
Use RandomStream to reproduce the random basis functions used by incrementalOneClassSVM to transform the data in X to a high-dimensional space. For details, see Managing the Global Stream Using RandStream and Creating and Controlling a Random Number Stream.
Example: RandomStream=RandStream("mlfg6331_64")
Shuffle
— Flag for shuffling observations
true or 1 (default) | false or 0
This property is read-only.
Flag for shuffling the observations at each iteration, specified as a value in this table.
Value | Description
1 (true) | The software shuffles the observations in an incoming chunk of data before the fit function fits the model. This action reduces bias induced by the sampling scheme.
0 (false) | The software processes the data in the order received.
This option is valid only when Solver is "scale-invariant". When Solver is "sgd" or "asgd", the software always shuffles the observations in an incoming chunk of data before processing the data.
Example: Shuffle=false
Data Types: logical
StandardizeData
— Flag to standardize predictor data
"auto" (default) | true or 1 | false or 0
Flag to standardize the predictor data, specified as a value in this table.
Value | Description
"auto" | incrementalOneClassSVM determines whether the predictor variables need to be standardized.
1 (true) | The software standardizes the predictor data.
0 (false) | The software does not standardize the predictor data.
Under some conditions, incrementalOneClassSVM can override your specification. For more details, see Standardize Data.
Example: StandardizeData=true
Data Types: logical | char | string
Properties
You can set most properties by using name-value argument syntax when you call incrementalOneClassSVM directly. You can set some properties when you call incrementalLearner to convert a traditionally trained model object. You cannot set the properties FittedLoss, Mu, NumTrainingObservations, ScoreThreshold, Sigma, SolverOptions, and IsWarm.
One-Class SVM Model Parameters
ContaminationFraction
— Fraction of anomalies in training data
numeric scalar in the range [0,1]
This property is read-only.
Fraction of anomalies in the training data, specified as a numeric scalar in the range [0,1].
If the ContaminationFraction value is 0 (default), then incrementalOneClassSVM treats all training observations as normal observations, and sets the score threshold (ScoreThreshold property value) to the maximum anomaly score value of the training data.
If the ContaminationFraction value is in the range (0,1], then incrementalOneClassSVM determines the threshold value (ScoreThreshold property value) so that the function detects the specified fraction of training observations as anomalies.
The default ContaminationFraction value depends on how you create the model:
If you convert a traditionally trained model to create Mdl, ContaminationFraction is specified by the corresponding property of the traditionally trained model.
If you create Mdl by calling incrementalOneClassSVM directly, you can specify ContaminationFraction by using name-value argument syntax. If you do not specify the value, then the default value is 0.
Data Types: single | double
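The relationship between the contamination fraction and the score threshold can be illustrated with a small sketch. This is not the estimator that the software uses internally (which works on a running window of scores); it is a simplified illustration on a fixed, hypothetical score vector, and it assumes that no two scores are tied.

```matlab
% Hypothetical anomaly scores for ten training observations
scores = [-1.2 -0.8 0.3 -0.5 1.7 -0.9 0.1 -1.1 2.4 -0.3];
contaminationFraction = 0.2;   % illustrative value

% Number of observations to flag as anomalies
numAnomalies = floor(contaminationFraction*numel(scores));

% Choose the threshold so that exactly numAnomalies scores lie above it
% (assumes no tied scores)
sortedScores = sort(scores,"descend");
threshold = sortedScores(numAnomalies + 1);

% Observations with scores above the threshold are flagged
isAnomaly = scores > threshold;
```

With a contamination fraction of 0, the same steps select the maximum score as the threshold, so no observation is flagged, which matches the behavior described for the default value.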
FittedLoss
— Loss function used to fit linear model
"hinge"
This property is read-only.
Loss function used to fit the linear model, returned as "hinge". The function has the form $$\ell\left[y,f\left(x\right)\right]=\max\left[0,1-yf\left(x\right)\right]$$.
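As a quick numeric illustration of the hinge loss, the sketch below evaluates it for a few hypothetical label and score pairs. The function handle hingeLoss is defined here only for illustration and is not part of the toolbox.

```matlab
% Hinge loss for label y (+1 or -1) and model score f
hingeLoss = @(y,f) max(0, 1 - y.*f);

y = [1 1 -1];          % hypothetical labels
f = [2.5 0.4 0.4];     % hypothetical scores

% Zero loss when y.*f >= 1; otherwise the loss grows linearly
loss = hingeLoss(y,f);
```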
KernelScale
— Kernel scale parameter
"auto" | positive scalar
This property is read-only.
Kernel scale parameter, specified as "auto" or a positive scalar. incrementalOneClassSVM stores the KernelScale value as a numeric scalar. The software obtains a random basis for feature expansion by using the kernel scale parameter. For details, see Random Feature Expansion.
If you specify "auto" when creating the model object, the software selects an appropriate kernel scale parameter using a heuristic procedure. This procedure uses subsampling, so estimates might vary from one call to another. Therefore, to reproduce results, set a random number seed by using rng before training.
The default KernelScale value depends on how you create the model:
If you convert a traditionally trained model to create Mdl, KernelScale is specified by the corresponding property of the traditionally trained model.
Otherwise, the default value is 1.
Data Types: char | string | single | double
Mu
— Predictor means
numeric vector | []
This property is read-only.
Predictor means, specified as a numeric vector.
If Mu is an empty array [] and you specify StandardizeData=true, the incremental fitting function fit sets Mu to the predictor variable means estimated during the estimation period specified by EstimationPeriod.
You cannot specify Mu directly.
Data Types: single | double
NumExpansionDimensions
— Number of dimensions of expanded space
"auto" (default) | positive integer
This property is read-only.
Number of dimensions of the expanded space, specified as "auto" or a positive integer. incrementalOneClassSVM stores the NumExpansionDimensions value as a numeric scalar.
For "auto", the software selects the number of dimensions using 2.^ceil(min(log2(p)+5,15)), where p is the number of predictors. For details, see Random Feature Expansion.
The default NumExpansionDimensions value depends on how you create the model:
If you convert a traditionally trained model to create Mdl, NumExpansionDimensions is specified by the corresponding property of the traditionally trained model.
Otherwise, the default value is "auto".
Data Types: char | string | single | double
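To get a feel for the "auto" rule, the following sketch evaluates the quoted expression for a few predictor counts. The function handle autoDims and the values in p are chosen only for illustration.

```matlab
% Expression quoted above for NumExpansionDimensions="auto"
autoDims = @(p) 2.^ceil(min(log2(p)+5,15));

p = [1 6 100 5000];     % illustrative numbers of predictors
m = autoDims(p);        % m is [32 256 4096 32768]
```

The result is always a power of 2, and the min(...,15) term caps it at 2^15 = 32768 regardless of how many predictors the data has.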
NumPredictors
— Number of predictor variables
nonnegative numeric scalar
This property is read-only.
Number of predictor variables, specified as a nonnegative numeric scalar.
The default NumPredictors value depends on how you create the model:
If you convert a traditionally trained model to create Mdl, NumPredictors is specified by the corresponding property of the traditionally trained model.
If you create Mdl by calling incrementalOneClassSVM directly, you can specify NumPredictors by using name-value argument syntax. If you do not specify the value, then the default value is 0, and incremental fitting functions infer NumPredictors from the predictor data during training.
Data Types: double
NumTrainingObservations
— Number of observations fit to incremental model
0 (default) | nonnegative numeric scalar
This property is read-only.
Number of observations fit to the incremental model Mdl, specified as a nonnegative numeric scalar. NumTrainingObservations increases when you pass Mdl and training data to fit outside of the estimation period.
When fitting the model, the software ignores observations that contain at least one missing value.
If you convert a traditionally trained model to create Mdl, incrementalOneClassSVM does not add the number of observations fit to the traditionally trained model to NumTrainingObservations.
You cannot specify NumTrainingObservations directly.
Data Types: double
PredictorNames
— Predictor variable names
cell array of character vectors
This property is read-only.
Predictor variable names, specified as a cell array of character vectors. The order of the elements in PredictorNames corresponds to the order in which the predictor names appear in the training data. If the training data is in a table TBL, the predictor variable names must be a subset of the variable names in TBL, and the fit and isanomaly functions use only the selected variables. The software infers NumPredictors based on the value of PredictorNames.
Data Types: cell
Sigma
— Predictor standard deviations
numeric vector | []
This property is read-only.
Predictor standard deviations, specified as a numeric vector.
If Sigma is an empty array [] and you specify StandardizeData=true, the incremental fitting function fit sets Sigma to the predictor variable standard deviations estimated during the estimation period specified by EstimationPeriod.
You cannot specify Sigma directly.
Data Types: single | double
SGD and ASGD Solver Parameters
BatchSize
— Mini-batch size
10 (default) | positive integer
This property is read-only.
Mini-batch size for the stochastic solvers, specified as a positive integer. You cannot specify this parameter when Solver is "scale-invariant".
At each learning cycle during training, incrementalOneClassSVM uses BatchSize observations to compute the subgradient. The number of observations for the last mini-batch (last learning cycle in each function call of fit) can be smaller than BatchSize. For example, if you specify BatchSize=10 and supply 25 observations to fit, the function uses 10 observations for the first two learning cycles and 5 observations for the last learning cycle.
The default BatchSize value depends on how you create the model:
If you convert a traditionally trained model to create Mdl, and you specify Solver="sgd" or Solver="asgd", then BatchSize is specified by the corresponding property of the object.
If you create Mdl by calling incrementalOneClassSVM directly, the default value is 10.
Data Types: single | double
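The split of one chunk into learning cycles described above (25 observations with a mini-batch size of 10) can be reproduced with a few lines of arithmetic. This sketch only mirrors the counting; it does not call the solver, and the variable names are illustrative.

```matlab
batchSize = 10;     % value of BatchSize
numObs = 25;        % observations supplied in one call to fit

% Number of learning cycles needed to consume the chunk
numCycles = ceil(numObs/batchSize);

% Every cycle uses batchSize observations except possibly the last one
cycleSizes = repmat(batchSize,1,numCycles);
cycleSizes(end) = numObs - batchSize*(numCycles - 1);
% cycleSizes is [10 10 5]
```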
Lambda
— Ridge (L2) regularization term strength
"auto" | nonnegative scalar
This property is read-only.
Ridge (L2) regularization term strength, specified as "auto" or a nonnegative scalar.
You cannot specify this parameter unless you specify Solver="sgd" or Solver="asgd".
If you specify "auto" when creating the model object:
When Solver is "sgd" or "asgd", the software estimates Lambda during the estimation period using a heuristic procedure.
When Solver is "scale-invariant", then Lambda=NaN.
The default Lambda value depends on how you create the model:
If you convert a traditionally trained model to create Mdl, and you specify Solver="sgd" or Solver="asgd", then Lambda is specified by the corresponding property of the traditionally trained model. If you specify a different solver, the default value is NaN.
If you create Mdl by calling incrementalOneClassSVM directly, the default value is NaN.
Data Types: double | single
LearnRate
— Initial learning rate
"auto" | positive scalar
This property is read-only.
Initial learning rate, specified as "auto" or a positive scalar. incrementalOneClassSVM stores the LearnRate value as a numeric scalar.
You cannot specify this parameter when Solver is "scale-invariant".
The learning rate controls the optimization step size by scaling the objective subgradient. LearnRate specifies an initial value for the learning rate, and LearnRateSchedule determines the learning rate for subsequent learning cycles.
When you specify "auto":
The initial learning rate is 0.7.
If EstimationPeriod > 0, fit changes the rate to 1/sqrt(1+max(sum(X.^2,2))) at the end of EstimationPeriod, where X is the predictor data collected during the estimation period.
The default LearnRate value depends on how you create the model:
If you create Mdl by calling incrementalOneClassSVM directly, the default value is "auto".
Otherwise, the LearnRate name-value argument of the incrementalLearner function sets this property. The default value of the argument is "auto".
Data Types: single | double | char | string
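The "auto" update rule quoted above depends only on the largest squared row norm of the data seen during the estimation period. The sketch below evaluates it for a small, hypothetical predictor matrix X.

```matlab
% Hypothetical predictor data collected during the estimation period
% (rows are observations)
X = [3 4; 1 0; 0 2];

% Largest squared Euclidean norm over the rows of X (25 for this X)
maxSquaredNorm = max(sum(X.^2,2));

% Learning rate set at the end of the estimation period
learnRate = 1/sqrt(1 + maxSquaredNorm);
```

Because the rate shrinks as the largest row norm grows, predictors on large scales lead to smaller steps under this rule.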
LearnRateSchedule
— Learning rate schedule
"decaying" | "constant"
This property is read-only.
Learning rate schedule, specified as "decaying" or "constant", where LearnRate specifies the initial learning rate γ_{0}. incrementalOneClassSVM stores the LearnRateSchedule value as a character vector.
You cannot specify this parameter unless you specify Solver="sgd" or Solver="asgd".
Value | Description
"constant" | The learning rate is γ_{0} for all learning cycles.
"decaying" | The learning rate at learning cycle t is $$\gamma_{t}=\frac{\gamma_{0}}{\left(1+\lambda\gamma_{0}t\right)^{c}}.$$
The default LearnRateSchedule value depends on how you create the model:
If you convert a traditionally trained model object to create Mdl, the LearnRateSchedule name-value argument of the incrementalLearner function sets this property.
Otherwise, the default value is "decaying".
Data Types: char | string
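The two schedules can be compared numerically with a short sketch. The values of lambda and c below are assumptions chosen only for illustration (this section does not state the value of the exponent c), and gamma0 uses the "auto" starting value of 0.7 mentioned for LearnRate.

```matlab
gamma0 = 0.7;     % initial learning rate
lambda = 0.01;    % illustrative regularization strength (assumption)
c = 1;            % illustrative decay exponent (assumption)
t = 0:4;          % learning cycles

% "decaying" schedule: the rate shrinks as t grows
gammaDecaying = gamma0 ./ (1 + lambda*gamma0*t).^c;

% "constant" schedule: the rate stays at gamma0
gammaConstant = gamma0*ones(size(t));
```

At t = 0 both schedules return gamma0; afterward the decaying schedule is strictly smaller whenever lambda and c are positive.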
Score Threshold Parameters
IsWarm
— Flag indicating whether fit returns scores and detects anomalies
false or 0 | true or 1
This property is read-only.
Flag indicating whether the incremental fitting function fit returns scores and detects anomalies after training the model, specified as logical 0 (false) or 1 (true).
The incremental model Mdl is warm (IsWarm becomes true) after the fit function fits the incremental model to ScoreWarmupPeriod observations.
You cannot specify IsWarm directly.
If EstimationPeriod > 0, then during the estimation period, fit does not fit the model or update ScoreThreshold, and IsWarm is false.
Value | Description
1 (true) | The incremental model Mdl is warm. Consequently, fit trains the model and then returns scores and detects anomalies.
0 (false) | fit trains the model but returns all scores as NaN and all anomaly indicators as false.
Data Types: logical
ScoreThreshold
— Threshold for anomaly score
numeric scalar
This property is read-only.
Threshold for the anomaly score used to detect anomalies, specified as a numeric scalar. incrementalOneClassSVM detects observations with scores above the threshold as anomalies.
Note
If EstimationPeriod > 0, then during the estimation period, fit does not fit the model or update ScoreThreshold.
The default ScoreThreshold value depends on how you create the model:
If you convert a traditionally trained model object to create Mdl, then ScoreThreshold is specified by the corresponding property value of the object.
Otherwise, the default value is 0.
You cannot specify ScoreThreshold directly.
Data Types: single | double
ScoreWarmupPeriod
— Warm-up period before score output and anomaly detection
nonnegative integer
This property is read-only.
Warm-up period before score output and anomaly detection (outside the estimation period, if EstimationPeriod > 0), specified as a nonnegative integer. The ScoreWarmupPeriod value is the number of observations to which the incremental model must be fit before the incremental fit function returns scores and detects anomalies.
Note
When processing observations during the score warm-up period, the software ignores observations that contain at least one missing value.
You can return scores and detect anomalies during the warm-up period by calling isanomaly directly.
The default ScoreWarmupPeriod value depends on how you create the model:
If you convert a traditionally trained model to create Mdl, the ScoreWarmupPeriod name-value argument of the incrementalLearner function sets this property.
Otherwise, the default value is 0.
Data Types: single | double
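To estimate how many calls to fit pass before scores are returned, you can add the estimation period to the warm-up period and divide by the chunk size. The sketch below uses the values from the second example on this page and assumes that no observation is skipped because of missing values; the variable names are illustrative.

```matlab
% Values taken from the second example on this page
estimationPeriod = 2000;
scoreWarmupPeriod = 10000;
numObsPerChunk = 100;      % observations passed to fit per call

% Observations consumed before fit starts returning scores
% (assumes no observation is skipped for missing values)
numSilentObs = estimationPeriod + scoreWarmupPeriod;

% Number of fit calls that return NaN scores
numSilentChunks = ceil(numSilentObs/numObsPerChunk);   % 120
```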
ScoreWindowSize
— Running window size used to estimate score threshold
positive integer
This property is read-only.
Running window size used to estimate the score threshold (ScoreThreshold), specified as a positive integer.
The default ScoreWindowSize value depends on how you create the model:
If you convert a traditionally trained model to create Mdl, the ScoreWindowSize name-value argument of the incrementalLearner function sets this property.
Otherwise, the default value is 1000.
Data Types: double
Training Parameters
EstimationPeriod
— Number of observations processed to estimate hyperparameters
nonnegative integer
This property is read-only.
Number of observations processed by the incremental learner to estimate hyperparameters before training, specified as a nonnegative integer.
When processing observations during the estimation period, the software ignores observations that contain at least one missing value.
If Mdl is prepared for incremental learning (all hyperparameters required for training are specified), incrementalOneClassSVM forces EstimationPeriod to 0.
If Mdl is not prepared for incremental learning, incrementalOneClassSVM sets EstimationPeriod to 1000 and estimates the unknown hyperparameters.
For more details, see Estimation Period.
Data Types: single | double
Solver
— Objective function minimization technique
"scale-invariant" (default) | "sgd" | "asgd"
This property is read-only.
Objective function minimization technique, specified as a value in this table.
Value | Description
"scale-invariant" | Adaptive scale-invariant solver for incremental learning [1]
"sgd" | Stochastic gradient descent (SGD) [5][2]
"asgd" | Average stochastic gradient descent (ASGD) [6]
Data Types: char | string
SolverOptions
— Objective solver configurations
structure array
This property is read-only.
Objective solver configurations, specified as a structure array. The fields of SolverOptions depend on the value of Solver.
You can specify the field values using the corresponding name-value arguments when you create the model object by calling incrementalOneClassSVM directly, or when you convert a traditionally trained model using the incrementalLearner function.
You cannot specify SolverOptions directly.
Data Types: struct
Object Functions
Examples
Create Incremental Anomaly Detector Without Any Prior Information
Create a default one-class support vector machine (SVM) model for incremental anomaly detection.
Mdl = incrementalOneClassSVM;
Mdl.ScoreWarmupPeriod
ans = 0
Mdl.ContaminationFraction
ans = 0
Mdl is an incrementalOneClassSVM model object. All its properties are read-only. By default, the software sets the score warm-up period to 0 and the anomaly contamination fraction to 0.
Mdl must be fit to data before you can use it to perform any other operations.
Load Data
Load the 1994 census data stored in census1994.mat. The data set consists of demographic data from the US Census Bureau.
load census1994.mat
incrementalOneClassSVM does not support categorical predictors and does not use observations with missing values. Remove missing values in the data to reduce memory consumption and speed up training. Remove the categorical predictors.
adultdata = rmmissing(adultdata);
adultdata = removevars(adultdata,["workClass","education","marital_status", ...
    "occupation","relationship","race","sex","native_country","salary"]);
Fit Incremental Model
Fit the incremental model Mdl to the data in the adultdata table by using the fit function. Because ScoreWarmupPeriod = 0, fit returns scores and detects anomalies immediately after fitting the model for the first time. To simulate a data stream, fit the model in chunks of 100 observations at a time. At each iteration:
Process 100 observations.
Overwrite the previous incremental model with a new one fitted to the incoming observations.
Store medianscore, the median score value of the data chunk, to see how it evolves during incremental learning.
Store allscores, the score values for the fitted observations.
Store threshold, the score threshold value for anomalies, to see how it evolves during incremental learning.
Store numAnom, the number of detected anomalies in the data chunk.
n = numel(adultdata(:,1));
numObsPerChunk = 100;
nchunk = floor(n/numObsPerChunk);
medianscore = zeros(nchunk,1);
threshold = zeros(nchunk,1);
numAnom = zeros(nchunk,1);
allscores = [];

% Incremental fitting
rng(0,"twister"); % For reproducibility
for j = 1:nchunk
    ibegin = min(n,numObsPerChunk*(j-1) + 1);
    iend = min(n,numObsPerChunk*j);
    idx = ibegin:iend;
    Mdl = fit(Mdl,adultdata(idx,:));
    [isanom,scores] = isanomaly(Mdl,adultdata(idx,:));
    medianscore(j) = median(scores);
    allscores = [allscores scores'];
    numAnom(j) = sum(isanom);
    threshold(j) = Mdl.ScoreThreshold;
end
Mdl is an incrementalOneClassSVM model object trained on all the data in the stream. The fit function fits the model to the data chunk, and the isanomaly function returns the observation scores and the indices of observations in the data chunk with scores above the score threshold value.
Analyze Incremental Model During Training
Plot the anomaly score for every observation.
plot(allscores,".")
xlabel("Observation")
ylabel("Score")
xlim([0 n])
At each iteration, the software calculates a score value for each observation in the data chunk. A negative score value with large magnitude indicates a normal observation, and a large positive value indicates an anomaly.
To see how the score threshold and median score per data chunk evolve during training, plot them on separate tiles.
figure
tiledlayout(2,1);
nexttile
plot(medianscore,".")
ylabel("Median Score")
xlabel("Iteration")
xlim([0 nchunk])
nexttile
plot(threshold,".")
ylabel("Score Threshold")
xlabel("Iteration")
xlim([0 nchunk])
finalScoreThreshold = Mdl.ScoreThreshold
finalScoreThreshold = 0.1799
The median score is negative for the first several iterations, then rapidly approaches zero. The anomaly score threshold immediately rises from its (default) starting value of 0 to 1.3, and then gradually approaches 0.18. Because ContaminationFraction = 0, incrementalOneClassSVM treats all training observations as normal observations, and at each iteration sets the score threshold to the maximum score value in the data chunk.
totalAnomalies = sum(numAnom)
totalAnomalies = 0
No anomalies are detected at any iteration, because ContaminationFraction = 0.
Configure Incremental Learning Options and Analyze Model During Training
Prepare an incremental one-class SVM model by specifying an anomaly contamination fraction of 0.001, and standardize the data using an initial estimation period of 2000 observations. Specify a score warm-up period of 10,000 observations, during which the fit function updates the score threshold and trains the model but does not return scores or detect anomalies.
Mdl = incrementalOneClassSVM(ContaminationFraction=0.001, ...
    StandardizeData=true,EstimationPeriod=2000, ...
    ScoreWarmupPeriod=10000);
Mdl is an incrementalOneClassSVM model object. All its properties are read-only. Mdl must be fit to data before you can use it to perform any other operations.
Load Data
Load the 1994 census data stored in census1994.mat. The data set consists of demographic data from the US Census Bureau.
load census1994.mat
incrementalOneClassSVM does not support categorical predictors and does not use observations with missing values. Remove missing values in the data to reduce memory consumption and speed up training. Remove the categorical predictors.
adultdata = rmmissing(adultdata);
Xtrain = removevars(adultdata,["workClass","education","marital_status", ...
    "occupation","relationship","race","sex","native_country","salary"]);
Fit Incremental Model and Detect Anomalies
Fit the incremental model Mdl to the data by using the fit function. To simulate a data stream, fit the model in chunks of 100 observations at a time. Because EstimationPeriod = 2000 and ScoreWarmupPeriod = 10000, fit returns scores and detects anomalies only after 120 iterations. At each iteration:
Process 100 observations.
Overwrite the previous incremental model with a new one fitted to the incoming observations.
Store meanscore, the mean score value of the data chunk, to see how it evolves during incremental learning.
Store threshold, the score threshold value for anomalies, to see how it evolves during incremental learning.
Store numAnom, the number of detected anomalies in the chunk, to see how it evolves during incremental learning.
n = numel(Xtrain(:,1));
numObsPerChunk = 100;
nchunk = floor(n/numObsPerChunk);
meanscore = zeros(nchunk,1);
threshold = zeros(nchunk,1);
numAnom = zeros(nchunk,1);

% Incremental fitting
rng("default"); % For reproducibility
for j = 1:nchunk
    ibegin = min(n,numObsPerChunk*(j-1) + 1);
    iend = min(n,numObsPerChunk*j);
    idx = ibegin:iend;
    [Mdl,tf,scores] = fit(Mdl,Xtrain(idx,:));
    meanscore(j) = mean(scores);
    numAnom(j) = sum(tf);
    threshold(j) = Mdl.ScoreThreshold;
end
Mdl is an incrementalOneClassSVM model object trained on all the data in the stream.
Analyze Incremental Model During Training
To see how the mean score, score threshold, and number of detected anomalies per chunk evolve during training, plot them on separate tiles.
tiledlayout(3,1);
nexttile
plot(meanscore)
ylabel("Mean Score")
xlabel("Iteration")
xlim([0 nchunk])
xline(Mdl.EstimationPeriod/numObsPerChunk,"r-.")
xline((Mdl.EstimationPeriod+Mdl.ScoreWarmupPeriod)/numObsPerChunk,"r")
nexttile
plot(threshold)
ylabel("Score Threshold")
xlabel("Iteration")
xlim([0 nchunk])
xline(Mdl.EstimationPeriod/numObsPerChunk,"r-.")
xline((Mdl.EstimationPeriod+Mdl.ScoreWarmupPeriod)/numObsPerChunk,"r")
nexttile
plot(numAnom,"+")
ylabel("Anomalies")
xlabel("Iteration")
xlim([0 nchunk])
ylim([0 max(numAnom)+0.2])
xline(Mdl.EstimationPeriod/numObsPerChunk,"r-.")
xline((Mdl.EstimationPeriod+Mdl.ScoreWarmupPeriod)/numObsPerChunk,"r")
During the estimation period, fit estimates means and standard deviations using the observations, and does not fit the model or update the score threshold. During the warm-up period, the fit function fits the model and updates the score threshold, but returns all scores as NaN and all anomaly values as false. After the warm-up period, fit returns the observation scores and the indices of observations with scores above the score threshold value. A negative score value with large magnitude indicates a normal observation, and a large positive value indicates an anomaly.
totalAnomalies = sum(numAnom)
totalAnomalies = 18
anomfrac = totalAnomalies/(n-Mdl.EstimationPeriod-Mdl.ScoreWarmupPeriod)
anomfrac = 9.9108e-04
The software detected 18 anomalies after the estimation and warm-up periods. The contamination fraction after the estimation and warm-up periods is approximately 0.001.
More About
Incremental Learning for Anomaly Detection
Incremental learning, or online learning, is a branch of machine learning concerned with processing incoming data from a data stream, possibly given little to no knowledge of the distribution of the predictor variables, aspects of the prediction or objective function (including tuning parameter values), or whether the observations contain anomalies. Incremental learning differs from traditional machine learning, where enough data is available to fit to a model, perform cross-validation to tune hyperparameters, and infer the predictor distribution.
Anomaly detection is used to identify unexpected events and departures from normal behavior. In situations where the full data set is not immediately available, or new data is arriving, you can use incremental learning for anomaly detection to incrementally train a model so it adjusts to the characteristics of the incoming data.
Given incoming observations, an incremental learning model for anomaly detection does the following:
Computes anomaly scores
Updates the anomaly score threshold
Detects data points above the score threshold as anomalies
Fits the model to the incoming observations
For more information, see Incremental Anomaly Detection with MATLAB.
Adaptive Scale-Invariant Solver for Incremental Learning
The adaptive scale-invariant solver for incremental learning, introduced in [1], is a gradient-descent-based objective solver for training linear predictive models. The solver is hyperparameter free, insensitive to differences in predictor variable scales, and does not require prior knowledge of the distribution of the predictor variables. These characteristics make it well suited to incremental learning.
The incremental fitting function fit uses the more aggressive ScInOL2 version of the algorithm to train binary learners. The function always shuffles an incoming batch of data before fitting the model.
Random Feature Expansion
Random feature expansion, such as Random Kitchen Sinks[4] or Fastfood[3], is a scheme to approximate Gaussian kernels of the kernel classification algorithm to use for big data in a computationally efficient way. Random feature expansion is more practical for big data applications that have large training sets, but can also be applied to smaller data sets that fit in memory.
The kernel classification algorithm searches for an optimal hyperplane that separates the data into two classes after mapping features into a highdimensional space. Nonlinear features that are not linearly separable in a lowdimensional space can be separable in the expanded highdimensional space. All the calculations for hyperplane classification use only dot products. You can obtain a nonlinear classification model by replacing the dot product x_{1}x_{2}' with the nonlinear kernel function $$G({x}_{1},{x}_{2})=\langle \phi ({x}_{1}),\phi ({x}_{2})\rangle $$, where x_{i} is the ith observation (row vector) and φ(x_{i}) is a transformation that maps x_{i} to a highdimensional space (called the “kernel trick”). However, evaluating G(x_{1},x_{2}) (Gram matrix) for each pair of observations is computationally expensive for a large data set (large n).
The random feature expansion scheme finds a random transformation so that its dot product approximates the Gaussian kernel. That is,
$$G({x}_{1},{x}_{2})=\langle \phi ({x}_{1}),\phi ({x}_{2})\rangle \approx T({x}_{1})T({x}_{2})\text{'},$$
where T(x) maps x in $${\mathbb{R}}^{p}$$ to a high-dimensional space ($${\mathbb{R}}^{m}$$). The Random Kitchen Sinks scheme uses the random transformation
$$T(x)={m}^{-1/2}\mathrm{exp}\left(iZx\text{'}\right)\text{'},$$
where $$Z\in {\mathbb{R}}^{m\times p}$$ is a sample drawn from $$N\left(0,{\sigma}^{-2}\right)$$ and σ is a kernel scale. This scheme requires O(mp) computation and storage.
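The approximation can be checked numerically with base MATLAB functions. The following is a minimal sketch, not the toolbox implementation: it assumes the Gaussian kernel has the form exp(-||x1 - x2||^2/(2σ^2)), and the names `T`, `approxKernel`, and `exactKernel`, as well as the data values, are hypothetical choices made for illustration.

```matlab
% Sketch: approximate a Gaussian kernel with Random Kitchen Sinks features.
rng(0);                      % fix the seed so the run is repeatable
p = 3;                       % number of predictors
m = 20000;                   % number of expansion dimensions
sigma = 1.5;                 % kernel scale

x1 = [0.2 -0.4 1.0];         % two observations (row vectors)
x2 = [0.5  0.1 0.3];

% Each entry of Z is drawn from N(0, sigma^(-2)), that is, with
% standard deviation 1/sigma.
Z = randn(m, p) / sigma;

% T(x) = m^(-1/2) * exp(i*Z*x').'  (.' is the nonconjugate transpose)
T = @(x) exp(1i * (Z * x.')).' / sqrt(m);

% For complex features, the inner product conjugates the second argument,
% which is what the ' operator does in MATLAB.
approxKernel = real(T(x1) * T(x2)');

% Exact Gaussian kernel under the assumed form, with the same kernel scale.
exactKernel = exp(-sum((x1 - x2).^2) / (2 * sigma^2));

fprintf('approx = %.4f, exact = %.4f\n', approxKernel, exactKernel);
```

The two printed values should agree more closely as m grows, because the approximation error shrinks roughly in proportion to 1/sqrt(m).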
The Fastfood scheme introduces another random basis V instead of Z using Hadamard matrices combined with Gaussian scaling matrices. This random basis reduces the computation cost to O(m log p) and reduces storage to O(m).
You can specify values for m and σ by setting NumExpansionDimensions and KernelScale, respectively, of incrementalOneClassSVM.
The incrementalOneClassSVM function uses the Fastfood scheme for random feature expansion, and uses linear classification to train a one-class Gaussian kernel classification model.
Algorithms
Estimation Period
During the estimation period, the incremental fitting function fit uses the first incoming EstimationPeriod observations to estimate (tune) hyperparameters required for incremental training. Estimation occurs only when EstimationPeriod is positive. This table describes the hyperparameters and when they are estimated, or tuned.
| Hyperparameter | Model Property | Usage | Conditions |
| --- | --- | --- | --- |
| Learning rate | LearnRate field of SolverOptions | Adjust the solver step size | The hyperparameter is estimated when both of these conditions apply: |
| Kernel scale parameter | KernelScale | Set a kernel scale parameter value for random feature expansion | The hyperparameter is estimated when you set KernelScale to "auto". |
During the estimation period, fit does not fit the model. At the end of the estimation period, the function updates the properties that store the hyperparameters.
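As an illustration of how an estimation period plays out, the following sketch streams synthetic data to fit in chunks. It relies only on names mentioned on this page (incrementalOneClassSVM, fit, EstimationPeriod, KernelScale); the data, the chunk size of 100, and the period length of 500 are assumptions made for this example, and the code requires Statistics and Machine Learning Toolbox.

```matlab
% Sketch: configure an estimation period and stream data in chunks.
% The data, chunk size, and period length are made up for illustration.
rng(0);
X = randn(1000, 5);              % synthetic numeric predictor data

% Use the first 500 observations to tune hyperparameters, including the
% kernel scale, before any fitting takes place.
Mdl = incrementalOneClassSVM(EstimationPeriod=500, KernelScale="auto");

chunkSize = 100;
for start = 1:chunkSize:size(X, 1)
    idx = start:min(start + chunkSize - 1, size(X, 1));
    Mdl = fit(Mdl, X(idx, :));   % chunks 1 to 5 fall in the estimation period
end

disp(Mdl.KernelScale)            % expected to hold a numeric estimate by now
```

Because the first five chunks are consumed by the estimation period, only the remaining five chunks contribute to fitting the model in this sketch.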
Standardize Data
If incremental learning functions are configured to standardize predictor variables, they do so using the means and standard deviations stored in the Mu and Sigma properties of the incremental learning model Mdl.
When you set StandardizeData=true and a positive estimation period (see EstimationPeriod), and Mdl.Mu and Mdl.Sigma are empty, the incremental fit function estimates means and standard deviations using the estimation period observations.
When you set StandardizeData="auto" (the default), the following conditions apply:
If you create incrementalOneClassSVM by converting a traditionally trained one-class SVM model (OneClassSVM), and the Mu and Sigma properties of the model being converted are empty arrays [], incremental learning functions do not standardize predictor variables. If the Mu and Sigma properties of the model being converted are nonempty, incremental learning functions standardize the predictor variables using the specified means and standard deviations. The incremental fitting function does not estimate new means and standard deviations, regardless of the length of the estimation period.
If you do not convert a traditionally trained model, the incremental fitting function standardizes the predictor variables only when you specify an SGD solver (see Solver) and a positive estimation period (see EstimationPeriod).
When the incremental fitting function estimates predictor means and standard deviations, the function computes weighted means and weighted standard deviations using the estimation period observations. Specifically, the function standardizes predictor j (x_{j}) using
$${x}_{j}^{\ast}=\frac{{x}_{j}-{\mu}_{j}^{\ast}}{{\sigma}_{j}^{\ast}}.$$
x_{j} is predictor j, and x_{jk} is observation k of predictor j in the estimation period.
$${\mu}_{j}^{\ast}=\frac{1}{{\displaystyle \sum _{k}{w}_{k}}}{\displaystyle \sum _{k}{w}_{k}{x}_{jk}}.$$
$${\left({\sigma}_{j}^{\ast}\right)}^{2}=\frac{1}{{\displaystyle \sum _{k}{w}_{k}}}{\displaystyle \sum _{k}{w}_{k}{\left({x}_{jk}-{\mu}_{j}^{\ast}\right)}^{2}}.$$
w_{k} is observation weight k.
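The weighted standardization can be reproduced with base MATLAB operations. This is a minimal sketch of the formulas, not the toolbox implementation; the variable names (`X`, `w`, `mu`, `sigma`, `Xstd`) and the data values are hypothetical, with `mu` and `sigma` playing the role of the estimated means and standard deviations.

```matlab
% Sketch: weighted standardization of estimation-period observations.
X = [1 10; 2 20; 3 30; 4 40];    % rows are observations k, columns are predictors j
w = [1; 1; 2; 2];                % observation weights w_k

wSum = sum(w);

% Weighted mean of each predictor (1-by-p row vector).
mu = (w' * X) / wSum;

% Weighted standard deviation of each predictor (1-by-p row vector).
sigma = sqrt((w' * (X - mu).^2) / wSum);

% Standardize each predictor: subtract its weighted mean and divide by
% its weighted standard deviation.
Xstd = (X - mu) ./ sigma;
```

After this transformation, each column of `Xstd` has a weighted mean of 0 and a weighted variance of 1 with respect to the weights `w`.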
References
[1] Kempka, Michał, Wojciech Kotłowski, and Manfred K. Warmuth. “Adaptive Scale-Invariant Online Algorithms for Learning Linear Models.” Preprint, submitted February 10, 2019. https://arxiv.org/abs/1902.07528.
[2] Langford, J., L. Li, and T. Zhang. “Sparse Online Learning Via Truncated Gradient.” J. Mach. Learn. Res., Vol. 10, 2009, pp. 777–801.
[3] Le, Q., T. Sarlós, and A. Smola. “Fastfood — Approximating Kernel Expansions in Loglinear Time.” Proceedings of the 30th International Conference on Machine Learning. Vol. 28, No. 3, 2013, pp. 244–252.
[4] Rahimi, A., and B. Recht. “Random Features for Large-Scale Kernel Machines.” Advances in Neural Information Processing Systems. Vol. 20, 2008, pp. 1177–1184.
[5] Shalev-Shwartz, S., Y. Singer, and N. Srebro. “Pegasos: Primal Estimated Sub-Gradient Solver for SVM.” Proceedings of the 24th International Conference on Machine Learning, ICML ’07, 2007, pp. 807–814.
[6] Xu, Wei. “Towards Optimal One Pass Large Scale Learning with Averaged Stochastic Gradient Descent.” CoRR, abs/1107.2490, 2011.
Version History
Introduced in R2023b