# bj

Estimate Box-Jenkins polynomial model using time-domain data

## Syntax

``sys = bj(tt,[nb nc nd nf nk])``
``sys = bj(u,y,[nb nc nd nf nk])``
``sys = bj(data,[nb nc nd nf nk])``
``sys = bj(___, Name,Value)``
``sys = bj(tt, init_sys)``
``sys = bj(u,y,init_sys)``
``sys = bj(data,init_sys)``
``sys = bj(___, opt)``
``[sys,ic] = bj(___)``

## Description

Box-Jenkins (BJ) models are a special configuration of polynomial models that provide completely independent parameterization for the dynamics and noise using rational polynomial functions. BJ models, which are always discrete-time models, can be estimated only from time-domain data. Use BJ models when the noise is primarily a measurement disturbance rather than an input disturbance. The BJ structure provides additional flexibility for modeling the noise.

### Estimate Box-Jenkins Model

`sys = bj(tt,[nb nc nd nf nk])` estimates a Box-Jenkins polynomial model `sys` using the data contained in the variables of timetable `tt`. The software uses the first Nu variables as inputs and the next Ny variables as outputs, where Nu and Ny are determined from the dimensions of the specified polynomial orders.

`sys` is represented by the equation

$y\left(t\right)=\sum _{i=1}^{nu}\frac{{B}_{i}\left(q\right)}{{F}_{i}\left(q\right)}{u}_{i}\left(t-n{k}_{i}\right)+\frac{C\left(q\right)}{D\left(q\right)}e\left(t\right)$

Here, y(t) is the output, u(t) is the input, and e(t) is the error. The components of `[nb nc nd nf nk]` define the orders of the polynomials used for estimation. For more information about the Box-Jenkins model structure, see Box-Jenkins Model Structure.

To select specific input and output channels from `tt`, use name-value syntax to set `'InputName'` and `'OutputName'` to the corresponding timetable variable names.

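
A minimal sketch of the timetable workflow, assuming the `iddata1` example data set that ships with System Identification Toolbox and a release in which `bj` accepts timetables. The variable names `u` and `y` are chosen here only for illustration.

```matlab
% Build a regularly sampled timetable from the iddata1 example data.
load iddata1 z1
tt = timetable(z1.u, z1.y, 'TimeStep', seconds(z1.Ts), ...
    'VariableNames', {'u','y'});

% SISO orders imply Nu = 1 and Ny = 1, so the first variable (u) is
% used as the input and the next variable (y) as the output.
sys = bj(tt, [2 2 2 2 1]);

% The same estimation with the channels named explicitly.
sys2 = bj(tt, [2 2 2 2 1], 'InputName', "u", 'OutputName', "y");
```

Naming the channels explicitly is mainly useful when the timetable holds more variables than the model needs or when the inputs do not come first.
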
`sys = bj(u,y,[nb nc nd nf nk])` uses the time-domain input and output signals in the comma-separated matrices `u`,`y`. The software assumes that the data sample time is 1 second. To change the sample time, set `Ts` using name-value syntax.
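
A short sketch of the matrix syntax, assuming the `iddata1` example data set. The signals are extracted into plain column vectors, so the sample time has to be passed explicitly; otherwise the model would carry the assumed 1-second sample time.

```matlab
load iddata1 z1
u = z1.u;   % Ns-by-1 input samples
y = z1.y;   % Ns-by-1 output samples

% Pass the sample time of the original data (0.1 s for this data set).
sys = bj(u, y, [2 2 2 2 1], 'Ts', z1.Ts);
```
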


`sys = bj(data,[nb nc nd nf nk])` uses the time-domain data in the `iddata` object `data`. Use this syntax especially when you want to take advantage of the additional information, such as data sample time or experiment labeling, that data objects provide.

`sys = bj(___, Name,Value)` specifies model structure attributes using additional options specified by one or more name-value arguments. You can use this syntax with any of the previous input-argument combinations.

### Configure Initial Parameters

`sys = bj(tt, init_sys)` uses the polynomial model `init_sys` to configure the initial parameterization of `sys` for estimation using the timetable `tt`.

`sys = bj(u,y,init_sys)` uses the matrix data `u`,`y` for estimation.

`sys = bj(data,init_sys)` uses the data object `data` for estimation.

### Specify Additional Estimation Options


`sys = bj(___, opt)` incorporates an option set `opt` that specifies options such as handling of initial conditions, regularization, and numerical search method to use for estimation.

### Return Estimated Initial Conditions

`[sys,ic] = bj(___)` returns the estimated initial conditions as an `initialCondition` object. Use this syntax if you plan to simulate or predict the model response using the same estimation input data and then compare the response with the same estimation output data. Incorporating the initial conditions yields a better match during the first part of the simulation.

## Examples


Estimate the parameters of a single-input, single-output Box-Jenkins model from measured data.

```
load iddata1 z1;
nb = 2;
nc = 2;
nd = 2;
nf = 2;
nk = 1;
sys = bj(z1,[nb nc nd nf nk]);
```

`sys` is a discrete-time `idpoly` model with estimated coefficients. The orders of `sys` are as specified by `nb`, `nc`, `nd`, `nf`, and `nk`.

Use `getpvec` to obtain the estimated parameters and `getcov` to obtain the covariance associated with the estimated parameters.
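
As a brief sketch of that step, assuming the same `iddata1` data and orders as above, the two-output form of `getpvec` also returns the 1-standard-deviation uncertainty of each parameter.

```matlab
load iddata1 z1
sys = bj(z1, [2 2 2 2 1]);

% Parameter values and their 1-standard-deviation uncertainties.
[p, pdev] = getpvec(sys);

% Covariance matrix associated with the estimated parameters.
covP = getcov(sys);

% Show values and standard deviations side by side.
disp([p pdev])
```
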

Estimate the parameters of a multi-input, single-output Box-Jenkins model from measured data.

```
load iddata8
nb = [2 1 1];
nc = 1;
nd = 1;
nf = [2 1 2];
nk = [5 10 15];
sys = bj(z8,[nb nc nd nf nk]);
```

`sys` is a model with three inputs and one output. Each input has its own delay, given by the corresponding entry of `nk`.

Estimate a regularized BJ model by converting a regularized ARX model.

`load regularizationExampleData.mat m0simdata;`

Estimate an unregularized BJ model of order 30.

`m1 = bj(m0simdata(1:150),[15 15 15 15 1]);`

Estimate a regularized BJ model, determining the Lambda value by trial and error.

```
opt = bjOptions;
opt.Regularization.Lambda = 1;
m2 = bj(m0simdata(1:150),[15 15 15 15 1],opt);
```

Obtain a lower-order BJ model by converting a regularized ARX model followed by order reduction.

```
opt1 = arxOptions;
[L,R] = arxRegul(m0simdata(1:150),[30 30 1]);
opt1.Regularization.Lambda = L;
opt1.Regularization.R = R;
m0 = arx(m0simdata(1:150),[30 30 1],opt1);
mr = idpoly(balred(idss(m0),7));
```

Compare the model outputs against data.

```
opt2 = compareOptions('InitialCondition','z');
compare(m0simdata(150:end),m1,m2,mr,opt2);
```

Estimate the parameters of a single-input, single-output Box-Jenkins model while configuring some estimation options.

Generate estimation data.

```
B = [0 1 0.5];
C = [1 -1 0.2];
D = [1 1.5 0.7];
F = [1 -1.5 0.7];
% The sixth argument of idpoly is the noise variance.
sys0 = idpoly(1,B,C,D,F,0.1);
e = iddata([],randn(200,1));
u = iddata([],idinput(200));
y = sim(sys0,[u e]);
data = [y u];
```

`data` is a single-input, single-output data set created by simulating a known model.

Estimate an initial Box-Jenkins model.

```
nb = 2;
nc = 2;
nd = 2;
nf = 2;
nk = 1;
init_sys = bj(data,[nb nc nd nf nk]);
```

Create an estimation option set to refine the parameters of the estimated model.

```
opt = bjOptions;
opt.Display = 'on';
opt.SearchOptions.MaxIterations = 50;
```

`opt` is an estimation option set that configures the estimation to iterate 50 times at most and display the estimation progress.

Reestimate the model parameters using the estimation option set.

`sys = bj(data,init_sys,opt);`

`sys` is estimated using `init_sys` as the initial parameterization of the polynomial coefficients.

To view the estimation result, enter `sys.Report`.

Estimate a multi-input, multi-output Box-Jenkins model from measured data.

```
load iddata1 z1
load iddata2 z2
data = [z1 z2(1:300)];
```

`data` contains the measured data for two inputs and two outputs.

Estimate the model.

```
nb = [2 2; 3 4];
nc = [2;2];
nd = [2;2];
nf = [1 0; 2 2];
nk = [1 1; 0 0];
sys = bj(data,[nb nc nd nf nk]);
```

The polynomial order matrices contain one row for each output.

`sys` is a discrete-time `idpoly` model with two inputs and two outputs.

Load the estimation data.

`load iddata1ic z1i`

Estimate a second-order Box-Jenkins model `sys` and return the initial conditions in `ic`.

```
nb = 2;
nc = 2;
nd = 2;
nf = 2;
nk = 1;
[sys,ic] = bj(z1i,[nb nc nd nf nk]);
ic
```

```
ic = 
  initialCondition with properties:

     A: [4x4 double]
    X0: [4x1 double]
     C: [0.8744 0.5426 0.4647 -0.5285]
    Ts: 0.1000
```

`ic` is an `initialCondition` object that encapsulates the free response of `sys`, in state-space form, to the initial state vector in `X0`. You can incorporate `ic` when you simulate `sys` with the `z1i` input signal and compare the response with the `z1i` output signal.

## Input Arguments


Estimation data, specified as a `timetable` that uses a regularly spaced time vector. `tt` contains variables representing input and output channels. For multiexperiment data, `tt` is a cell array of timetables of length `Ne`, where `Ne` is the number of experiments.

The software determines the number of input and output channels to use for estimation from the dimensions of the specified polynomial orders. The input/output channel selection depends on whether the `'InputName'` and `'OutputName'` name-value arguments are specified.

• If `'InputName'` and `'OutputName'` are not specified, then the software uses the first Nu variables of `tt` as inputs and the next Ny variables of `tt` as outputs.

• If `'InputName'` and `'OutputName'` are specified, then the software uses the specified variables. The number of specified input and output names must be consistent with Nu and Ny.

• For functions that can estimate a time series model, where there are no inputs, `'InputName'` does not need to be specified.

For more information about working with estimation data types, see Data Types in System Identification Toolbox.

Estimation data, specified for SISO systems as a comma-separated pair of Ns-by-1 real-valued matrices that contain uniformly sampled input and output time-domain signal values. Here, Ns is the number of samples.

For MIMO systems, specify `u`,`y` as an input/output matrix pair with the following dimensions:

• `u`: Ns-by-Nu, where Nu is the number of inputs.

• `y`: Ns-by-Ny, where Ny is the number of outputs.

For multiexperiment data, specify `u`,`y` as a pair of 1-by-Ne cell arrays, where Ne is the number of experiments. The sample times of all the experiments must match.

For time series data, which contains only outputs and no inputs, specify `y` only.

For more information about working with estimation data types, see Data Types in System Identification Toolbox.

Estimation data, specified as an `iddata` object that contains time-domain input and output signal values.

You cannot use frequency-domain data for estimating BJ models.

Vector of matrices with nonnegative integers that contain the orders and delays of the Box-Jenkins model, as the following list describes.

• `nb`: Order of the B polynomial + 1, specified as an Ny-by-Nu matrix, where Ny is the number of outputs and Nu is the number of inputs.

• `nc`: Order of the C polynomial, specified as an Ny-by-1 matrix.

• `nd`: Order of the D polynomial, specified as an Ny-by-1 matrix.

• `nf`: Order of the F polynomial, specified as an Ny-by-Nu matrix.

• `nk`: Input delay in units of samples, specified as an Ny-by-Nu matrix.

Polynomial model that configures the initial parameterization of `sys`, specified as an `idpoly` model with the Box-Jenkins structure that has only B, C, D and F polynomials active.

`bj` uses the parameters and constraints defined in `init_sys` as the initial guess for estimating `sys`.

Use the `Structure` property of `init_sys` to configure initial guesses and constraints for B(q), F(q), C(q) and D(q).

To specify an initial guess for, say, the C(q) term of `init_sys`, set `init_sys.Structure.C.Value` to the guess value.

To specify constraints for, say, the B(q) term of `init_sys`:

• set `init_sys.Structure.B.Minimum` to the minimum B(q) coefficient values

• set `init_sys.Structure.B.Maximum` to the maximum B(q) coefficient values

• set `init_sys.Structure.B.Free` to indicate which B(q) coefficients are free for estimation

You can similarly specify the initial guess and constraints for the other polynomials.
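
The following sketch puts these settings together. It assumes the `iddata1` example data; the starting coefficients and the bounds are hypothetical values chosen only to illustrate the syntax.

```matlab
load iddata1 z1

% Hypothetical starting polynomials (illustrative values only).
B = [0 1 0.5];
C = [1 -1 0.2];
D = [1 1.5 0.7];
F = [1 -1.5 0.7];
init_sys = idpoly(1, B, C, D, F, 1, z1.Ts);   % noise variance 1

% Initial guess for C(q).
init_sys.Structure.C.Value = [1 -0.8 0.1];

% Bounds on the B(q) coefficients that follow the leading zero.
init_sys.Structure.B.Minimum = [-Inf -2 -2];
init_sys.Structure.B.Maximum = [Inf 2 2];

% Keep the last B(q) coefficient fixed at its initial value of 0.5.
init_sys.Structure.B.Free = [false true false];

sys = bj(z1, init_sys);
```

Because the third coefficient of B(q) is marked as not free, it should keep its initial value in the estimated model while the remaining coefficients are updated.
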

Estimation options, specified as a `bjOptions` option set. Options specified by `opt` include:

• Estimation objective

• Handling of initial conditions

• Numerical search method and the associated options

### Name-Value Arguments

Specify optional pairs of arguments as `Name1=Value1,...,NameN=ValueN`, where `Name` is the argument name and `Value` is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose `Name` in quotes.

Example: `'InputDelay',1`

Input channel names for timetable data, specified as a string, a character vector, or an array or cell array of strings or character vectors. By default, the software interprets all but the last variable in `tt` as input channels. When you want to select a subset of the timetable variables as input channels, use `'InputName'` to identify them. For example, `sys = bj(tt,___,'InputName',["u1" "u2"])` selects the variables `u1` and `u2` as the input channels for the estimation.

Output channel names for timetable data, specified as a string, a character vector, or an array or cell array of strings or character vectors. By default, the software interprets the last variable in `tt` as the sole output channel. When you want to select a subset of the timetable variables as output channels, use `'OutputName'` to identify them. For example, `sys = bj(tt,___,'OutputName',["y1" "y3"])` selects the variables `y1` and `y3` as the output channels for the estimation.

Sample time, specified as the comma-separated pair consisting of `'Ts'` and the sample time in seconds. When you use matrix-based data (`u`,`y`), you must specify `'Ts'` if you require a sample time other than the assumed sample time of 1 second.

To obtain the data sample time for a timetable `tt`, use the timetable property `tt.Properties.Timestep`.

Example: `bj(umat1,ymat1,___,'Ts',0.08)` computes a model with sample time of 0.08 seconds.

Input delays for each input channel, specified as the comma-separated pair consisting of `'InputDelay'` and a numeric vector.

• For continuous-time models, specify `'InputDelay'` in the time units stored in the `TimeUnit` property.

• For discrete-time models, specify `'InputDelay'` in integer multiples of the sample time `Ts`. For example, setting `'InputDelay'` to `3` specifies a delay of three sampling periods.

For a system with Nu inputs, set `InputDelay` to an Nu-by-1 vector. Each entry of this vector is a numerical value that represents the input delay for the corresponding input channel.

To apply the same delay to all channels, specify `'InputDelay'` as a scalar.


Transport delays for each input/output pair, specified as the comma-separated pair consisting of `'IODelay'` and a numeric array.

• For continuous-time models, specify `'IODelay'` in the time units stored in the `TimeUnit` property.

• For discrete-time models, specify `'IODelay'` in integer multiples of the sample time `Ts`. For example, setting `'IODelay'` to `4` specifies a transport delay of four sampling periods.

For a system with Nu inputs and Ny outputs, set `'IODelay'` to an Ny-by-Nu matrix. Each entry is an integer value representing the transport delay for the corresponding input/output pair.

To apply the same delay to all channels, specify `'IODelay'` as a scalar.

You can specify `'IODelay'` as an alternative to the `nk` value. Doing so simplifies the model structure by reducing the number of leading zeros in the B polynomial. In particular, you can represent `max(nk-1,0)` leading zeros as input/output delays using `'IODelay'` instead.
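
A small sketch of the two parameterizations, assuming the `iddata1` example data. A delay of three samples is expressed once through `nk` alone and once as `nk = 1` plus an `'IODelay'` of two samples.

```matlab
load iddata1 z1

% Three-sample delay carried entirely by leading zeros in B.
sysNk = bj(z1, [2 2 2 2 3]);

% Same total delay: one sample in B, two samples as transport delay.
sysIO = bj(z1, [2 2 2 2 1], 'IODelay', 2);

% Compare the stored B polynomials and the transport delay.
sysNk.B
sysIO.B
sysIO.IODelay
```

The second model stores a shorter B polynomial because `max(nk-1,0) = 2` of the leading zeros have moved into the `IODelay` property.
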

Flag to use integrators in the noise channels, specified as `false(Ny,1)`, a logical scalar, or a logical vector of length Ny, where Ny is the number of outputs.

Setting `IntegrateNoise` to `true` for a particular output results in the model:

$y\left(t\right)=\frac{B\left(q\right)}{F\left(q\right)}u\left(t-nk\right)+\frac{C\left(q\right)}{D\left(q\right)}\frac{e\left(t\right)}{1-{q}^{-1}}$

Here, $\frac{1}{1-{q}^{-1}}$ is the integrator in the noise channel, e(t).
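
A minimal sketch of enabling the noise integrator, assuming the `iddata1` example data. This data set is used only to show the syntax; whether an integrator is appropriate depends on the disturbance in your own data.

```matlab
load iddata1 z1

% Request an integrator in the noise channel of the single output.
sys = bj(z1, [2 2 2 2 1], 'IntegrateNoise', true);

% The setting is stored as a property of the estimated model.
sys.IntegrateNoise
```
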

## Output Arguments


Box-Jenkins polynomial model that fits the estimation data, returned as a discrete-time `idpoly` object. This model is created using the specified model orders, delays, and estimation options.

Information about the estimation results and options used is stored in the `Report` property of the model. `Report` has the following fields:

• `Status`: Summary of the model status, which indicates whether the model was created by construction or obtained by estimation.

• `Method`: Estimation command used.

• `InitialCondition`: Handling of initial conditions during model estimation, returned as `'zero'` (the initial conditions were set to zero), `'estimate'` (the initial conditions were treated as independent estimation parameters), or `'backcast'` (the initial conditions were estimated using the best least squares fit). This field is especially useful to view how the initial conditions were handled when the `InitialCondition` option in the estimation option set is `'auto'`.

• `Fit`: Quantitative assessment of the estimation, returned as a structure with the fields listed in the following table. See Loss Function and Model Quality Metrics for more information on these quality metrics.

| Field | Description |
| --- | --- |
| `FitPercent` | Normalized root mean squared error (NRMSE) measure of how well the response of the model fits the estimation data, expressed as the percentage fitpercent = 100(1-NRMSE). |
| `LossFcn` | Value of the loss function when the estimation completes. |
| `MSE` | Mean squared error (MSE) measure of how well the response of the model fits the estimation data. |
| `FPE` | Final prediction error for the model. |
| `AIC` | Raw Akaike Information Criteria (AIC) measure of model quality. |
| `AICc` | Small-sample-size corrected AIC. |
| `nAIC` | Normalized AIC. |
| `BIC` | Bayesian Information Criteria (BIC). |

• `Parameters`: Estimated values of model parameters.

• `OptionsUsed`: Option set used for estimation. If no custom options were configured, this is a set of default options. See `bjOptions` for more information.

• `RandState`: State of the random number stream at the start of estimation. Empty, `[]`, if randomization was not used during estimation. For more information, see `rng`.

• `DataUsed`: Attributes of the data used for estimation, returned as a structure with the fields listed in the following table.

| Field | Description |
| --- | --- |
| `Name` | Name of the data set. |
| `Type` | Data type. |
| `Length` | Number of data samples. |
| `Ts` | Sample time. |
| `InterSample` | Input intersample behavior, returned as `'zoh'` (zero-order hold maintains a piecewise-constant input signal between samples), `'foh'` (first-order hold maintains a piecewise-linear input signal between samples), or `'bl'` (band-limited behavior specifies that the continuous-time input signal has zero power above the Nyquist frequency). |
| `InputOffset` | Offset removed from time-domain input data during estimation. For nonlinear models, it is `[]`. |
| `OutputOffset` | Offset removed from time-domain output data during estimation. For nonlinear models, it is `[]`. |

• `Termination`: Termination conditions for the iterative search used for prediction error minimization, returned as a structure with the fields listed in the following table. For estimation methods that do not require numerical search optimization, the `Termination` field is omitted.

| Field | Description |
| --- | --- |
| `WhyStop` | Reason for terminating the numerical search. |
| `Iterations` | Number of search iterations performed by the estimation algorithm. |
| `FirstOrderOptimality` | $\infty$-norm of the gradient search vector when the search algorithm terminates. |
| `FcnCount` | Number of times the objective function was called. |
| `UpdateNorm` | Norm of the gradient search vector in the last iteration. Omitted when the search method is `'lsqnonlin'` or `'fmincon'`. |
| `LastImprovement` | Criterion improvement in the last iteration, expressed as a percentage. Omitted when the search method is `'lsqnonlin'` or `'fmincon'`. |
| `Algorithm` | Algorithm used by `'lsqnonlin'` or `'fmincon'` search method. Omitted when other search methods are used. |

For more information on using `Report`, see Estimation Report.
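
A brief sketch of reading a few of these fields after an estimation, assuming the `iddata1` example data. Only fields described in this section are accessed.

```matlab
load iddata1 z1
sys = bj(z1, [2 2 2 2 1]);

rpt = sys.Report;

% Fit quality metrics.
fprintf('Fit to estimation data: %.2f%%\n', rpt.Fit.FitPercent);
fprintf('MSE: %g, FPE: %g\n', rpt.Fit.MSE, rpt.Fit.FPE);

% How the numerical search ended.
fprintf('Stopped after %d iterations: %s\n', ...
    rpt.Termination.Iterations, rpt.Termination.WhyStop);
```
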

Estimated initial conditions, returned as an `initialCondition` object or an object array of `initialCondition` values.

• For a single-experiment data set, `ic` represents, in state-space form, the free response of the transfer function model (A and C matrices) to the estimated initial states (x0).

• For a multiple-experiment data set with Ne experiments, `ic` is an object array of length Ne that contains one set of `initialCondition` values for each experiment.

If `bj` returns `ic` values of `0` and you know that you have nonzero initial conditions, set the `'InitialCondition'` option in `bjOptions` to `'estimate'` and pass the updated option set to `bj`. For example:

```
opt = bjOptions('InitialCondition','estimate')
[sys,ic] = bj(data,[nb nc nd nf nk],opt)
```

The default `'auto'` setting of `'InitialCondition'` uses the `'zero'` method when the initial conditions have a negligible effect on the overall estimation-error minimization process. Specifying `'estimate'` ensures that the software estimates values for `ic`.

For more information, see `initialCondition`. For an example of using this argument, see Obtain Initial Conditions.


### Box-Jenkins Model Structure

The general Box-Jenkins model structure is:

$y\left(t\right)=\sum _{i=1}^{nu}\frac{{B}_{i}\left(q\right)}{{F}_{i}\left(q\right)}{u}_{i}\left(t-n{k}_{i}\right)+\frac{C\left(q\right)}{D\left(q\right)}e\left(t\right)$

where nu is the number of input channels.
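
For a single input, this structure can be illustrated with two digital filters: one for the dynamics and one for the noise. The following is a minimal sketch that assumes only the base MATLAB `filter` function and reuses the coefficient vectors from the estimation-options example; the leading zero in `B` plays the role of the delay `nk = 1`. It illustrates the equation and is not a description of how `bj` or `sim` is implemented.

```matlab
rng(0);                      % reproducible random signals
N = 200;
u = sign(randn(N,1));        % random binary input
e = 0.1*randn(N,1);          % white-noise disturbance source

B = [0 1 0.5];   F = [1 -1.5 0.7];   % dynamics B(q)/F(q), one-sample delay
C = [1 -1 0.2];  D = [1 1.5 0.7];    % noise model C(q)/D(q)

yDet = filter(B, F, u);      % deterministic part: (B/F) u
v    = filter(C, D, e);      % colored noise: (C/D) e
y    = yDet + v;             % Box-Jenkins output
```

The dynamics and the noise pass through separate filters with no shared denominator, which is the independent parameterization that distinguishes the BJ structure.
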

The orders of Box-Jenkins model are defined as follows:
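
A sketch of the polynomial definitions for one input-output pair, assuming the convention implied by the example coefficient vectors on this page (`B = [0 1 0.5]` with `nb = 2` and `nk = 1`; `C = [1 -1 0.2]` with `nc = 2`). The coefficient names $b_k$, $c_k$, $d_k$, and $f_k$ are introduced here for illustration.

```latex
\[
\begin{aligned}
nb:&\quad B(q) = b_1 + b_2 q^{-1} + \dots + b_{nb}\, q^{-nb+1} \\
nc:&\quad C(q) = 1 + c_1 q^{-1} + \dots + c_{nc}\, q^{-nc} \\
nd:&\quad D(q) = 1 + d_1 q^{-1} + \dots + d_{nd}\, q^{-nd} \\
nf:&\quad F(q) = 1 + f_1 q^{-1} + \dots + f_{nf}\, q^{-nf}
\end{aligned}
\]
```

Under this convention, `nb` counts the coefficients of B(q), while `nc`, `nd`, and `nf` count the coefficients that follow the leading 1 of each monic polynomial.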

## Alternatives

To estimate a continuous-time model, use:

## References

[1] Ljung, L. System Identification: Theory for the User, Upper Saddle River, NJ, Prentice-Hall PTR, 1999.

## Version History

Introduced before R2006a
