# egcitest

Engle-Granger cointegration test

## Description

h = egcitest(Y) returns the rejection decision h from conducting the Engle-Granger cointegration test for assessing the null hypothesis of no cointegration among the variables in the multivariate time series Y. egcitest forms test statistics by regressing the response data Y(:,1) onto the predictor data Y(:,2:end), and then tests the residuals for a unit root.

[h,pValue,stat,cValue] = egcitest(Y) also returns the p-value pValue, test statistic stat, and critical value cValue of the test.

StatTbl = egcitest(Tbl) returns the table StatTbl containing variables for the test results, statistics, and settings from conducting the Engle-Granger cointegration test on the variables of the table or timetable Tbl.

The response variable in the regression is the first table variable, and all other variables are the predictor variables. To select a different response variable for the regression, use the ResponseVariable name-value argument. To select different predictor variables, use the PredictorVariables name-value argument.

[___] = egcitest(___,Name=Value) specifies options using one or more name-value arguments in addition to any of the input argument combinations in previous syntaxes. egcitest returns the output argument combination for the corresponding input arguments.

Some options control the number of tests to conduct. The following conditions apply when egcitest conducts multiple tests:

• egcitest treats each test as separate from all other tests.

• If you specify Y, all outputs are vectors.

• If you specify Tbl, each row of StatTbl contains the results of the corresponding test.

For example, egcitest(Tbl,ResponseVariable="GDP",Alpha=0.025,Lags=[0 1]) chooses GDP as the response variable from the table Tbl and conducts two tests at a level of significance of 0.025. The first test includes 0 lag in the residual regression, and the second test includes 1 lag in the residual regression.
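
The multiple-test behavior with matrix input can be sketched as follows. This is a minimal illustration, assuming the Data_Canada.mat data set used in the examples on this page, with the interest rate series in columns 3 through 5 of Data; the variable name numTests is illustrative only.

```matlab
% Load the example data set (provides the numeric matrix Data)
load Data_Canada
Y = Data(:,3:end);   % interest rate series

% Two tests at the 0.025 level: 0 lags in the first, 1 lag in the second.
% The scalar Alpha applies to both tests.
[h,pValue,stat,cValue] = egcitest(Y,Alpha=0.025,Lags=[0 1]);

% With matrix input, each output has one element per test
numTests = numel(h);
```

Because Lags is the only vector-valued setting here, it alone determines the number of tests.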

[___,reg1,reg2] = egcitest(___) additionally returns the following structures of regression statistics, which are required to form the test statistic:

• reg1 – Statistics resulting from the cointegrating regression of the specified response variable ResponseVariable onto the specified predictor variables PredictorVariables

• reg2 – Statistics resulting from the residual regression implemented by the specified unit root test RReg
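
As a rough sketch of this syntax with matrix input, the regression structures are the fifth and sixth outputs. The example assumes the Data_Canada.mat data set used elsewhere on this page; the fields accessed (names, coeff, size) are those listed for reg1 under Output Arguments.

```matlab
% Load the example data set (provides the numeric matrix Data)
load Data_Canada
Y = Data(:,3:end);   % interest rate series

% Request the regression structures after the four test outputs
[h,pValue,stat,cValue,reg1,reg2] = egcitest(Y);

% Cointegrating regression: coefficient names and estimates
reg1.names
reg1.coeff

% Residual regression: effective sample size after lags and differencing
reg2.size
```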

## Examples

Test a multivariate time series for cointegration using the default values of the Engle-Granger cointegration test. Input the time series data as a numeric matrix.

Load data of Canadian inflation and interest rates Data_Canada.mat, which contains the series in the matrix Data.

load Data_Canada
series'
ans = 5x1 cell
{'(INF_C) Inflation rate (CPI-based)'         }
{'(INF_G) Inflation rate (GDP deflator-based)'}
{'(INT_S) Interest rate (short-term)'         }
{'(INT_M) Interest rate (medium-term)'        }
{'(INT_L) Interest rate (long-term)'          }

Test the interest rate series for cointegration by using the Engle-Granger cointegration test. Use default options and return the rejection decision.

h = egcitest(Data(:,3:end))
h = logical
0

egcitest uses the $\tau$ test by default, and it fails to reject the null hypothesis (h = 0) of no cointegration among the interest rate series.

Test the interest rate series for cointegration by using the Engle-Granger cointegration test. Use default options and return the rejection decision, $\mathit{p}$-value, $\tau$-test statistic, and critical value.

[h,pValue,stat,cValue] = egcitest(Data(:,3:end))
h = logical
0

pValue = 0.0526
stat = -3.9321
cValue = -3.9563

Conduct the Engle-Granger cointegration test on a multivariate time series using default options, which use the first table variable as the response and all other table variables as predictors, and include a constant term in the cointegrating regression. Return a table of test results.

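Load data of Canadian inflation and interest rates Data_Canada.mat. Convert the table DataTable to a timetable.

load Data_Canada
dates = datetime(dates,12,31);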
TT = table2timetable(DataTable,RowTimes=dates);
TT.Observations = [];

Conduct the Engle-Granger cointegration test by passing the timetable to egcitest and using default options. For the cointegrating regression, egcitest uses the CPI-based inflation rate as the response variable and all other variables in the timetable as predictors.

StatTbl = egcitest(TT)
StatTbl=1×9 table
            h       pValue       stat      cValue     Lags    Alpha     Test     CReg      RReg
          _____    _________    _______    _______    ____    _____    ______    _____    _______

Test 1    true     0.0023851    -6.2491    -4.7673     0      0.05     {'t1'}    {'c'}    {'ADF'}

StatTbl is a table of test results. Each row corresponds to a test (here, a single test), and the columns contain the rejection decision, the corresponding $\mathit{p}$-value, the test statistic, the critical value, and the specified test options. In this case, the test rejects the null hypothesis in favor of the alternative of cointegration among all the table variables.

By default, egcitest includes all input table variables in the cointegration test. To select a response variable for the cointegrating regression, set the ResponseVariable option. To select predictor variables, set the PredictorVariables option.

Load data of Canadian inflation and interest rates Data_Canada.mat. Convert the table DataTable to a timetable of the interest rate series only.

load Data_Canada
dates = datetime(dates,12,31);
idxINT = contains(DataTable.Properties.VariableNames,"INT");

TT = table2timetable(DataTable(:,idxINT),RowTimes=dates);
TT.Observations = [];

Plot the interest rate series.

figure
plot(TT.Time,TT.Variables)
legend(series(idxINT),Location="northwest")
grid on

Reproduce row 1 of Table II in [3] by testing for cointegration, specifying the default variable assignments for the cointegrating regression and deterministic terms (response variable ${\mathit{y}}_{1}$ is INT_S, the other interest rates ${\mathit{y}}_{2}$ and ${\mathit{y}}_{3}$ are predictors, and the model has a constant $\mathit{c}$), and specifying the $\tau$ and $\mathit{z}$ tests. Return the cointegrating regression statistics.

[StatTbl,reg] = egcitest(TT,Test=["t1" "t2"]);
StatTbl
StatTbl=2×9 table
            h       pValue      stat      cValue     Lags    Alpha     Test     CReg      RReg
          _____    ________    _______    _______    ____    _____    ______    _____    _______

Test 1    false    0.052627    -3.9321    -3.9563     0      0.05     {'t1'}    {'c'}    {'ADF'}
Test 2    true     0.020157    -25.454    -22.115     0      0.05     {'t2'}    {'c'}    {'ADF'}

The $\tau$ test (Test 1) fails to reject the null hypothesis, but the $\mathit{z}$ test (Test 2) rejects the null hypothesis in favor of the presence of cointegration.

Plot the estimated cointegrating relation using the regression statistics from the $\mathit{z}$ test ${\mathit{y}}_{1}-\left[\begin{array}{cc}{\mathit{y}}_{2}& {\mathit{y}}_{3}\end{array}\right]\left[\begin{array}{c}{\mathit{b}}_{1}\\ {\mathit{b}}_{2}\end{array}\right]-\mathit{Xa}$, where $\mathit{Xa}=\mathit{c}$.

c = reg(2).coeff(1);
b = reg(2).coeff(2:3);

figure
plot(TT.Time,TT.Variables*[1; -b] - c)
grid on

## Input Arguments

Y – Data representing observations of a multivariate time series $y_t$, specified as a numObs-by-numDims numeric matrix. Each column of Y corresponds to a variable, and each row corresponds to an observation. The test regresses the response variable Y(:,1) on the predictor variables Y(:,2:end).

Data Types: double

Tbl – Data representing observations of a multivariate time series $y_t$, specified as a table or timetable with numObs rows. Each row of Tbl is an observation.

The test regresses the response variable, which is the first variable in Tbl, on the predictor variables, which are all other variables in Tbl. To select a different response variable for the regression, use the ResponseVariable name-value argument. To select different predictor variables, use the PredictorVariables name-value argument. The selected variables must be numeric.

Note

egcitest removes, from the specified data, all observations (rows) containing at least one missing value, represented by NaN.

### Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.
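
The two calling styles can be compared in a short sketch. It assumes the Data_Canada.mat data set used in the examples on this page; the variable names h1 and h2 are illustrative, and the two calls are expected to be equivalent.

```matlab
% Load the example data set (provides the numeric matrix Data)
load Data_Canada
Y = Data(:,3:end);   % interest rate series

% R2021a and later: Name=Value syntax
h1 = egcitest(Y,Lags=1,Test="t2");

% Before R2021a: comma-separated pairs with the name in quotes
h2 = egcitest(Y,'Lags',1,'Test','t2');

% Both calls specify the same test
sameDecision = isequal(h1,h2);
```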

Example: egcitest(Tbl,ResponseVariable="GDP",Alpha=0.025,Lags=[0 1]) chooses GDP as the response variable from the table Tbl and conducts two tests at a level of significance of 0.025. The first test includes 0 lag in the residual regression, and the second test includes 1 lag in the residual regression.

CReg – Cointegrating regression form, specified as the name of a form, or a string vector or cell vector of form names.

In general, cointegrating regression is

${y}_{1}=Xa+{Y}_{2}b+\epsilon$

where $y_1$ is the response variable, $Y_2$ contains the predictor variables, and $X$ is a design matrix for the optional deterministic terms (a constant, linear time trend, and quadratic time trend) with coefficients $a$. This table contains the supported forms and their names.

| Form Name | Description |
| --- | --- |
| "nc" | The regression does not include X; no constant or trends. |
| "c" | X contains a variable for the constant, but not for the trends. |
| "ct" | X contains variables for the constant and the linear time trend. |
| "ctt" | X contains variables for the constant, linear time trend, and quadratic time trend. |

egcitest conducts a separate test for each form name in CReg.

Example: CReg=["ct" "ctt"] includes constant and linear time trend terms in the cointegrating regression for the first test, and then includes all three deterministic terms in the cointegrating regression for the second test.

Data Types: char | string | cell

CVec – Cointegrating-regression coefficient equality constraints, specified as the numeric vector [a; b] or cell vector of such numeric vectors.

a contains the equality constraints of the deterministic terms in the cointegrating regression. The length of a is 0, 1, 2, or 3, depending on the corresponding value of the CReg name-value argument. The order of the coefficients in a is constant, linear trend, and quadratic trend.

b contains the numDims − 1 equality constraints for the coefficient of the corresponding predictor variable in Y2.

Specify NaN entries to estimate the corresponding coefficient in the regression.

When CVec is completely specified (does not contain any NaN values), egcitest does not perform the cointegrating regression.

By default, CVec is a completely unspecified cointegrating vector (completely composed of NaN values). Consequently, egcitest estimates all coefficients.

egcitest conducts a separate test for each set of equality constraints in CVec.

Example: egcitest(Tbl,CVec=[2 NaN NaN]) fixes the constant in the cointegrating regression to 2 and estimates the coefficients of the two predictor variables in Tbl.

Example: egcitest(Tbl,CVec={[2 NaN NaN]; nan(3,1)}), for the first test, fixes the constant in the cointegrating regression to 2 and estimates the coefficients of the two predictor variables in Tbl, and for the second test, estimates all coefficients.

Example: egcitest(Tbl,CReg="ctt",CVec=[2 0.5 0.25 NaN NaN]) fixes the constant to 2, the linear trend to 0.5, and the quadratic trend to 0.25, and estimates the coefficients of the two predictor variables in Tbl.

Data Types: double | cell
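
A partially constrained cointegrating vector can be sketched as follows. The example assumes the Data_Canada.mat data set used on this page and the default CReg="c", so that a has length 1 and b has length numDims − 1 = 2. The constraint value 0 is arbitrary and chosen only for illustration.

```matlab
% Load the example data set (provides the numeric matrix Data)
load Data_Canada
Y = Data(:,3:end);   % response INT_S; predictors INT_M and INT_L

% Test 1 estimates all coefficients (all NaN).
% Test 2 fixes the constant at 0 (illustrative value) and estimates b.
CVec = {nan(3,1); [0; NaN; NaN]};
[h,pValue,~,~,reg1] = egcitest(Y,CVec=CVec);

% Inspect the cointegrating regression coefficients of each test
reg1(1).coeff
reg1(2).coeff
```

Each cell of CVec defines one test, so the outputs contain one element per cell.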

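RReg – Residual regression form, specified as the name of a form, or a string vector or cell vector of form names. This table contains the supported forms and their names.

| Form Name | Description |
| --- | --- |
| "adf" | Augmented Dickey-Fuller test (adftest) of residuals from the cointegrating regression |
| "pp" | Phillips-Perron test (pptest) of residuals from the cointegrating regression |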

egcitest computes test statistics by calling adftest and pptest with the setting Model="AR". This setting requires residuals from appropriately demeaned and detrended data, which is specified by the cointegrating-regression form CReg.

egcitest conducts a separate test for each form name in RReg.

Example: RReg=["adf" "pp"] performs the augmented Dickey-Fuller test for the residual regression of the first test, and then performs the Phillips-Perron test for the residual regression of the second test.

Data Types: char | string | cell

Lags – Number of lags in the residual regression, specified as a nonnegative integer or vector of nonnegative integers. The meaning of Lags depends on the value of the RReg name-value argument. For more details, see the Lags argument of the adftest and pptest functions.

egcitest conducts a separate test for each element in Lags.

Example: Lags=[0 1] includes no lags in the residual regression for the first test, and then includes one lag for the residual regression for the second test.

Data Types: double

Test – Test statistic type from the residual regression, specified as a test name, or a string vector or cell vector of test names. This table contains the supported test names.

| Test Name | Description |
| --- | --- |
| "t1" | τ test |
| "t2" | z test |

For more details, see the Test argument of the adftest and pptest functions.

egcitest conducts a separate test for each element in Test.

Example: Test=["t1" "t2"] computes the τ test from the residual regression for the first test, and then computes the z test from the residual regression for the second test.

Data Types: char | cell | string

Alpha – Nominal significance level for the hypothesis test, specified as a numeric scalar between 0.001 and 0.999 or a numeric vector of such values.

egcitest conducts a separate test for each value in Alpha.

Example: Alpha=[0.01 0.05] uses a level of significance of 0.01 for the first test, and then uses a level of significance of 0.05 for the second test.

Data Types: double

ResponseVariable – Variable in Tbl to use for the response in the cointegrating regression, specified as a string vector or cell vector of character vectors containing variable names in Tbl.Properties.VariableNames, or an integer or logical vector representing the indices of names. The selected variables must be numeric.

egcitest uses the same specified response variable for all tests.

Example: ResponseVariable="GDP"

Data Types: double | logical | char | cell | string

PredictorVariables – Variables in Tbl to use for the predictors in the cointegrating regression, specified as a string vector or cell vector of character vectors containing variable names in Tbl.Properties.VariableNames, or an integer or logical vector representing the indices of names. The selected variables must be numeric.

egcitest uses the same specified predictors for all tests.

By default, egcitest uses all variables in Tbl that are not specified by the ResponseVariable name-value argument.

Example: PredictorVariables=["UN" "CPI"]

Example: PredictorVariables=[true true false false] or PredictorVariables=[1 2] selects the first and second table variables.
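
As an illustrative sketch, variable selection on a timetable might look like the following. It assumes the Data_Canada.mat data set and the timetable preparation used in the examples on this page, with variable names INT_S, INT_M, and INT_L taken from the series labels shown there. The choice of INT_L as the response is arbitrary.

```matlab
% Load the example data set and build the timetable as in the examples
load Data_Canada
dates = datetime(dates,12,31);
TT = table2timetable(DataTable,RowTimes=dates);
TT.Observations = [];

% Regress the long-term rate on the short- and medium-term rates.
% The inflation series stay in TT but are excluded from the test.
StatTbl = egcitest(TT,ResponseVariable="INT_L", ...
    PredictorVariables=["INT_S" "INT_M"]);
```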

Data Types: double | logical | char | cell | string

Note

• When egcitest conducts multiple tests, the function applies all single settings (scalars or character vectors) to each test.

• All vector-valued specifications that control the number of tests must have equal length.

• If you specify the matrix Y and any value is a row vector, all outputs are row vectors.

• A lagged and differenced time series has a reduced sample size. Absent presample values, if the test series $y_t$ is defined for $t = 1,\ldots,T$, the lagged series $y_{t-k}$ is defined for $t = k+1,\ldots,T$. The first difference applied to the lagged series $y_{t-k}$ further reduces the time base to $k+2,\ldots,T$. With $p$ lagged differences, the common time base is $p+2,\ldots,T$ and the effective sample size is $T-(p+1)$.
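
The sample-size arithmetic in the last note can be checked with a short calculation. The values of T and p below are illustrative assumptions, not defaults of egcitest.

```matlab
T = 100;   % length of the test series (illustrative)
p = 3;     % number of lagged differences (illustrative)

% Common time base after lagging and differencing: p+2,...,T
timeBase = (p + 2):T;

% Effective sample size, which equals T - (p + 1)
effectiveSize = numel(timeBase);
```

With these values, the time base runs from 5 through 100, so the effective sample size is 96.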

## Output Arguments

h – Test rejection decisions, returned as a logical scalar or vector with length equal to the number of tests. egcitest returns h when you supply the input Y.

• Values of 1 indicate rejection of the null hypothesis in favor of the alternative of cointegration.

• Values of 0 indicate failure to reject the null hypothesis.

pValue – Test statistic p-values, returned as a numeric scalar or vector with length equal to the number of tests. egcitest returns pValue when you supply the input Y.

The p-values are left-tailed probabilities.

stat – Test statistics, returned as a numeric scalar or vector with length equal to the number of tests. egcitest returns stat when you supply the input Y.

The RReg and Test settings of a particular test determine the test statistic. For more details, see adftest and pptest.

cValue – Critical values, returned as a numeric scalar or vector with length equal to the number of tests. egcitest returns cValue when you supply the input Y. The critical values are for left-tailed probabilities.

Because egcitest estimates the residuals (that is, residuals are unobserved), critical values are different from those used in adftest or pptest (unless the cointegrating vector is completely specified by the CVec setting). egcitest loads tables of critical values from the file Data_EGCITest.mat, and then linearly interpolates test critical values from the tables. Critical values in the tables derive from methods described in [3].
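
The difference in critical values can be illustrated with a rough sketch that applies adftest directly to the residuals of the cointegrating regression. It assumes the Data_Canada.mat data set used on this page and default settings in both functions. The variable names are illustrative.

```matlab
% Load the example data set (provides the numeric matrix Data)
load Data_Canada
Y = Data(:,3:end);   % interest rate series

% Engle-Granger test: residuals come from an estimated regression
[~,~,statEG,cValueEG,reg1] = egcitest(Y);

% ADF test on the same residuals, treated as if they were observed data
[~,~,statADF,cValueADF] = adftest(reg1.res,Model="AR");

% Row 1: test statistics; row 2: critical values
comparison = [statEG statADF; cValueEG cValueADF];
```

The two test statistics are expected to agree closely, whereas the egcitest critical value should be more negative because it accounts for the estimation of the cointegrating vector.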

StatTbl – Test summary, returned as a table with variables for the outputs h, pValue, stat, and cValue, and with a row for each test. egcitest returns StatTbl when you supply the input Tbl.

StatTbl contains variables for the test settings specified by Lags, Alpha, Test, CReg, and RReg.

reg1 – Cointegrating regression statistics, returned as a structure array. The number of records equals the number of tests.

egcitest regresses the response variable ResponseVariable onto the predictor variables PredictorVariables using the regression form CReg and specified equality constraints CVec.

Each element of reg1 has the fields in this table. You can access a field using dot notation, for example, reg1(3).coeff contains the coefficient estimates of the third test.

| Field | Description |
| --- | --- |
| num | Length of input series with NaNs removed |
| size | Effective sample size, adjusted for lags and difference |
| names | Regression coefficient names |
| coeff | Estimated coefficient values |
| se | Estimated coefficient standard errors |
| Cov | Estimated coefficient covariance matrix |
| tStats | t statistics of coefficients and p-values |
| FStat | F statistic and p-value |
| yMu | Mean of the lag-adjusted input series |
| ySigma | Standard deviation of the lag-adjusted input series |
| yHat | Fitted values of the lag-adjusted input series |
| res | Regression residuals |
| DWStat | Durbin-Watson statistic |
| SSR | Regression sum of squares |
| SSE | Error sum of squares |
| SST | Total sum of squares |
| MSE | Mean square error |
| RMSE | Standard error of the regression |
| RSq | $R^2$ statistic |
| aRSq | Adjusted $R^2$ statistic |
| LL | Loglikelihood of data under Gaussian innovations |
| AIC | Akaike information criterion |
| BIC | Bayesian (Schwarz) information criterion |
| HQC | Hannan-Quinn information criterion |

reg2 – Residual regression statistics, returned as a structure array containing the same fields as reg1. The number of records equals the number of tests.

egcitest tests the residuals of the cointegrating regression for a unit root by passing the residuals, and the values of Lags and Test, to the test specified by RReg. The tests form the test statistic by a regression of the residuals using specified options. For more details on the test options and the fields of reg2, see adftest or pptest.

## Tips

• To draw valid inferences from the test, determine a suitable value for Lags. For more details, see the adftest Tips and the pptest Tips.

• Samples with fewer than approximately 20 to 40 observations (depending on the dimension of the data numDims) can yield unreliable critical values, and therefore unreliable inferences. See [3].

• If a test result suggests that the time series are cointegrated, you can use the residuals as data for the error-correction term in a VEC representation of the variables. Follow this procedure:

1. Extract the residuals from the reg1 output (reg1.res).

2. Estimate autoregressive model components using the estimate function of varm, and treat the extracted residual series as exogenous for estimation.
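
The two steps above can be sketched as follows. This is a minimal illustration under several assumptions: it uses the Data_Canada.mat data set from the examples on this page, models the first differences of the series, aligns the lagged residuals with those differences, and picks a lag order of 1 purely for illustration. In practice, proceed only when the test suggests cointegration.

```matlab
% Load the example data set (provides the numeric matrix Data)
load Data_Canada
Y = Data(:,3:end);   % interest rate series

% Step 1: extract the residuals of the cointegrating regression
[~,~,~,~,reg1] = egcitest(Y);
res = reg1.res;

% Step 2: fit a VAR to the differenced series, treating the lagged
% residuals as an exogenous error-correction term
dY = diff(Y);            % first differences (one fewer row than Y)
ecTerm = res(1:end-1);   % residual at time t-1, aligned with dY at time t
numLags = 1;             % assumption: lag order chosen for illustration
Mdl = varm(size(Y,2),numLags);
EstMdl = estimate(Mdl,dY,X=ecTerm);
summarize(EstMdl)
```

The regression coefficients on ecTerm in the estimated model play the role of adjustment speeds in the VEC representation.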

## Alternative Functionality

### App

The Econometric Modeler app enables you to conduct the Engle-Granger cointegration test.

## References

[1] Engle, R. F. and C. W. J. Granger. "Co-Integration and Error-Correction: Representation, Estimation, and Testing." Econometrica. Vol. 55, 1987, pp. 251–276.

[2] Hamilton, James D. Time Series Analysis. Princeton, NJ: Princeton University Press, 1994.

[3] MacKinnon, J. G. "Numerical Distribution Functions for Unit Root and Cointegration Tests." Journal of Applied Econometrics. Vol. 11, 1996, pp. 601–618.

## Version History

Introduced in R2011a