# chowtest

Chow test for structural change

## Syntax

h = chowtest(X,y,bp)
h = chowtest(Tbl,bp)
h = chowtest(___,Name,Value)
[h,pValue,stat,cValue] = chowtest(___)

## Description

Chow tests assess the stability of coefficients β in a multiple linear regression model of the form y = Xβ + ε. Data are split at specified break points. Coefficients are estimated in initial subsamples, then tested for compatibility with data in complementary subsamples.

h = chowtest(X,y,bp) returns test decisions (h) from conducting Chow tests on the multiple linear regression model y = Xβ + ε at the break points in bp.

h = chowtest(Tbl,bp) returns test decisions using the data in the tabular array Tbl. The first numPreds columns are the predictors (X) and the last column is the response (y).

h = chowtest(___,Name,Value) uses any of the input arguments in the previous syntaxes and additional options specified by one or more Name,Value pair arguments. For example, you can specify which type of Chow test to conduct or specify whether to include an intercept in the multiple regression model.

[h,pValue,stat,cValue] = chowtest(___) additionally returns p-values, test statistics, and critical values for the tests.

## Examples

Conduct Chow tests to assess whether there are structural changes in the equation for food demand around World War II.

Load the U.S. food consumption data set, which contains annual measurements from 1927 through 1962 with missing data due to the war.

load Data_Consumption

For more details on the data, enter Description at the command prompt.

Suppose that you want to develop a model for consumption as determined by food prices and disposable income, and assess its stability through the economic shock of the war.

Plot the series.

P = Data(:,1); % Food price index
I = Data(:,2); % Disposable income index
Q = Data(:,3); % Food consumption index
figure;
plot(dates,[P I Q],'o-')
axis tight
grid on
xlabel('Year')
ylabel('Index')
title('{\bf Time Series Plot of All Series}')
legend({'Price','Income','Consumption'},'Location','SE')

Measurements are missing from 1942 through 1947, which correspond to World War II.

Apply the log transformation to each series.

LP = log(P); LI = log(I); LQ = log(Q);

Assume that log consumption is a linear function of the logs of food price and income. In other words,

${\text{LQ}}_{t}={\beta }_{0}+{\beta }_{1}{\text{LI}}_{t}+{\beta }_{2}{\text{LP}}_{t}+{\epsilon }_{t}.$

${\epsilon }_{t}$ is a Gaussian random variable with mean 0 and variance ${\sigma }^{2}$.

Identify the indices before World War II. Plot log consumption with respect to the logs of food price and income.

preWarIdx = (dates <= 1941);
figure
scatter3(LP(preWarIdx),LI(preWarIdx),LQ(preWarIdx),[],'ro');
hold on
scatter3(LP(~preWarIdx),LI(~preWarIdx),LQ(~preWarIdx),[],'b*');
legend({'Pre-war observations','Post-war observations'},...
    'Location','Best')
xlabel('Log price')
ylabel('Log income')
zlabel('Log consumption')
title('{\bf Food Consumption Data}')
% Get a better view
h = gca;
h.CameraPosition = [4.3 -12.2 5.3];

Data relationships appear to be affected by the war.

Conduct two break point Chow tests at the 5% level of significance. For the first test, set the break point at 1941. For the second test, set the break point at 1948.

bp = find(preWarIdx,1,'last');
h1941 = chowtest([LP LI],LQ,bp)
h1941 = logical
   1
h1948 = chowtest([LP LI],LQ,bp + 1)
h1948 = logical
   0

h1941 = 1 indicates that there is significant evidence to reject the null hypothesis that the coefficients are stable when the break point occurs before the war. However, h1948 = 0 indicates that there is not enough evidence to reject coefficient stability if the break point is after the war. This result suggests that the data at 1948 are influential.

Alternatively, you can supply a vector of break points to conduct both Chow tests in one call.

h = chowtest([LP LI],LQ,[bp bp+1]);
RESULTS SUMMARY
***************
Test 1
Sample size: 30
Breakpoint: 15
Test type: breakpoint
Coefficients tested: All
Statistic: 5.5400
Critical value: 3.0088
P value: 0.0049
Significance level: 0.0500
Decision: Reject coefficient stability
***************
Test 2
Sample size: 30
Breakpoint: 16
Test type: breakpoint
Coefficients tested: All
Statistic: 1.2942
Critical value: 3.0088
P value: 0.2992
Significance level: 0.0500
Decision: Fail to reject coefficient stability

By default, chowtest displays a summary of the test results for each test when you conduct more than one test.

Using the Chow test, assess the stability of an explanatory model of U.S. real gross national product (GNP) using the end of World War II as a break point.

load Data_NelsonPlosser

The time series in the data set contain annual macroeconomic measurements from 1860 to 1970. For more details, a list of variables, and descriptions, enter Description at the command line.

Several series have missing data. Restrict the sample to measurements from 1915 to 1970.

span = (1915 <= dates) & (dates <= 1970);

Assume that an appropriate multiple regression model to describe real GNP is

${\text{GNPR}}_{t}={\beta }_{0}+{\beta }_{1}{\text{IPI}}_{t}+{\beta }_{2}{\text{E}}_{t}+{\beta }_{3}{\text{WR}}_{t}.$

Collect the model variables into a tabular array. Position the predictors in the first three columns, and the response in the last column.

Mdl = DataTable(span,[4,5,10,1]);

Select the index corresponding to 1945, the end of World War II.

bp = find(strcmp(Mdl.Properties.RowNames,'1945'));

Using 1945 as a break point, conduct a break point test to assess whether all regression coefficients are stable.

h = chowtest(Mdl,bp)
h = logical 1 

h = 1 indicates rejection of the null hypothesis that the regression coefficients are equal across the subsamples.

In addition to returning a test decision, you can request that a test summary display in the Command Window.

h = chowtest(Mdl,bp,'Display','summary');
RESULTS SUMMARY
***************
Test 1
Sample size: 56
Breakpoint: 31
Test type: breakpoint
Coefficients tested: All
Statistic: 11.1036
Critical value: 2.5652
P value: 0.0000
Significance level: 0.0500
Decision: Reject coefficient stability

Conduct a Chow test to assess the stability of a subset of regression coefficients. This example follows from Test Consumption Model for Structural Change.

Load the U.S. food consumption data set.

load Data_Consumption
P = Data(:,1);
I = Data(:,2);
Q = Data(:,3);

Apply the log transformation to each series.

LP = log(P); LI = log(I); LQ = log(Q);

Identify the indices before World War II.

preWarIdx = (dates <= 1941);

Consider two regression models: one regresses log consumption on log food price, and the other regresses log consumption on log income. Create scatter plots with regression lines for both models.

figure;
subplot(2,2,1)
plot(LP(preWarIdx),LQ(preWarIdx),'bo',LP(~preWarIdx),LQ(~preWarIdx),'r*');
axis tight
grid on
lsline;
xlabel('Log price')
ylabel('Log consumption')
legend('Pre-war observations','Post-war observations',...
    'Location',[0.6,0.6,0.25,0.25])
subplot(2,2,4)
plot(LI(preWarIdx),LQ(preWarIdx),'bo',LI(~preWarIdx),LQ(~preWarIdx),'r*');
axis tight
grid on
lsline
xlabel('Log income')
ylabel('Log consumption')

A clear break in food price elasticity exists between subsamples before and after the war. However, income elasticity does not appear to have such a break.

Conduct two Chow tests to determine whether there is statistical evidence to reject model continuity for both regression models. Because there are more observations in the complementary subsample than coefficients, conduct a break point test. Consider the elasticities in the test only. That is, specify 0 or false for the intercept (first coefficient), and 1 or true for elasticity (second coefficient).

bp = find(preWarIdx,1,'last'); % Index for 1941
chowtest(LP,LQ,bp,'Coeffs',[0 1],'Display','summary');
RESULTS SUMMARY
***************
Test 1
Sample size: 30
Breakpoint: 15
Test type: breakpoint
Coefficients tested: 0 1
Statistic: 7.3947
Critical value: 4.2252
P value: 0.0115
Significance level: 0.0500
Decision: Reject coefficient stability
chowtest(LI,LQ,bp,'Coeffs',[0 1],'Display','summary');
RESULTS SUMMARY
***************
Test 1
Sample size: 30
Breakpoint: 15
Test type: breakpoint
Coefficients tested: 0 1
Statistic: 0.1289
Critical value: 4.2252
P value: 0.7225
Significance level: 0.0500
Decision: Fail to reject coefficient stability

The first summary suggests rejecting the null hypothesis that price elasticities are equal across subsamples at the 5% level of significance. The second summary suggests failing to reject the null hypothesis that income elasticities are equal across subsamples.

Consider a regression model of log consumption on the logs of price and income. Conduct two break point tests: one that compares price elasticity across subsamples only, and another that compares income elasticity only.

chowtest([LP,LI],LQ,bp,'Coeffs',[0 1 0; 0 0 1]);
RESULTS SUMMARY
***************
Test 1
Sample size: 30
Breakpoint: 15
Test type: breakpoint
Coefficients tested: 0 1 0
Statistic: 0.0001
Critical value: 4.2597
P value: 0.9920
Significance level: 0.0500
Decision: Fail to reject coefficient stability
***************
Test 2
Sample size: 30
Breakpoint: 15
Test type: breakpoint
Coefficients tested: 0 0 1
Statistic: 2.8151
Critical value: 4.2597
P value: 0.1064
Significance level: 0.0500
Decision: Fail to reject coefficient stability

For both tests, there is not enough evidence to reject coefficient stability at the 5% level of significance.

Simulate data for a linear model including a structural break in the intercept and one of the predictor coefficients. Then, choose specific coefficients to test for equality across a break point using the Chow test. Adjust parameters to assess the sensitivity of the Chow test.

Specify four predictors, 50 observations, and a break point at period 44 for the simulated linear model.

numPreds = 4; numObs = 50; bp = 44; rng(1); % For reproducibility

Form the predictor data by specifying means for the predictors, and then adding random, standard Gaussian noise to each of the means.

mu = [0 1 2 3]; X = repmat(mu,numObs,1) + randn(numObs,numPreds);

Add a column of ones to the predictor data.

X = [ones(numObs,1) X];

Specify the true values of the regression coefficients and that the intercept and the coefficient of the second predictor jump by 10%.

beta1 = [1 2 3 4 5]'; % Initial subsample coefficients
beta2 = beta1 + [beta1(1)*0.1 0 beta1(3)*0.1 0 0 ]'; % Complementary subsample coefficients
X1 = X(1:bp,:); % Initial subsample predictors
X2 = X(bp+1:end,:); % Complementary subsample predictors

Specify a 2-by-5 logical matrix that indicates to first test the intercept and second regression coefficient, and then test all other coefficients.

test1 = [true false true false false];
Coeffs = [test1; ~test1]
Coeffs = 2x5 logical array
   1   0   1   0   0
   0   1   0   1   1

The null hypothesis for the first test (Coeffs(1,:)) is equality of the intercepts and the coefficients of the second predictor across subsamples. The null hypothesis for the second test (Coeffs(2,:)) is equality of the coefficients of the first, third, and fourth predictors across subsamples.

Simulate data for the linear model

$\text{y}=\left[\begin{array}{cc}X1& 0\\ 0& X2\end{array}\right]\left[\begin{array}{c}\text{beta1}\\ \text{beta2}\end{array}\right]+\text{innov}.$

Create innov as a vector of random Gaussian variates with mean zero and standard deviation 0.2.

sigma = 0.2;
innov = sigma*randn(numObs,1);
y = [X1 zeros(bp,size(X2,2)); zeros(numObs - bp,size(X1,2)) X2]*[beta1; beta2]...
    + innov;

Conduct the two break point tests indicated in Coeffs. Because there is an intercept in the predictor matrix X already, specify to suppress its inclusion in the linear model that chowtest fits.

chowtest(X,y,bp,'Intercept',false,'Coeffs',Coeffs,'Display','summary');
RESULTS SUMMARY
***************
Test 1
Sample size: 50
Breakpoint: 44
Test type: breakpoint
Coefficients tested: 1 0 1 0 0
Statistic: 5.7102
Critical value: 3.2317
P value: 0.0066
Significance level: 0.0500
Decision: Reject coefficient stability
***************
Test 2
Sample size: 50
Breakpoint: 44
Test type: breakpoint
Coefficients tested: 0 1 0 1 1
Statistic: 0.2497
Critical value: 2.8387
P value: 0.8611
Significance level: 0.0500
Decision: Fail to reject coefficient stability

At the default significance level:

• The Chow test correctly rejects the null hypothesis that no structural breaks exist at period bp for the intercept and the second coefficient.

• The Chow test correctly fails to reject the null hypothesis for the other coefficients.

Compare the break point test results to the results of the forecast test.

chowtest(X,y,bp,'Intercept',false,'Coeffs',Coeffs,'Test','forecast',...
    'Display','summary');
RESULTS SUMMARY
***************
Test 1
Sample size: 50
Breakpoint: 44
Test type: forecast
Coefficients tested: 1 0 1 0 0
Statistic: 3.7637
Critical value: 2.8451
P value: 0.0182
Significance level: 0.0500
Decision: Reject coefficient stability
***************
Test 2
Sample size: 50
Breakpoint: 44
Test type: forecast
Coefficients tested: 0 1 0 1 1
Statistic: 0.2135
Critical value: 2.6123
P value: 0.9293
Significance level: 0.0500
Decision: Fail to reject coefficient stability

In this case, the inferences from the forecast tests match those from the break point tests.

## Input Arguments

X: Predictor data for the multiple linear regression model, specified as a numObs-by-numPreds numeric matrix.

numObs is the number of observations and numPreds is the number of predictor variables.

Data Types: double

y: Response data for the multiple linear regression model, specified as a numObs-by-1 numeric vector.

Data Types: double

Tbl: Combined predictor and response data for the multiple linear regression model, specified as a numObs-by-(numPreds + 1) tabular array.

The first numPreds columns of Tbl are the predictor data, and the last column is the response data.

Data Types: table

bp: Break points for the tests, specified as a positive integer or a vector of positive integers.

Each break point is an index of a specific observation (row) in the data. The element bp(j) specifies to split the data into the initial and complementary samples indexed by 1:bp(j) and (bp(j) + 1):numObs, respectively.

Data Types: double
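As a minimal sketch of the indexing convention described for bp, the following base MATLAB snippet splits a small made-up series into the initial and complementary subsamples. The variable names and values are illustrative assumptions; chowtest performs the split internally.

```matlab
numObs = 10;
y = (1:numObs)';                    % Hypothetical response values
bp = 4;                             % Break point: last row of the initial subsample

initialIdx = 1:bp;                  % Rows of the initial subsample
complementIdx = (bp + 1):numObs;    % Rows of the complementary subsample

yInitial = y(initialIdx);           % Observations 1 through 4
yComplement = y(complementIdx);     % Observations 5 through 10
```

The observation at index bp itself belongs to the initial subsample.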

Notes

• NaNs in the data indicate missing values. chowtest removes missing values using list-wise deletion. Removal of rows in the data reduces the effective sample size and changes the time base of the series.

• If bp is a scalar, then the number of tests, numTests, is the common dimension of name-value pair argument values. In this case, chowtest uses the same bp in each test. Otherwise, the length of bp determines numTests, and chowtest runs separate tests for each value in bp.
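Because list-wise deletion changes the time base, a break point chosen by date can point to a different row once rows containing NaNs are dropped. The sketch below uses made-up data and hypothetical variable names to remove incomplete rows by hand and recompute the break point index from the dates. It is one way to keep the break point tied to the intended period, not a description of what chowtest does internally.

```matlab
dates = (1990:1999)';
X = (1:10)';                                  % Hypothetical predictor
y = 2*X + 1;                                  % Hypothetical response
X([2 7]) = NaN;                               % Two hypothetical missing measurements

keep = ~any(isnan([X y]),2);                  % Rows with complete data
datesClean = dates(keep);
XClean = X(keep);
yClean = y(keep);

bpRaw = find(dates <= 1994,1,'last');         % Index of 1994 in the original data
bpClean = find(datesClean <= 1994,1,'last');  % Index of 1994 after deletion
```

Here bpRaw and bpClean differ by one because one deleted row precedes 1994, so passing bpClean together with the cleaned data keeps the split at the same year.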

### Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: 'Intercept',false,'Test','forecast' specifies to exclude an intercept term from the regression model and to conduct a forecast test.

Indicate whether to include an intercept when chowtest fits the regression model, specified as the comma-separated pair consisting of 'Intercept' and true, false, or a logical vector of length numTests.

| Value | Description |
| --- | --- |
| true | chowtest includes an intercept when fitting the regression model. numCoeffs = numPreds + 1. |
| false | chowtest does not include an intercept when fitting the regression model. numCoeffs = numPreds. |

Example: 'Intercept',false(3,1)

Data Types: logical

Type of Chow test to conduct, specified as the comma-separated pair consisting of 'Test' and 'breakpoint', 'forecast', or a cell vector of character vectors of length numTests.

| Value | Description |
| --- | --- |
| 'breakpoint' (default) | chowtest directly assesses coefficient equality constraints using an F statistic. Both subsamples must have more than numCoeffs observations. |
| 'forecast' | chowtest assesses forecast performance using a modified F statistic. The initial subsample must have more than numCoeffs observations. |

For details on the value of numCoeffs, see the 'Intercept' and 'Coeffs' name-value pair arguments.

Example: 'Test',{'breakpoint' 'forecast'}

Data Types: char | cell

Flags indicating which elements of β to test for equality, specified as the comma-separated pair consisting of 'Coeffs' and a logical vector or array. Vector values must be of length numCoeffs. Array values must be of size numTests-by-numCoeffs.

If 'Intercept' contains mixed logical values:

• numCoeffs is numPreds + 1

• chowtest ignores values in the first column of 'Coeffs' for models without an intercept.

For example, suppose the regression model has three predictors (numPreds is 3) in a linear model, and you want to conduct two Chow tests (numTests is 2). Each test includes all regression parameters in the linear model. Also, you want chowtest to fit an intercept in the linear model for the first test only. Therefore, Intercept must be the logical array [1 0]. Because there is at least one model for which chowtest fits an intercept, Coeffs must be a 2-by-4 logical array (numTests is 2 and numCoeffs is numPreds + 1). The elements of Coeffs(:,1) correspond to whether to test the intercept irrespective of its presence in the model. Therefore, one way to specify Coeffs is true(2,4). For the second test, chowtest does not fit an intercept, and so it ignores the value true in Coeffs(2,1). Because chowtest ignores Coeffs(2,1), Coeffs = [true(1,4); false true(1,3)] yields the same result.
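The scenario in the preceding paragraph can be sketched as follows. The simulated data, seed, and variable names (CoeffsA, CoeffsB, and so on) are assumptions made for illustration; only the Intercept and Coeffs values come from the description above.

```matlab
rng(0);                                   % For reproducibility
numObs = 40;
X = randn(numObs,3);                      % Three predictors (numPreds is 3)
y = 1 + X*[1; 2; 3] + randn(numObs,1);    % Hypothetical stable model
bp = 20;                                  % Scalar break point, reused for both tests

Intercept = [true false];                 % Fit an intercept in the first test only
CoeffsA = true(2,4);                      % numTests-by-(numPreds + 1)
CoeffsB = [true(1,4); false true(1,3)];   % Differs only in Coeffs(2,1)

[hA,pA] = chowtest(X,y,bp,'Intercept',Intercept,'Coeffs',CoeffsA,'Display','off');
[hB,pB] = chowtest(X,y,bp,'Intercept',Intercept,'Coeffs',CoeffsB,'Display','off');

sameDecisions = isequal(hA,hB);           % Compare the two specifications
```

According to the description, chowtest ignores Coeffs(2,1) for the model without an intercept, so the two calls should return the same decisions and p-values.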

The default is true(numTests,numCoeffs), which tests all of β for all tests.

Example: 'Coeffs',[false true; true true]

Nominal significance levels for the tests, specified as the comma-separated pair consisting of 'Alpha' and a numeric scalar or vector of length numTests. All elements of Alpha must be in the interval (0,1).

Example: 'Alpha',[0.5 0.1]

Data Types: double

Flag indicating whether to display test results in the command window, specified as the comma-separated pair consisting of 'Display' and 'off' or 'summary'.

| Value | Description | Default Value When |
| --- | --- | --- |
| 'off' | No display | numTests = 1 |
| 'summary' | For each test, display test results to the command window | numTests > 1 |

Example: 'Display','off'

Data Types: char | string

Notes

• chowtest expands scalar and character vector input argument values, other than 'Display', to the size of numTests. Vector values and 'Coeffs' arrays must share a common dimension, equal to numTests.

• If any of bp, Intercept, Test, or Alpha are row vectors, then all output arguments are row vectors.
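The following sketch illustrates both notes with simulated data; the data, seed, and variable names are assumptions for illustration. A scalar break point is combined with row-vector values of 'Test' and 'Alpha', so chowtest runs two tests at the same break point.

```matlab
rng(0);                                   % For reproducibility
X = randn(40,2);                          % Hypothetical predictors
y = 1 + X*[2; -1] + 0.5*randn(40,1);      % Hypothetical response

% The scalar bp is expanded to both tests; Test and Alpha have length numTests = 2.
[h,pValue,stat,cValue] = chowtest(X,y,20, ...
    'Test',{'breakpoint','forecast'}, ...
    'Alpha',[0.05 0.10], ...
    'Display','off');

outputSize = size(h);                     % Orientation follows the row-vector inputs
```

Per the second note, the outputs here should be 1-by-2 row vectors because 'Test' and 'Alpha' are row vectors.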

## Output Arguments

h: Test decisions, returned as a logical scalar or logical vector of length numTests.

The null hypothesis (H0) of the Chow test is that the coefficients (β) selected by Coeffs are identical across subsamples.

• 1 indicates rejection of H0.

• 0 indicates failure to reject H0.

pValue: p-values, returned as a numeric scalar or vector of length numTests.

stat: Test statistics, returned as a numeric scalar or vector of length numTests. For details, see Chow Tests.

cValue: Critical values for the tests, returned as a numeric scalar or vector of length numTests. Alpha determines the critical values.

### Chow Tests

Chow tests assess the stability of the coefficients (β) in a multiple linear regression model of the form y = Xβ + ε. Chow (1960) introduces two variations: the break point and forecast tests [1].
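For orientation, when all $k$ coefficients are tested, the two statistics take the familiar textbook forms below. The notation is introduced here for illustration and is not taken from the function itself: ${\text{RSS}}_{P}$ is the residual sum of squares from the regression pooled over all $n$ observations, and ${\text{RSS}}_{1}$ and ${\text{RSS}}_{2}$ are the residual sums of squares from separate regressions on the initial subsample (${n}_{1}$ observations) and the complementary subsample (${n}_{2}$ observations). The versions for subsets of coefficients are more involved.

${F}_{\text{breakpoint}}=\frac{\left({\text{RSS}}_{P}-{\text{RSS}}_{1}-{\text{RSS}}_{2}\right)/k}{\left({\text{RSS}}_{1}+{\text{RSS}}_{2}\right)/\left(n-2k\right)}\sim F\left(k,n-2k\right)$

${F}_{\text{forecast}}=\frac{\left({\text{RSS}}_{P}-{\text{RSS}}_{1}\right)/{n}_{2}}{{\text{RSS}}_{1}/\left({n}_{1}-k\right)}\sim F\left({n}_{2},{n}_{1}-k\right)$

These degrees of freedom agree with the critical values in the examples: with 30 observations and three coefficients, the break point test uses an F distribution with 3 and 24 degrees of freedom, whose 95th percentile is 3.0088.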

The break point test is a standard F test from the analysis of covariance. The forecast test makes use of the standard theory of prediction intervals. Chow’s contribution is to place both tests within the general linear hypothesis framework, and then to develop appropriate test statistics for testing subsets of coefficients (see Coeffs). For test-statistic formulae, see [1].
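As a rough check of the "standard F test" description, the sketch below computes the textbook break point statistic by hand for simulated data and compares it with the statistic that chowtest returns. It assumes that, when all coefficients are tested, chowtest reports the classical Chow F statistic; the data, seed, and the helper rss are made up for illustration.

```matlab
rng(0);                                   % For reproducibility
numObs = 40; bp = 20;
X = randn(numObs,2);
y = 1 + X*[2; -1] + 0.5*randn(numObs,1);
y(bp+1:end) = y(bp+1:end) + 1;            % Shift the intercept after the break

Z = [ones(numObs,1) X];                   % Design matrix with an intercept
k = size(Z,2);                            % Number of coefficients
rss = @(A,b) sum((b - A*(A\b)).^2);       % Residual sum of squares of an OLS fit

rssP = rss(Z,y);                          % Pooled regression
rss1 = rss(Z(1:bp,:),y(1:bp));            % Initial subsample
rss2 = rss(Z(bp+1:end,:),y(bp+1:end));    % Complementary subsample

% Textbook break point statistic with k and numObs - 2k degrees of freedom
F = ((rssP - rss1 - rss2)/k)/((rss1 + rss2)/(numObs - 2*k));
pManual = 1 - fcdf(F,k,numObs - 2*k);

[h,pValue,stat] = chowtest(X,y,bp,'Display','off');
relDiff = abs(F - stat)/abs(stat);        % Should be near zero if the assumption holds
```

If relDiff is not close to zero, the assumption about the form of the statistic does not hold for the configuration at hand, and [1] is the authoritative source.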

## Tips

• Chow tests assume continuity of the innovations variance across structural changes. Heteroscedasticity can distort the size and power of the test. You should verify the innovations-variance-continuity assumption holds before using the test results for inference.

• You can apply the forecast test when both subsamples contain more than numCoeffs observations, where you would typically apply a break point test. In such cases, the forecast test might have significantly reduced power relative to the break point test [1]. Nevertheless, Wilson (1978) suggests using the forecast test in the presence of unknown specification errors [4].

• The forecast test is based on the unbiased predictions, with zero mean error, which result from stable coefficients. However, zero mean forecast error does not, in general, guarantee coefficient stability. Thus, forecast tests are most effective in checking for structural breaks, rather than model continuity [3].

• To obtain diagnostic statistics for each subsample, such as regression coefficient estimates, their standard errors, error sums of squares, and so on, pass the appropriate data to fitlm. For details on working with LinearModel model objects, see Multiple Linear Regression.
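The last tip can be sketched as follows: fit each subsample separately with fitlm and read the diagnostics from the resulting LinearModel objects. The simulated data and variable names are assumptions for illustration. The ratio of mean squared errors at the end is only an informal look at whether the innovations variance is similar across the break, not a formal test.

```matlab
rng(0);                                   % For reproducibility
numObs = 40; bp = 20;
X = randn(numObs,2);                      % Hypothetical predictors
y = 1 + X*[2; -1] + 0.5*randn(numObs,1);  % Hypothetical response

mdl1 = fitlm(X(1:bp,:),y(1:bp));          % Initial subsample
mdl2 = fitlm(X(bp+1:end,:),y(bp+1:end));  % Complementary subsample

coefTable1 = mdl1.Coefficients;           % Estimates, standard errors, t statistics, p-values
coefTable2 = mdl2.Coefficients;
sse = [mdl1.SSE mdl2.SSE];                % Error sums of squares
mseRatio = mdl1.MSE/mdl2.MSE;             % Informal comparison of residual variances
```

fitlm includes an intercept by default, so each coefficient table has numPreds + 1 rows.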

## References

[1] Chow, G. C. “Tests of Equality Between Sets of Coefficients in Two Linear Regressions.” Econometrica. Vol. 28, 1960, pp. 591–605.

[2] Fisher, F. M. “Tests of Equality Between Sets of Coefficients in Two Linear Regressions: An Expository Note.” Econometrica. Vol. 38, 1970, pp. 361–66.

[3] Rea, J. D. “Indeterminacy of the Chow Test When the Number of Observations is Insufficient.” Econometrica. Vol. 46, 1978, p. 229.

[4] Wilson, A. L. “When is the Chow Test UMP?” The American Statistician. Vol. 32, 1978, pp. 66–68.