Running several regressions and storing all the data with regress

Dear MATLAB experts,
I'm trying to run many linear regressions, each consisting of one dependent variable and one independent variable, but I keep running into the following error:
Unable to perform assignment because the left and right sides have a different number of elements.
Error in code (line 739)
tStat(i) = bint; % t-Statistic
The error concerns the sizes of 'tStat', 'Residuals', 'Outliers' and 'RegressStats', and I don't know how to fix it.
[NumRows, NumSeries] = size(stockReturns);
NumAssets = NumSeries - 2;
% Specifying start and end date of the calculation of alphas and betas
StartDate = datestr(stockReturnsDates(1));
EndDate = datestr(stockReturnsDates(end));
%
Alpha = NaN(1, length(NumAssets));
Beta = NaN(1, length(NumAssets));
tStat = zeros(1818, 2, 2);
Residuals = NaN(length(NumAssets), 2);
Outliers = NaN(length(NumAssets), 2);
RegressStats = NaN(length(NumAssets), 2);
for i = 1:NumAssets
% Set up separate asset data and design matrices
DependentVariable = zeros(NumRows,1);
IndependentVariable = zeros(NumRows,2);
DependentVariable(:) = stockReturns(:,i) - stockReturns(:,1827); % Excess return of each asset
IndependentVariable(:,1) = 1.0; % For calculating alpha
IndependentVariable(:,2) = stockReturns(:,1826) - stockReturns(:,1827); % Excess market return
% Estimate the CAPM for each asset separately.
[b, bint, r, rint,stats] = regress(DependentVariable, IndependentVariable, 5);
% Including values in matrixes
Alpha(i) = b(1); % Intercept
Beta(i) = b(2); % Betas
tStat(i) = bint; % t-Statistic
Residuals(i) = r; % Residuals
Outliers(i) = rint; % Outliers
RegressStats(i) = stats; % Includes R^2, F-stat and p-value
end
I would really appreciate your help since I've been stuck with this error for a while already.

Accepted Answer

Dave B on 12 Oct 2021
Edited: Dave B on 12 Oct 2021
You're getting this error because you're trying to store bint (which is a 2 x 2 matrix) in a scalar location (tStat(i)). Note that bint is not the t-statistic: it holds the lower and upper confidence bounds for your 2 coefficient estimates (intercept and slope).
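If actual t-statistics are what you are after, regress does not return them directly, but they can be recovered from its outputs. The following is a minimal sketch on made-up data (x, y and X are assumptions for illustration, not your variables). It relies on the fourth element of stats being the estimate of the error variance, as described in the regress documentation.

```matlab
% Synthetic data, assumed purely for illustration
rng(0);
x = (1:100)';
y = 10 + x + 5*randn(size(x));
X = [ones(size(x)) x];               % design matrix: intercept and slope

[b, bint, ~, ~, stats] = regress(y, X, 0.05);

s2 = stats(4);                       % estimate of the error variance
se = sqrt(s2 * diag(inv(X' * X)));   % standard errors of the 2 coefficients
tStat = b ./ se;                     % 2 x 1: t-statistic for intercept and slope
```

As a sanity check, the half-width of each row of bint divided by tinv(0.975, n - 2) should reproduce se.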
Maybe you wanted tStat(i,:,:) = bint (which seems to correspond to how you initialized the variable)? But I question your choice of name for this variable, and I also would not recommend storing it in a matrix of this shape. If anything, consider 2 x 2 x 1818, in which case tStat(:,:,i) = bint. But what are you planning on doing with these values? It's likely more sensible to keep them in a 1818 x 4 matrix.
You'll run into the same problem when storing the remaining values: you need to specify a storage location that matches the size of the values you're going to store. r, rint, and stats will all be non-scalar. Consult the documentation to determine what size to expect, or run the regression on one column of your data to see what shape each output has. If you're really desperate and just want to store everything blindly without thinking about the size or shape, use a struct or cell array. I'd discourage this, though, as you're just postponing the inevitable problem of having to think about the size (and meaning) of the results of your regression.
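One way to check the shapes is to run a single regression and print the size of each output. This is only a sketch on synthetic data (n, x and y are made up for illustration); with n observations and 2 coefficients, the sizes should come out as noted in the comments.

```matlab
% Synthetic data, assumed purely for illustration
rng(1);
n = 50;
x = randn(n, 1);
y = 0.5 + 2*x + randn(n, 1);

[b, bint, r, rint, stats] = regress(y, [ones(n, 1) x], 0.05);

disp(size(b))       % 2 1   coefficients
disp(size(bint))    % 2 2   lower/upper bound per coefficient
disp(size(r))       % 50 1  one residual per observation
disp(size(rint))    % 50 2  lower/upper bound per residual
disp(size(stats))   % 1 4   R^2, F-statistic, p-value, error variance
```

Once the sizes are known, each storage variable can be preallocated with one matching slot per asset.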
A few other things I notice in your code:
  • You're initializing all of your variables based on length(NumAssets), but you probably want to use NumAssets itself (length(NumAssets) is 1 because NumAssets is a scalar).
  • You specified 5 as the third argument to regress. According to the documentation, the third argument (alpha) should be a value between 0 and 1 (I'm thinking you wanted 0.05).
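Putting these points together, the loop could look roughly like the sketch below. It uses random stand-in data, and it assumes that the last two columns of stockReturns hold the market and risk-free returns (marketCol and riskFreeCol are hypothetical names, as are CoefCI and ResidualCI). Adapt the column indices to your actual data.

```matlab
% Random stand-in data: 5 assets plus market and risk-free columns (assumed layout)
rng(2);
stockReturns = 0.01 * randn(60, 7);

[NumRows, NumSeries] = size(stockReturns);
NumAssets   = NumSeries - 2;
marketCol   = NumSeries - 1;             % assumption: second-to-last column
riskFreeCol = NumSeries;                 % assumption: last column

% Preallocate with NumAssets (not length(NumAssets)) and matching shapes
Alpha        = NaN(1, NumAssets);
Beta         = NaN(1, NumAssets);
CoefCI       = NaN(NumAssets, 4);        % [alpha-low alpha-high beta-low beta-high]
Residuals    = NaN(NumRows, NumAssets);
ResidualCI   = NaN(NumRows, 2, NumAssets);
RegressStats = NaN(NumAssets, 4);        % R^2, F-stat, p-value, error variance

% The design matrix is the same for every asset, so build it once
X = [ones(NumRows, 1), stockReturns(:, marketCol) - stockReturns(:, riskFreeCol)];

for i = 1:NumAssets
    y = stockReturns(:, i) - stockReturns(:, riskFreeCol);   % excess return
    [b, bint, r, rint, stats] = regress(y, X, 0.05);         % alpha = 0.05
    Alpha(i)           = b(1);
    Beta(i)            = b(2);
    CoefCI(i, :)       = reshape(bint', 1, []);
    Residuals(:, i)    = r;
    ResidualCI(:, :, i) = rint;
    RegressStats(i, :) = stats;
end
```

Storing per-observation outputs with the asset index last (Residuals(:, i), ResidualCI(:, :, i)) keeps each assignment the same shape as the regress output it receives.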
  3 Comments
Dave B on 12 Oct 2021
@chiefjia - yes, I think you have set up your independent and dependent variables appropriately for slope and intercept. When you are unsure in cases like this, the best way to find out is to explore with some simple test data where you know what the result should look like. I generally like to use plots to visualize:
x=(1:100)';
y=10+x+randn(size(x))*5;
b = regress(y, [ones(size(x)) x]);
xi = [1 100];
yi = b(1) + xi.*b(2);
scatter(x,y,'.')
hold on
plot(xi,yi)
Now you can see that the use of regress is as you expect, and you can also interrogate all of the output arguments and make sure you understand them (if you like).
I agree that storing this in three dimensions is going to be a pain. Your code snippet would work (as in, it will run) if you added a (:):
tStat = zeros(4, NumAssets);
tStat(:, i) = bint(:); % Confidence intervals
The (:) says "turn this into a column vector regardless of its shape". Each column of tStat would then hold the columns of bint appended to each other. That's a little weird, because it means the order is intercept-low, slope-low, intercept-high, slope-high. Let's see that in the example from above:
[b, bint] = regress(y, [ones(size(x)) x]);
bint
bint = 2×2
   10.0879   13.4807
    0.9419    1.0003
bint(:)
ans = 4×1
   10.0879
    0.9419
   13.4807
    1.0003
That's a fine way to store it, but I personally wouldn't like it. For me it's much more natural to keep the CIs together:
bint([1 3 2 4])'
ans = 4×1
   10.0879
   13.4807
    0.9419
    1.0003
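An equivalent way to keep each coefficient's bounds together, without spelling out the index order, is to transpose before flattening. The sketch below uses a hand-typed 2 x 2 matrix with the values from the output above rather than a fresh regress call, so it runs without any toolbox.

```matlab
% Hand-typed stand-in for bint: rows are coefficients, columns are [low high]
bint = [10.0879 13.4807;
         0.9419  1.0003];

byColumn = bint(:)';                 % intercept-low, slope-low, intercept-high, slope-high
together = reshape(bint', 1, []);    % intercept-low, intercept-high, slope-low, slope-high

disp(byColumn)
disp(together)
```

The row vector in together fits directly into one row of a NumAssets x 4 matrix, and the same expression keeps working if more predictors are added later.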
chiefjia on 12 Oct 2021
Dear Dave,
thanks a lot for your feedback, I really appreciate it! I now have a better understanding of how this function works and of the use of parentheses and colons :)
