Multivariate regression betas stock returns
I have a matrix of stock returns, 4280 (rows) by 7379 (columns).
I need to regress the 1st to 15th cell in column 3 (1,3) against the 1st to 15th cells in columns 1 and 2 (1,1), (1,2).
Then the 2nd to 16th cell in column 3 (2,3) against the 2nd to 16th cells in columns 1 and 2 (2,1) and (2,2) etc. So like a rolling window.
The columns remain constant, but each variable window (i:i+14) goes down by one row.
Once the whole column is done, it needs to go back up to the 4th column, but continue regressing against columns 1 and 2. Then the 5th column against columns 1 and 2, etc.
This is the code I have so far:
Nwin = 15; % set window for rolling regression
betasMkt = ret.*NaN; betasMkt(:,1) = ret(:,1); % initialize empty matrix for Market betas
betasDisp = ret.*NaN; betasDisp(:,1) = ret(:,1); % initialize empty matrix for CSAD betas
for j = 1:size(ret,2)-2
  for i = 1:size(ret,1)-Nwin
    vars = [ret(i:i+Nwin-1,j+2) ret(i:i+Nwin-1,1) ret(i:i+Nwin-1,2)];
    vars = vars(~isnan(vars(:,1)),:);
    vars = vars(~isnan(vars(:,2)),:);
    vars = vars(~isnan(vars(:,3)),:);
    if size(vars,1) > 15
      y = vars(:,1);
      x = [ones(rows(y),1) vars(:,2:3)];
      b = regress(y,x);
      betasMkt(i+Nwin,j+2) = b(2,1); % market beta
      betasDisp(i+Nwin,j+2) = b(3,1); % dispersion beta
    end
  end
end
However, the betasMkt and betasDisp outputs are both just column 1 of 'ret'. It doesn't seem as if any regression has been performed.
Could anyone see where I am going wrong please?
I desperately need to figure this out soon.
Thank you for your time
4 Comments
dpb
on 5 Jul 2019
...
vars = [ret(i:i+Nwin-1,j+2) ret(i:i+Nwin-1,1) ret(i:i+Nwin-1,2)];
vars = vars(~isnan(vars(:,1)),:);
vars = vars(~isnan(vars(:,2)),:);
vars = vars(~isnan(vars(:,3)),:);
...
What is the intent of the above? To try to remove any NaN observations from the vars array before fitting I presume? If so, try
vars = [ret(i:i+Nwin-1,j+2) ret(i:i+Nwin-1,1) ret(i:i+Nwin-1,2)];
vars(any(isnan(vars),2),:)=[];
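As a quick check of that row-wise NaN filter, here is a minimal sketch on a made-up 4-by-3 matrix. The values are illustrative only and are not taken from the dataset; `badRows` is a name introduced just for this example.

```matlab
% Hypothetical toy data: rows 2 and 4 each contain a NaN
vars = [0.01 0.02 0.03;
        NaN  0.01 0.02;
        0.02 0.00 0.01;
        0.03 NaN  0.04];

badRows = any(isnan(vars), 2);   % true for every row with at least one NaN
vars(badRows, :) = [];           % delete those rows in one step

disp(size(vars))                 % two rows and three columns remain
```

Note that `isnan` takes a single argument; the dimension argument `2` belongs to `any`, which is why the parenthesis placement matters.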
dpb
on 5 Jul 2019
Attach the dataset you're trying to fit....
Emily Read
on 6 Jul 2019
Edited: dpb
on 6 Jul 2019
From my reading of the doc, rows() is just size() specifically for a SQL fetch object -- it doesn't do anything different other than return a number. In your use it simply sets the upper bound of the for loop; it is the for loop and the index in there that actually "does something". I think it immaterial which you use unless size() can't read the return object after SQL query.
Accepted Answer
More Answers (1)
dpb
on 6 Jul 2019
Just saw something overlooked last night...
if size(vars,1) > 15
I think there's your problem -- you only calculate a regression if you have more than 15 observations, but your window size is 15, so size(vars,1) can never exceed 15 and the test never passes. If the idea is to only compute if there are no missing values, either
if size(vars,1)==Nwin
...
or back when you're looking for whether there are missing values, do something like
for j=...
for i = 1:size(ret,1)-Nwin
vars = [ret(i:i+Nwin-1,j+2) ret(i:i+Nwin-1,1) ret(i:i+Nwin-1,2)];
if any(isnan(vars(:))), continue, end % skip to next set if any missing values
...
While your indexing by rows works, my preferred way to reduce visual clutter for such is something like--
for j=...
  i1=1; i2=Nwin;                          % reset indexing variables for each column
  for i=1:size(ret,1)-Nwin
    vars=ret(i1:i2,[j+2 1:2]);
    i1=i1+1; i2=i2+1;                     % increment counters before any continue
    if any(isnan(vars(:))), continue, end % skip to next set if any missing values
    y= vars(:,1);
    x= [ones(rows(y),1) vars(:,2:3)];
    b= regress(y,x);
    betasMkt(i+Nwin,j+2) = b(2,1);
    betasDisp(i+Nwin,j+2) = b(3,1);
  end
end
This just increments each index by one; there is no difference in the actual calculation, but it is simpler to read than indices computed every time.
See if fixing the count test doesn't solve your problem, though...
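Putting the pieces together, a minimal self-contained sketch of the rolling regression with the count test replaced by a skip-on-NaN check might look like the following. The data are synthetic (the sizes, the random values, and the planted NaN are assumptions for illustration), and the fit uses the backslash operator instead of `regress` so that it runs without the Statistics Toolbox; for a full-rank design matrix the coefficients should agree.

```matlab
% Synthetic stand-in for the returns matrix (hypothetical sizes and values)
rng(0);                          % reproducible random numbers
T = 60; N = 5;                   % 60 dates; columns 1-2 are regressors, 3-5 are stocks
ret = 0.01*randn(T, N);
ret(10, 4) = NaN;                % plant one missing value to exercise the skip

Nwin = 15;                       % rolling-window length
betasMkt  = NaN(T, N);           % preallocate outputs as NaN
betasDisp = NaN(T, N);

for j = 3:N                      % each stock column in turn
    for i = 1:T-Nwin             % same loop bound as in the thread
        vars = ret(i:i+Nwin-1, [j 1 2]);
        if any(isnan(vars(:)))   % skip windows with any missing value
            continue
        end
        y = vars(:, 1);
        X = [ones(Nwin, 1) vars(:, 2:3)];
        b = X \ y;               % least-squares coefficients [intercept; mkt; disp]
        betasMkt(i+Nwin, j)  = b(2);
        betasDisp(i+Nwin, j) = b(3);
    end
end
```

With these inputs, rows 16 to 60 of columns 3 and 5 are filled, while column 4 stays NaN in rows 16 to 25 because the first ten windows all contain the planted missing value.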
3 Comments
Emily Read
on 6 Jul 2019
Edited: dpb
on 6 Jul 2019
dpb
on 7 Jul 2019
I'll try to look more closely and at the data file later on tonight...
dpb
on 7 Jul 2019
"They should each produce a matrix the same size as that of the returns (minus 15 rows for the size 15 window). Can you spot anything in my code that might be causing only 1 column to arise? I'd be surprised if there were only 3068 windows that worked."
Well, you changed i2 to 20 from Nwin=15 so you've cut the size down there.
Without the full data set I can't test how many cases might be missing, but you could put some logic in to count those.
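One way such counting logic could look, as a rough sketch on made-up data: tally, per column, how many windows would be skipped because they contain a NaN. The variable names `nWindows` and `nMissing`, the matrix size, and the planted NaN are all assumptions for the example.

```matlab
% Hypothetical toy data: 30 dates, 4 columns, one planted NaN in column 3
rng(1);
ret = 0.01*randn(30, 4);
ret(5, 3) = NaN;

Nwin     = 15;
nWindows = size(ret,1) - Nwin;          % same loop bound as in the thread
nMissing = zeros(1, size(ret,2));       % per-column count of skipped windows

for j = 3:size(ret,2)
    for i = 1:nWindows
        vars = ret(i:i+Nwin-1, [j 1 2]);
        if any(isnan(vars(:)))
            nMissing(j) = nMissing(j) + 1;   % tally instead of silently skipping
        end
    end
    fprintf('Column %d: %d of %d windows skipped\n', j, nMissing(j), nWindows);
end
```

A single NaN knocks out every window that overlaps it, so even sparse missing data can remove a noticeable share of the betas; comparing `nMissing` with `nWindows` shows how many are lost per column.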