Pairs Trading Code: Further Guidance Please

Good afternoon.
I am trying to write pairs trading code in MATLAB R2011b. The dataset is daily S&P 500 prices, 1982 to 2012.
The procedure includes:
  1. Split the sample into 12-month formation periods, sliding forward by 6 months (the trading period).
  2. Normalize the prices within each formation period.
  3. Compute the pairwise distance measure (basically a 500 x 500 matrix).
  4. Find the pairs that minimize the distance measure (the sum of squared deviations between the two normalized price series).
  5. Filter and identify the top-ranking pairs in each formation period.
I would like to know:
  1. What are the next coding steps needed to compute the pairwise distance measure?
  2. What are the next coding steps needed to minimise the distance measure?
  3. How do I filter and identify the top-ranking pairs while sliding the formation period forward?
  4. How would I restrict matching to stocks in the same industry sector, according to the relevant SIC codes?
The code completed so far is below, including comment headers (%) for clarity.
Any help, guidance or advice would be very much appreciated.
Kind regards
Tomasz Mlynowski
% Import data (txt rather than text, to avoid shadowing the built-in text function)
[data,txt] = xlsread('G:\me\Desktop\Pairs Trading\sp500 price data.xlsm',1);
% Dates in numeric format
dates = datenum(txt(4:end,1),'dd/mm/yyyy');
% Keep names (for reference)
names = txt(1,2:end);
% Free memory from txt (more than 80 MB)
clear txt
% Retrieve year and month
[y,m] = datevec(dates);
% Unique year-month pairs; monthIdx maps each daily row to its row in ym
[ym,~,monthIdx] = unique([y,m],'rows');
% Distance measure on each 12 month block sliding by 6 (except last block is 5)
% For reference, appendix: http://www.tinbergen.nl/discussionpapers/11150.pdf
% Loop over months, not daily rows: r is the last month of the formation period
for r = 12:6:size(ym,1)
    % Select the daily rows that fall in the 12-month formation period
    inblock = monthIdx >= r-11 & monthIdx <= r;
    tmp = data(inblock,:);
    % 1. Normalization
    % Find columns with at least one non-NaN value
    idxnan = isnan(tmp);
    cols = find(~all(idxnan,1));
    % LOOP through columns
    for c = cols
        % First non-NaN value
        first = find(~idxnan(:,c),1,'first');
        tmp(:,c) = tmp(:,c) / tmp(first,c);
    end
    % Now, don't loop by single column because you need to normalize taking
    % into consideration pairs (unfinished attempt, left commented out).
    % LOOP through columns
    % for c = cols
    %     % Now select a specific column and loop against all others
    %     for cpair = setdiff(cols,c)
    %         first1 = find(~idxnan(:,c ),1,'first');
    %         first2 = find(~idxnan(:,cpair),1,'first');
    %
    %         tmp(:,c) = tmp(:,c) / tmp(first1,c);
    %     end
    % end
end
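On the sliding itself, it may help to lay out the month ranges of each formation and trading window before filling in the loop body. The following is only a sketch: `nMonths`, `lastMonths` and `windows` are hypothetical names, and the 30-month sample is a toy value standing in for the number of unique year-month pairs.

```matlab
% Toy example: 30 months of data (the real value would be the number of
% unique year-month pairs in the sample)
nMonths    = 30;
lastMonths = 12:6:nMonths;        % last month of each formation period
nWin       = numel(lastMonths);
windows    = zeros(nWin,4);       % [formStart formEnd tradeStart tradeEnd]
for k = 1:nWin
    r = lastMonths(k);
    % 12 formation months, then up to 6 trading months
    windows(k,:) = [r-11, r, r+1, min(r+6,nMonths)];
end
% Drop windows that have no trading months left after the formation period
windows = windows(windows(:,3) <= nMonths,:);
```

With a table like this, the pairs chosen in window `k` could be stored in something like a cell array entry `topPairs{k}` (also a hypothetical name) and then applied to the trading months in columns 3 and 4.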
%Pairwise distance measure
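One way the pairwise distance step could look, as a minimal sketch. It assumes the normalized formation-period prices are available as a T-by-N matrix (for example `tmp(:,cols)` from the loop above); `P`, `D`, `dev` and the toy numbers are hypothetical and only for illustration. Pairs where either series still contains NaN inside the window are simply given an infinite distance, which is a simplification.

```matlab
% Toy stand-in for the normalized prices (T days x N stocks)
P = [1.00 1.00 1.00;
     1.10 1.05 0.90;
     1.20 1.15 0.80];
N = size(P,2);

% Upper triangle of D holds the sum of squared deviations of each pair;
% the diagonal and lower triangle stay Inf so each pair appears only once
D = inf(N);
for i = 1:N-1
    % Deviations of stocks i+1..N from stock i, one column per partner
    dev = bsxfun(@minus,P(:,i+1:N),P(:,i));
    D(i,i+1:N) = sum(dev.^2,1);
end

% Pairs involving missing prices give NaN; treat them as unusable
D(isnan(D)) = Inf;
```

bsxfun is used instead of implicit expansion because the latter is not available in R2011b. For roughly 500 stocks the loop runs about 500 times, so speed should not be a concern.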
% Minimum distance criterion identification
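A sketch of how the minimum-distance pairs might be picked once a distance matrix exists, including an optional same-sector filter. It assumes `D` is an N-by-N matrix with the pair distances in the upper triangle and Inf elsewhere; `sic`, `nTop`, `major`, `pairs` and the toy values are hypothetical. Matching on the first two SIC digits is only one possible definition of "same sector".

```matlab
% Toy distance matrix: upper triangle holds pair distances, rest is Inf
D = [Inf 0.005 0.200 0.050;
     Inf Inf   0.145 0.010;
     Inf Inf   Inf   0.300;
     Inf Inf   Inf   Inf  ];
% Hypothetical SIC codes, one per stock, in the same order as the columns of D
sic  = [2834 2836 6021 2860];
nTop = 2;                          % number of pairs to keep (assumed)

% Optional sector restriction: keep only pairs sharing the first two SIC digits
major = floor(sic/100);
same  = bsxfun(@eq,major',major);
D(~same) = Inf;

% Rank all pairs by distance, smallest first, and keep the best nTop
[dSorted,order] = sort(D(:));
keep = order(isfinite(dSorted));
keep = keep(1:min(nTop,numel(keep)));
[rowIdx,colIdx] = ind2sub(size(D),keep);
pairs = [rowIdx colIdx D(keep)];   % columns: stock 1, stock 2, distance
```

Run inside the formation loop, this would give one `pairs` table per window. If the columns of `D` line up with `names`, then `names(pairs(:,1:2))` would return the corresponding tickers; if `D` was built from a subset of columns, the indices need mapping back first.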

