Data preparation for time forecasting using LSTM
Hi, I am trying to solve a time forecasting problem using LSTM in MATLAB. The questions below still remain after going through the documentation.
(Q1) The problem I am facing is in the data preparation stage. Specifically, I have 5000 samples of time responses of the same response quantity, and the number of time steps is 1001. I want to train on 90% of the data (5000 x 901) and keep 10% for prediction (5000 x 100). At present, I am storing the complete data as a matrix:
data is [5000 x 1001]
dataTrain = data(:,1:901);   % 5000-by-901
dataTest  = data(:,902:end); % 5000-by-100
Then, I standardize the data:
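(The standardization step is along these lines; a sketch, where the per-sample zero-mean/unit-variance choice and the names `mu`/`sig` are my assumption:)

```matlab
% Sketch of the standardization step (assumed: zero mean, unit variance
% per sample, computed from the training set only).
mu  = mean(dataTrain, 2);                        % 5000-by-1 means over time
sig = std(dataTrain, 0, 2);                      % 5000-by-1 standard deviations
dataTrainStandardized = (dataTrain - mu) ./ sig;
dataTestStandardized  = (dataTest  - mu) ./ sig; % reuse training statistics
```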
XTrain = dataTrainStandardized(:,1:end-1);
YTrain = dataTrainStandardized(:,2:end);
XTest = dataTestStandardized(:,1:end-1);
Now, what should be the LSTM network architecture as per my data set and problem definition?
numFeatures = ?  % I guess the number of features should be 1, as it is univariate.
numResponses = ? % I guess this should be the number of training time steps (= 901).
However, this gives an error “The training sequences are of feature dimension 5000 but the input layer expects sequences of feature dimension 1.” So, should I store the dataset in a cell (each cell representing 1 feature) and inside the cell a matrix of dimension (no of samples x no of time steps)?
numHiddenUnits = 100;
layers = [ ...
    sequenceInputLayer(numFeatures)
    lstmLayer(numHiddenUnits)
    fullyConnectedLayer(numResponses)
    regressionLayer];
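(On the feature-dimension error: trainNetwork treats each row of a numeric array as one feature, so a 5000-by-900 array is read as one sequence with 5000 features. One way to present the data as 5000 univariate observations is a cell array of 1-by-T sequences. A sketch, assuming the standardized arrays above:)

```matlab
% Each cell holds one observation: a 1-by-900 univariate sequence.
numObs = size(XTrain, 1);
XTrainCell = cell(numObs, 1);
YTrainCell = cell(numObs, 1);
for k = 1:numObs
    XTrainCell{k} = XTrain(k, :);   % 1 feature x 900 time steps
    YTrainCell{k} = YTrain(k, :);
end
numFeatures  = 1;   % univariate input
numResponses = 1;   % one value predicted per time step
```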
(Q2) What does the 'MiniBatchSize' do? Does it divide the time steps (columns) into smaller batches or the number of samples (rows) into smaller batches?
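(For context, my understanding is that 'MiniBatchSize' groups observations, i.e. whole sequences (rows/cells), not time steps; e.g.:)

```matlab
% With 5000 observations and MiniBatchSize = 25, each training iteration
% processes 25 complete sequences; time steps are never split across batches.
options = trainingOptions('adam', ...
    'MaxEpochs', 10, ...
    'MiniBatchSize', 25);
```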
(Q3) The last question is related to the ‘predictAndUpdateState’. Is the following formatting okay?
net = predictAndUpdateState(net,XTrain);
[net,YPred] = predictAndUpdateState(net,YTrain(:,end));
numTimeStepsTest = size(XTest,2);
for i = 2:numTimeStepsTest
    [net,YPred(:,i)] = predictAndUpdateState(net,YPred(:,i-1), ...
        'MiniBatchSize',25,'ExecutionEnvironment','auto');
end
This question is somewhat related to Q1.
Accepted Answer
More Answers (1)
Patrick Stettler
on 19 Sep 2023
Hi Conor
Your answer was indeed very helpful. I'm still struggling, however, with the data-structuring issue. The experiment I'm trying to solve is akin to the setup you indicated above ("...for example, it could be that I make one recording of 5000 quantities over 900 time steps. In this case, your data corresponds to a 5000-by-900 (training) array").
I have time-series data (S&P 500 returns and three indicators), that is, feature dimension 4 and 1000 time steps. I'm trying to predict the direction of the next close (up -> 1, unchanged -> 0, down -> -1) with an LSTM model in the Deep Network Designer app.
I've tried to structure data in several ways but couldn't make Designer work. My approach:
XTrain:
1) 4-by-1000 array (doubles)
2) convert to arrayDatastore (as Designer only accepts the arrayDatastore type);
YTrain:
1) 1-by-1000 array (doubles, i.e. +1, 0, -1)
2) convert to arrayDatastore (as Designer only accepts the arrayDatastore type);
XYTrain:
1) combine XTrain and YTrain with combine(XTrain, YTrain)
In Designer:
1) in InputLayer: InputSize = 4
2) last layer is a classificationLayer
Result:
This leads to several errors. Designer complains, for example, about a) an input-data vs. InputSize mismatch and b) a categorization mismatch.
I couldn't find the answers in the documentation, some hints would be much appreciated, thanks.
3 Comments
Conor Daly
on 19 Sep 2023
Hi Patrick,
I think the trick here is to make sure YTrain is a categorical array. trainNetwork requires classification targets to be specified as categoricals.
In addition, since arrayDatastore chooses IterationDimension=1 by default, we need to tell the array datastores to use the third dimension to iterate over, so that the datastore treats our entire 4x1000 array (1x1000 for the targets) as a single observation. When IterationDimension=1, the software tries to split the 4x1000 predictor array into four 1x1000 arrays -- which isn't what we want here.
Here's some example code with dummy data -- I hope this helps:
XTrain = rand([4 1e3]);
TTrain = categorical(randi([-1 1], [1 1e3]));
dsx = arrayDatastore(XTrain, IterationDimension=3);
dst = arrayDatastore(TTrain, IterationDimension=3);
ds = combine(dsx, dst);
layers = [ sequenceInputLayer(4)
lstmLayer(128)
fullyConnectedLayer(3)
classificationLayer ];
options = trainingOptions("adam", MaxEpochs=10);
net = trainNetwork(ds, layers, options);
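(You can sanity-check the combined datastore with preview -- each read should yield one observation as a 1-by-2 cell containing the predictors and targets:)

```matlab
sample = preview(ds);   % 1-by-2 cell: {predictors, targets}
size(sample{1})         % should be the 4-by-1000 predictor array
size(sample{2})         % should be the 1-by-1000 categorical target array
```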
Patrick Stettler
on 20 Sep 2023
Many thanks Conor, much appreciated, this makes things clearer now (I was definitely wrong on the IterationDimension (=3)).
I've tried to replicate your (programmatic) setup directly in the Designer-app as follows:
- network type sequence-to-label
- sequenceInputLayer:
- inputSize = 4
- fullyConnectedLayer
- OutputSize = 3
- classificationLayer
- outputSize = 'auto'
- ---
- data: using 'ds' combined datastore as constructed above
- Solver: 'adam'
Result:
The problem seems to be that, when using Designer, the responses also need to be structured as 3x1000. Alternatively, one would need to tell Designer's classificationLayer to set outputSize=1 (my hypothesis), making it fit the 'ds' datastore as is. How/where else would one instruct Designer to work with the as-is 'ds' datastore?
Thanks for the enlightenment, Patrick
Conor Daly
on 1 Oct 2023
Thanks Patrick! I'm sorry it's still not working. It's not really clear to me what's going on -- would you be able to share your code (with dummied/randomized data)?