Error Invalid Training Data- Predictors must be a N-by-1 cell array of sequences?
I am unable to solve the error Invalid training data. Predictors must be a N-by-1 cell array of sequences, where N is the number of sequences. All sequences must have the same
% Assume 'cnnDataArray' contains CNN-extracted features and 'allLabels' has the corresponding labels.
% Validate that the size of the data matches the labels
%cnnDataArray = cnnDataArray(1:numel(allLabels), :);
% Validate and align data and labels
numSamplesData = size(cnnDataArray, 1);
numSamplesLabels = numel(allLabels);
if numSamplesData > numSamplesLabels
    fprintf('Truncating data to match labels.\n');
    cnnDataArray = cnnDataArray(1:numSamplesLabels, :);
elseif numSamplesLabels > numSamplesData
    fprintf('Truncating labels to match data.\n');
    allLabels = allLabels(1:numSamplesData);
end
% Check consistency
if size(cnnDataArray, 1) ~= numel(allLabels)
    error('Mismatch persists after alignment. Data samples: %d, Labels: %d.', size(cnnDataArray, 1), numel(allLabels));
end
disp('Data and labels are aligned.');
% Convert labels to categorical if not already
if ~iscategorical(allLabels)
    allLabels = categorical(allLabels);
end
% Read the sample and feature counts from cnnDataArray
% (lstmInput is only created by the commented-out reshape further down)
[numSamples, numFeatures] = size(cnnDataArray);
% Expanding trainDataCell
expandedTrainDataCell = cell(numel(trainDataCell), 1); % Create a new cell array to hold the expanded sequences
for i = 1:numel(trainDataCell)
    % Ensure each sequence is correctly formatted
    if size(trainDataCell{i}, 1) == 1
        % If it's a single time step with F features, no need to reshape, keep it as [1, F]
        expandedTrainDataCell{i} = trainDataCell{i}; % Keep as is
    else
        % If there are multiple time steps, keep the sequence structure intact
        expandedTrainDataCell{i} = trainDataCell{i}; % Sequence remains as is
    end
end
% Expanding testDataCell similarly
expandedTestDataCell = cell(numel(testDataCell), 1);
for i = 1:numel(testDataCell)
    if size(testDataCell{i}, 1) == 1
        % If it's a single time step with F features, keep it as is
        expandedTestDataCell{i} = testDataCell{i};
    else
        % Otherwise, keep the sequence structure intact
        expandedTestDataCell{i} = testDataCell{i};
    end
end
% Check the size of the expanded cells
disp(size(expandedTrainDataCell)); % Should show [N, 1]
disp(size(expandedTestDataCell)); % Should show [M, 1]
% Now you can use these cell arrays directly for LSTM training
% Do not use cell2mat unless you need a matrix of fixed-size sequences
% Continue with training the LSTM using the expanded cell arrays
% Reshape data for LSTM
%lstmInput = reshape(cnnDataArray, [numSamples, 1, numFeatures]); % [numSamples, 1, numFeatures]
% Verify the reshaped data
%disp(size(lstmInput)); % Should display [numSamples, 1, numFeatures]
% Split data into training and testing sets (e.g., 80-20 split)
%cv = cvpartition(allLabels, 'Holdout', 0.2); % Adjust the holdout ratio if necessary
%trainIdx = training(cv);
Answers (1)
Ayush Aniket on 26 Dec 2024 at 7:09 (edited on 26 Dec 2024 at 7:10)
Based on the error message, the error occurs because of a discrepancy between the expected data format and the format of your data.
The trainNetwork function expects data for 2-D image sequences (which is what you appear to have, judging from the information provided) to be an Nx1 cell array where each element is an h-by-w-by-c-by-s array. Here h, w, and c are the height, width, and number of channels of the images, respectively, and s is the sequence length.
Hence, your training data trainDataCell (the predictors) must be in the following format: an Nx1 cell array where each element is a [1 numFeatures 1 s] array.
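As a rough sketch of that format, the snippet below builds such a cell array from a feature matrix. It assumes cnnDataArray is a numSamples-by-numFeatures matrix (random stand-in data is used here) and that each sample is a single time step (s = 1). Both are assumptions made for illustration, not details confirmed by the question.

```matlab
% Hypothetical stand-in for the CNN feature matrix: 6 samples, 4 features each
numSamples = 6;
numFeatures = 4;
cnnDataArray = rand(numSamples, numFeatures);

s = 1; % assumed sequence length: one time step per sample

% Build the N-by-1 cell array; each element is 1-by-numFeatures-by-1-by-s
trainDataCell = cell(numSamples, 1);
for i = 1:numSamples
    trainDataCell{i} = reshape(cnnDataArray(i, :), [1, numFeatures, 1, s]);
end

% The container itself must be N-by-1
disp(size(trainDataCell));
```

Note that MATLAB drops trailing singleton dimensions when reporting sizes, so with s = 1 size(trainDataCell{1}) shows [1 numFeatures] even though the element is a valid 1-by-numFeatures-by-1-by-1 array. Querying size(trainDataCell{1}, 4) returns the sequence length explicitly.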
Refer to the documentation linked below for the expected format for different types of data: https://www.mathworks.com/help/deeplearning/ref/trainnetwork.html?#mw_36a68d96-8505-4b8d-b338-44e1efa9cc5e
Note: In the code provided, the imageInputLayer is given an input size of [1 numFeatures 1]. However, as mentioned in its documentation, the input size is a row vector of integers [h w c], where h, w, and c are the height, width, and number of channels, respectively. Assuming numFeatures is the number of channels, you should change the layer's input size to [1 1 numFeatures]. Refer to the documentation here: https://www.mathworks.com/help/deeplearning/ref/nnet.cnn.layer.imageinputlayer.html#mw_342fa7c6-d7c0-456b-bfa5-366256fe67c9
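A minimal sketch of that change follows, assuming numFeatures is the channel count and each sample is one time step. The data is a random stand-in and the variable names are illustrative. It requires Deep Learning Toolbox and does not settle how the rest of the network should be defined.

```matlab
% Hypothetical stand-in data: 6 samples, 4 features each
numSamples = 6;
numFeatures = 4;
cnnDataArray = rand(numSamples, numFeatures);
s = 1; % assumed sequence length: one time step per sample

% Input size as [h w c], with the features placed in the channel dimension
inputSize = [1 1 numFeatures];
layer = imageInputLayer(inputSize); % requires Deep Learning Toolbox

% Each sequence element must then be 1-by-1-by-numFeatures-by-s
trainDataCell = cell(numSamples, 1);
for i = 1:numSamples
    trainDataCell{i} = reshape(cnnDataArray(i, :), [inputSize, s]);
end
```

Whichever input size is used, the first three dimensions of every cell element have to match it. That is why the data layout and the layer definition need to be changed together.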
If you are using any other data type, please share the format of the data that you are working with.