How to only use training set to train Neural Network using toolbox with "divideInd" option

Question

0 votes

Hello all, I am currently working with the Neural Network Toolbox. I used the "Generate Advanced Script" option at the end and modified how the network divides my data set into training, validation, and testing subsets. I changed the default division function from "dividerand" to "divideind" and specified which indices I wanted in training, validation, and testing. However, it seems that the training process is using the ENTIRE data set instead of only the training set I specified. Is there a way around this? Also, is there a way to check the confusion matrix for EACH individual set (that is, a confusion matrix using only the training, validation, or testing set)? Below is the code generated by the toolbox, with my modifications:

inputs = datainput;
targets = targetvalues;

% Create a Pattern Recognition Network
hiddenLayerSize = 35;
net = patternnet(hiddenLayerSize);

% Choose Input and Output Pre/Post-Processing Functions
% For a list of all processing functions type: help nnprocess
net.inputs{1}.processFcns = {'removeconstantrows','mapminmax'};
net.outputs{2}.processFcns = {'removeconstantrows','mapminmax'};

% Setup Division of Data for Training, Validation, Testing
% For a list of all data division functions type: help nndivide
% *** MODIFICATIONS MADE HERE ***
net.divideFcn = 'divideind'; % Divide data using indices
%net.divideMode = 'sample'; % Divide up every sample
net.divideParam.trainInd = 1:300;
net.divideParam.valInd = 301:360;
net.divideParam.testInd = 361:420;

% For help on training function 'trainlm' type: help trainlm
% For a list of all training functions type: help nntrain
net.trainFcn = 'trainlm'; % Levenberg-Marquardt

% Choose a Performance Function
% For a list of all performance functions type: help nnperformance
net.performFcn = 'mse'; % Mean squared error

% Choose Plot Functions
% For a list of all plot functions type: help nnplot
net.plotFcns = {'plotperform','plottrainstate','ploterrhist', ...
    'plotregression', 'plotfit'};

% Train the Network
[net,tr] = train(net,inputs,targets);

% Test the Network
outputs = net(inputs);
errors = gsubtract(targets,outputs);
performance = perform(net,targets,outputs)

% Recalculate Training, Validation and Test Performance
trainTargets = targets .* tr.trainMask{1};
valTargets = targets .* tr.valMask{1};
testTargets = targets .* tr.testMask{1};
trainPerformance = perform(net,trainTargets,outputs);
valPerformance = perform(net,valTargets,outputs);
testPerformance = perform(net,testTargets,outputs);

% View the Network
%view(net);

% Plots
% Uncomment these lines to enable various plots.
%figure, plotperform(tr)
%figure, plottrainstate(tr)
%figure, ploterrhist(errors)

Thank you

Answer 1

Greg Heath on 29 Jul 2013

1 vote

 > How to only use training set to train Neural Network using toolbox with "divideInd" option
 > Asked by Gary 22 minutes ago
 > Latest activity Edited by Gary 15 minutes ago
 > Hello all, currently I am working with the Neural Network toolbox. I used the "Generate Advanced Script" option at the end and made some modifications as to how the network is to divide up my data set into training, validation, and testing. I changed the default option from "dividerand" to "divideInd" and I specified which indices I wanted to be in training, validation, and testing.

Your technique results in unnecessary specifications of too many net properties that are already defaults. Concentrate on the defaults that have to be overridden. For example, if you have a classification or pattern recognition problem, first use the default number of hidden nodes and omit the ending semicolon to obtain

net = patternnet % No semicolon

The resulting command line printout will reveal the defaults. You can then concentrate on the defaults you want to override.

It is also useful to run the code examples in the documentation

 help patternnet
 doc patternnet

Be sure the target matrix consists of unit vector columns with a single "1". The row index of the "1" denotes the class index of the corresponding input column. The relationship between the target matrix and the class indices is given by

 target = ind2vec(trueclassindex)
 trueclassindex = vec2ind(target)

Again, omitting some of the ending semicolons will reveal useful information.
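The target encoding described above can be illustrated with a minimal sketch. The label vector here is hypothetical and chosen only for illustration; `full` is used because `ind2vec` returns a sparse matrix.

```matlab
% Hypothetical class labels for five samples drawn from three classes
trueclassindex = [1 3 2 3 1];

% ind2vec returns a sparse matrix; full() converts it for display
target = full(ind2vec(trueclassindex))   % no semicolon, so the matrix is printed

% Each column holds a single 1; vec2ind recovers the class indices
recovered = vec2ind(target)
```

Each column of `target` should contain exactly one 1, located in the row given by the corresponding class label.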

 % However, it seems that the training process is using the ENTIRE data set instead of exclusively using the training set I specified earlier. Is there a way around this?

You are confused.

 data = design + test            % test == evaluation
 design = train+ validation    % validation ~= evaluation

The train function designs with design data and evaluates with nondesign test data. It trains with training data, but uses the nontraining validation data to stop training if the validation error does not decrease for max_fail consecutive epochs. Finally, it evaluates the net with nondesign test data.

The separate trn/val/tst performances can be obtained using the training record tr via

[net, tr, y, e] = train(net, input, target);
tr = tr % NO SEMICOLON
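As a rough sketch of how the training record can be inspected, the following uses the toolbox's `iris_dataset` sample data in place of the poster's data. The index split, the hidden layer size of 10, and the random seed are assumptions made for illustration; the fields `trainInd`, `valInd`, `testInd`, `best_perf`, `best_vperf`, and `best_tperf` are, to my understanding, part of the record returned by `train`.

```matlab
% Sample data shipped with the toolbox: 4x150 inputs, 3x150 one-hot targets
[x, t] = iris_dataset;

% The iris samples are ordered by class, so shuffle before splitting
rng(0);                          % fixed seed, for repeatability only
idx = randperm(size(x, 2));

net = patternnet(10);            % hidden layer size chosen arbitrarily
net.divideFcn = 'divideind';     % override the default 'dividerand'
net.divideParam.trainInd = idx(1:100);
net.divideParam.valInd   = idx(101:125);
net.divideParam.testInd  = idx(126:150);
net.trainParam.showWindow = false;   % suppress the training GUI

[net, tr] = train(net, x, t);

% The record reports which samples were used for each role
usedRequestedTrainSet = isequal(sort(tr.trainInd), sort(idx(1:100)))

% Performance of each subset at the best (lowest validation error) epoch
trainPerf = tr.best_perf
valPerf   = tr.best_vperf
testPerf  = tr.best_tperf
```

If `usedRequestedTrainSet` is true, the weights were fitted on the specified training indices only; the validation subset was used for early stopping and the test subset for evaluation.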

For examples, search the NEWSGROUP and ANSWERS using

patternnet greg

% Also is there a way to check the confusion matrix for EACH individual set? (meaning confusion matrix using only training, validation, testing set)

Yes. Call each separately after using the indices to separate the cases. For an example search

confusion greg
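One possible way to do this, sketched with the toolbox's `iris_dataset` sample data rather than the poster's data, is to index the targets and outputs with the subset indices stored in the training record and pass each slice to `confusion`. The split, hidden layer size, and seed are illustrative assumptions.

```matlab
% Sample data shipped with the toolbox: 4x150 inputs, 3x150 one-hot targets
[x, t] = iris_dataset;

rng(0);                          % fixed seed, for repeatability only
idx = randperm(size(x, 2));      % shuffle, since iris is ordered by class

net = patternnet(10);            % hidden layer size chosen arbitrarily
net.divideFcn = 'divideind';
net.divideParam.trainInd = idx(1:100);
net.divideParam.valInd   = idx(101:125);
net.divideParam.testInd  = idx(126:150);
net.trainParam.showWindow = false;   % suppress the training GUI

[net, tr] = train(net, x, t);
y = net(x);                      % outputs for all samples

% confusion returns the misclassified fraction and the confusion matrix
[cTrn, cmTrn] = confusion(t(:, tr.trainInd), y(:, tr.trainInd));
[cVal, cmVal] = confusion(t(:, tr.valInd),   y(:, tr.valInd));
[cTst, cmTst] = confusion(t(:, tr.testInd),  y(:, tr.testInd));

cmTrn, cmVal, cmTst              % no semicolons, so the matrices are printed
```

The same slices can be passed to `plotconfusion` if a figure is preferred over the printed matrices.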

Hope this helps.

Thank you for formally accepting my answer

Greg

Gary on 29 Jul 2013

Hello Greg, thanks for the help. Just to clarify what you said: the neural network already knows how to distinguish between the training, validation, and testing sets, correct? I was looking at the code and this was not obvious to me.

%Your technique results in unnecessary specifications of too many net properties that are already defaults. Concentrate on the defaults that have to be overridden

I am confused by what you meant. Is there another way to choose "divideind" instead of the default "dividerand" through the NN GUI without too many manual modifications?

Thanks again for the help.

Greg Heath on 29 Jul 2013

Defaults that do not have to be explicitly specified:

 1. dividerand
 2. trn/val/tst ratios = 0.7/0.15/0.15

TRAIN knows how to separate the data and use each subset correctly. The corresponding indices and separate results can be obtained from TR.

The same holds if you explicitly override the default with another divide option (e.g., divideind).

I'm not that familiar with the GUI. However, I don't remember any reason why defaults should have to be explicitly specified.
