Neural-Network-Performance Paradox (WITH PICTURES!)

Hello,
I observed a strange behaviour of my neural network.
Training:
-> As you can see, the performance (MSE) is pretty good in training, validation AND testing. But look what happens when I test the network with more data that I cut out of the dataset before training.
Testing:
-> The performance is terrible! You would assume that the performance is more or less the same as on the test set used during training, because both test sets come from the same dataset and do not influence the training, but it is not.
Does anybody have an explanation?
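For reference, my workflow is roughly the following (a minimal sketch, not my actual code; fitnet and the bodyfat example data are just stand-ins):

% Sketch of the setup: cut out extra data BEFORE training, train on
% the rest, then compare the internal test MSE with the hold-out MSE.
[x, t] = bodyfat_dataset;                 % placeholder dataset
N    = size(x, 2);
inew = randperm(N, round(0.25*N));        % samples cut out before training
iold = setdiff(1:N, inew);

net = fitnet(10);                         % hidden layer size arbitrary here
[net, tr] = train(net, x(:,iold), t(:,iold));      % default 70/15/15 split

MSEtst = tr.best_tperf;                            % internal test-set MSE
MSEnew = perform(net, t(:,inew), net(x(:,inew)));  % MSE on the cut-out data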
Kind regards, Detlef

Answers (1)

Rnew looks good but MSEnew looks about 60 times too large. Something is wrong.
MSEnew has the symptoms of overtraining an overfit net. Are there more unknown weights Nw than training equations Ntrneq?
[ I N ] = size(input), [ O N ] = size(target)
Ntst = round(0.15*N), Nval = Ntst,
Ntrn = N - 2*Ntst, Ntrneq = Ntrn*O
For an I-H-O node topology
Nw = ( I + 1 ) * H + ( H + 1 )*O
Ntrneq >= Nw when
H <= (Ntrneq - O ) / ( I + O + 1) % 19442
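In MATLAB this check is simply (a sketch; input and target are your own matrices):

% Sketch of the hidden-node bound above
[I, N] = size(input);                  % I inputs,  N samples
[O, ~] = size(target);                 % O outputs

Ntst   = round(0.15*N);   Nval = Ntst; % default 0.70/0.15/0.15 split
Ntrn   = N - 2*Ntst;
Ntrneq = Ntrn*O;                       % number of training equations

% Unknown weights of an I-H-O net: Nw = (I+1)*H + (H+1)*O.
% Ntrneq >= Nw is equivalent to:
Hmax = floor((Ntrneq - O) / (I + O + 1));   % upper bound on H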
I assume that you don't have anywhere near that many hidden nodes.
How many do you have?
How much training time?
What was the stopping criterion tr.stop?
Try repeating with other randomizations of the data.
However, with a data set that large, I would try
Nnew = Ntst = Nval = Ntrn
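i.e., something along these lines (a sketch using divideind; the index names are mine):

% Sketch: four equal subsets; 'inew' is held out of training entirely
net = fitnet(H);                % H = number of hidden nodes
[~, N] = size(input);
idx  = randperm(N);
n4   = floor(N/4);
inew = idx(1:n4);               % extra hold-out, never seen by train()
itrn = idx(n4+1:2*n4);
ival = idx(2*n4+1:3*n4);
itst = idx(3*n4+1:4*n4);

net.divideFcn            = 'divideind';
net.divideParam.trainInd = itrn;
net.divideParam.valInd   = ival;
net.divideParam.testInd  = itst;
[net, tr] = train(net, input, target);   % inew samples are simply unused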
Hope this helps.
Greg

4 Comments

Thank you for your answer!
Right, I only have about 5-25 hidden neurons. The training time varies between 2 and 30 minutes, depending on the stopping criterion, which is almost always 'Validation stop'.
I don't get the point of your code. Could you please explain it in words?
If you have more unknown weights than training equations, the net usually does not perform well on non-training data. You do not have this problem. If you did, reducing the number of hidden nodes, using a validation subset, and/or using regularization (e.g., msereg or trainbr) are available techniques to mitigate the problem.
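For example (a sketch; msereg is the older regularized performance function, replaced in newer toolbox versions by the regularization parameter of mse):

% Sketch: two regularization options in the Neural Network Toolbox
net = fitnet(H);                          % H = number of hidden nodes

% Option 1: Bayesian regularization; the penalty is adapted automatically
net.trainFcn = 'trainbr';

% Option 2: explicit penalty on squared weights added to the mse
% net.performParam.regularization = 0.1;  % 0 = plain mse, 1 = weights only

[net, tr] = train(net, input, target);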
Search the general NN literature using the search word "OVERFITTING".
Yes, OK, I understand.
But I still do not get why the integrated tool test performs way better than the extra test afterwards...
I think you should be more concerned that the original data has
R ~ 0.996 with MSE ~ 45
whereas the new data has
R ~ 0.956 with MSE ~ 2705
It doesn't make sense.
What is the mean variance of both targets?
Remember
R ~ sqrt( 1 - MSE / mean(var(target',1)) )
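In code, the check is (a sketch; input and target as before):

% Sanity check relating R and MSE
y     = net(input);              % network outputs
MSE   = perform(net, target, y); % mean squared error
MSE00 = mean(var(target', 1));   % average target variance (reference MSE)
NMSE  = MSE / MSE00;             % normalized MSE
Rchk  = sqrt(1 - NMSE);          % should be close to the regression R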
Hope this helps.
Greg

This question is closed.

Asked: 15 Jul 2015. Closed: 20 Aug 2021.
