Struggling to Improve Neural Network Performance
George Tsitsopoulos
on 5 Jul 2016
Answered: Greg Heath
on 14 Jul 2016
I am working on creating a function-fitting neural network with the Neural Network Toolbox, but I haven't had much success getting it to work correctly. I have an input matrix with two features. I currently use fitnet (I've tried cascadeforwardnet/feedforwardnet without much difference) and have two hidden layers, each with 10 neurons. I've been using `trainbr` because it has given me better results than trainlm. I'm trying to normalize or standardize the data but haven't had much success. I know that fitnet uses mapminmax by default, and I've seen Greg Heath's suggestion to use zscore to standardize first. The problem is, every time I've used zscore standardization I haven't gotten very good neural network results. My output needs to be completely positive after de-standardization, yet I still get negative values. Because of this, I have used log10 to normalize the data, thereby keeping all of the values positive.
In order to see prediction error, I have found the maximum percent error at any individual output point. I cannot get error lower than 40%, and there are multiple other points with decently high error.
Is there anything else that I can do, whether it be normalization/standardization or network reconfiguration to improve my network performance?
EDIT:
I'm not sure if this is of any help, but the regression plot shows R = 0.99984, so it seems very accurate.
Thank you for the help,
George
Accepted Answer
Greg Heath
on 14 Jul 2016
George Tsitsopoulos wrote: Hi Greg,
> Why is it that MSE is the best measure of error and the way I was calculating error is no good? Is it because the toolbox focuses on minimizing that error, so me trying to minimize a different type of error is of little help?
That's part of it. The other part is: does percent error really make sense in a regression problem?
> I made the changes you suggested and ran 10 trials, where each trial has some multiple of 3 hidden neurons and the multiple is between 3 and 30. Each of these trials builds & simulates 10 ANNs, each with a random split of training/testing/validation data. I made it so that the training data must be between 60% and 93% of the total input data. I calculate the R-squared and output it to the below 10x10 matrix. Now that I have this, I see that many of the values in the matrices are above 0.99. How am I supposed to differentiate between these values?
Ideally, N is sufficiently large so that the tst results are accurate and UNBIASED while the val results are relatively accurate and only SLIGHTLY BIASED.
Typically, the training goal I use is to minimize H subject to the constraint Rtrnsq > 0.99. I first obtain four 10x10 Rsq matrices for trn, val, tst and all. Next, these are reduced to four 4x10 matrices containing the min, median, mean and maximum Rsq vs H. Finally, the four rows of the four matrices are plotted.
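Assuming `Rsq` is the 10x10 trials-by-H matrix of R-squared values and `hvec` the candidate H values (both names are illustrative, not from the Toolbox), the reduction and plot might look like:

```matlab
% Reduce a 10x10 Rsq matrix (rows = trials, columns = H values) to the
% 4x10 summary of min, median, mean and max Rsq vs H, then plot the rows.
summary = [ min(Rsq); median(Rsq); mean(Rsq); max(Rsq) ];  % 4 x 10
plot(hvec, summary', 'o-')            % one curve per summary statistic
legend('min', 'median', 'mean', 'max')
xlabel('H'), ylabel('R^2')
```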
As far as choosing one design, I would favor a net with Rsq > 0.99 at the smallest value of H.
>> Also, is there a way to bound my output so it is always positive? A negative output is impossible in the real world yet the neural net has several points that are output as negative.
>Using a bounded output transfer function will keep the output within bounds. Either TANSIG or LOGSIG will work. The scaling to your data will be done automatically.
>> Whether I use logsig or tansig as the hidden-layer transfer function doesn't make a difference in the output. I always end up with some negative values in the NN output, which is impossible for what I'm trying to do. The only thing that has ever guaranteed a positive output was using log10(target) before training/simulating and then 10^target afterwards.
You have misinterpreted what I said. Change the OUTPUT TRANSFER FUNCTION!
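As a minimal sketch (assuming `x` and `t` are the input and target matrices), changing the output transfer function of a fitnet could look like this:

```matlab
% fitnet's OUTPUT layer uses purelin by default, which is unbounded.
% Switching it to logsig bounds the (normalized) output in (0,1); the
% default mapminmax output processing then maps it back to the target range.
net = fitnet(10);                        % one hidden layer, H = 10
net.layers{2}.transferFcn = 'logsig';    % layer 2 is the OUTPUT layer here
[net, tr] = train(net, x, t);
y = net(x);                              % outputs stay within target bounds
```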
Hope this helps.
Greg
More Answers (1)
Greg Heath
on 7 Jul 2016
> I have an input matrix with two features.
[ I N ] = size(input)  % [ 2 N ], N = ?
[ O N ] = size(target) % [ O N ], O = ?
> I currently use fitnet ... and have two hidden layers, each with 10 neurons.
A single hidden layer is sufficient.
With no regularization, the number of "H"idden neurons, H, is limited by the number of training equations, Neq = N*O.
> I've been using `trainbr` because it has given me better results than trainlm.
TRAINBR uses regularization, which mitigates the risk of using a large H.
> I'm trying to normalize or standardize the data but haven't had much success. I know that fitnet uses mapminmax by default and I've seen Greg Heath's suggestion that I use zscore to standardize first.
I use zscore in order to detect outliers which may have to be modified or deleted.
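For example (a sketch, assuming `x` is the I-by-N input and `t` the O-by-N target), columns more than about 3 standard deviations out are worth inspecting:

```matlab
% Standardize each variable (row) to zero mean, unit variance,
% then flag columns containing any |z| > 3 as candidate outliers.
zx = zscore(x', 1)';                     % zscore works down columns
zt = zscore(t', 1)';
candidates = find(any(abs([zx; zt]) > 3, 1));
% Inspect, modify or delete these columns before training.
```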
> The problem is, every time I've used the zscore standardization I haven't gotten very good neural network results.
Perhaps you misused it. I cannot tell you how without details.
> My output needs to be completely positive after de-standardization yet I still get negative values. Because of this, I have used log10 to normalize the data, therefore keeping all of the values positive.
I don't think log10 is sufficient for positivity. The best way to impose output bounds is to use a bounded output transfer function like LOGSIG or TANSIG.
> In order to see prediction error, I have found the maximum percent error at any individual output point. I cannot get error lower than 40%, and there are multiple other points with decently high error.
If you are using fitnet, the default performance measure is MSE = mse(error). The corresponding scale-free measures that are trivial to understand are
NMSE = MSE/mean(var(target',1))
and
Rsquare = 1 - NMSE
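Given a trained `net` and the original `x`/`target` matrices (names assumed for illustration), these measures can be computed as:

```matlab
% Scale-free regression performance for a trained fitnet.
output = net(x);
err    = target - output;               % avoid shadowing builtin error()
MSE    = mse(err);
NMSE   = MSE / mean(var(target', 1));   % normalize by naive constant-model MSE
Rsq    = 1 - NMSE;                      % ~1 means nearly all variance modeled
```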
> Is there anything else that I can do, whether it be normalization/standardization or network reconfiguration to improve my network performance?
Using NMSE and Rsq is more reliable for measuring regression performance. I see no good reason for the log transformation.
> EDIT: I'm not sure if this is of any help but the regression plot shows that R = 0.99984, so it seems very accurate.
Given your log transformation, I'm not sure just what that means. However, my guess is that it is good. Make sure by plotting unnormalized target and output on the same graph.
Hope this helps.
Thank you for formally accepting my answer
Greg
5 Comments
Greg Heath
on 12 Jul 2016
> Unfortunately, there is still error around 2000% at certain points.
For regression, the only type of measure that makes sense is one that is linear w.r.t. MSE, the measure that you are trying to minimize directly. The most commonly used are the normalized mse
NMSE = mse(error)/mean(var(target',1))
and the corresponding R squared (see Wikipedia)
Rsq = 1 - NMSE
The denominator of NMSE is the smallest mse that could occur from the naive model output = constant. The minimizing solution is output = mean(target')'.
> You said that the number of hidden neurons is limited by the number of hidden equations. In your equation above, N=1767 and O=1 so I can have a maximum of 1767 neurons, correct?
No.
Using the default data division ratios of 0.7/0.15/0.15 yields the number of training examples
Ntrn = N - 2*round(0.15*N) % 1237
and the corresponding number of TRAINING equations
Ntrneq = Ntrn*O % 1237
To avoid the phenomenon of overfitting when using the default training function TRAINLM, keep the number of unknown weights
Nw = (I +1)*H+(H+1)*O
no greater than Ntrneq. However, for the purpose of numerical stability w.r.t. noise and measurement error, it should be considerably less.
Accordingly, Nw << Ntrneq yields the upper bound and reasonable maximum
H <= Hmax << Hub = (Ntrneq-O)/(I+O+1) = 309
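The arithmetic above, for this dataset, can be sketched as:

```matlab
% Sizing arithmetic for this dataset (N = 1767, I = 2, O = 1).
N = 1767; I = 2; O = 1;
Ntrn   = N - 2*round(0.15*N)        % 1237 training examples
Ntrneq = Ntrn*O                     % 1237 training equations
Nw     = @(H) (I+1)*H + (H+1)*O;    % unknown weights for H hidden neurons
Hub    = (Ntrneq - O)/(I + O + 1)   % 309: largest H with Nw <= Ntrneq
```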
Therefore I would probably first consider the 10 values
h = Hmin:dH:Hmax = 3:3:30
with
Ntrials = 1:10
random sets of initial weights and datadivisions each. Then display the 10x10 array of Rsq as I have done in many posted examples.
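A sketch of that double-loop search (variable names are illustrative; `x` and `t` are assumed to be the input and target matrices):

```matlab
% Double loop over candidate H values and random trials; each trial gets
% fresh random initial weights and a fresh random data division.
hvec = 3:3:30;  Ntrials = 10;
Rsq = zeros(Ntrials, numel(hvec));
for j = 1:numel(hvec)
    for i = 1:Ntrials
        net = fitnet(hvec(j));             % new random initial weights
        [net, tr] = train(net, x, t);      % dividerand resplits the data
        y = net(x);
        Rsq(i, j) = 1 - mse(t - y)/mean(var(t', 1));
    end
end
disp(Rsq)                                  % 10x10 array of Rsq vs trial, H
```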
> You also said that bayesian regularization limits the number of hidden neurons further.
That is not what I meant. If you have to increase Hmax too much, e.g. Hmax >~ Hub/2 ~ 155, then you should consider trainbr, whose output is much less sensitive to overfitting.
> Could it be that 15 hidden neurons is too few?
That will be revealed as a product of the double loop search over h and Ntrials.
> Also, is there a way to bound my output so it is always positive? A negative output is impossible in the real world yet the neural net has several points that are output as negative.
Using a bounded output transfer function will keep the output within bounds. Either TANSIG or LOGSIG will work. The scaling to your data will be done automatically.
Hope this helps.
Greg