MATLAB Answers

Difference between self-calculated RMSE and RMSE calculated with Statistics Toolbox

Rica on 5 Aug 2016
Commented: Sairam Seshapalli on 12 Nov 2019
Hi all, I calculated the RMSE of these data:
Y_hat=[
9.774614325191857
9.453084986417043
9.502166049524247
7.817755496590051
7.031233831915310
8.392026077578970
6.881255539731626
6.488927374899896
6.779374282307657
6.474790314047517
13.842988631876649
13.113764172190285
14.244292841981128
12.470726075747763]
Y=[
8.900000000000000
8.600000000000000
9.167000000000000
7.000000000000000
7.030000000000000
7.270000000000000
7.430000000000000
7.270000000000000
7.370000000000000
7.030000000000000
15.029999999999999
13.170000000000000
13.369999999999999
13.630000000000001]
My calculation is based on the formula RMSE = sqrt(mean((Y_hat-Y).^2)). With this calculation I got RMSE = 0.7894.
But with the Statistics Toolbox of MATLAB I got RMSE = 0.885, which is the square root of my calculated value! Who is wrong: me or the toolbox?
Thank you!

  3 Comments

Brendan Hamm on 5 Aug 2016
How exactly did you use the Statistics Toolbox? Presumably these data values were not entered by hand, because when I calculate the RMSE I get:
>> sqrt(mean((Y-Y_hat).^2))
ans =
0.7847
the cyclist on 5 Aug 2016
Just adding a bit more to Brendan's comment ...
Your Y_hat and RMSE are presumably the output of a MATLAB fitting function. Which one, and what specific output do you associate with RMSE? Ideally, you could post all of your data and code, and we could replicate completely.
Rica on 8 Aug 2016
Hi, thanks for the comments. I ran a multiple regression with the predictors X1 and X2. The function fitlm calculates the regression coefficients, R^2 and RMSE.
X1=1.0e+02 *[
4.794100000000000
4.830800000000000
5.043100000000000
4.059800000000000
3.179700000000000
4.608300000000000
3.795500000000000
3.299600000000000
3.431000000000000
3.635300000000000
8.896799999999999
8.344199999999999
8.839100000000000
5.600200000000000
]
X2=[33.979999999999997
32.450000000000003
32.310000000000002
26.309999999999999
24.230000000000000
27.989999999999998
22.489999999999998
21.550000000000001
22.649999999999999
20.910000000000000
45.509999999999998
43.130000000000003
47.439999999999998
44.899999999999999]
The result is Y_hat = 0.5262 + 0.003757*X1 + 0.21916*X2. The code is:
X_f=[ones(size(X1)) X1 X2];
X_f_lm=X_f(:,2:end);
mdl=fitlm(X_f_lm,Y,'linear');
I got this:


Accepted Answer

the cyclist on 8 Aug 2016
You calculated the RMSE incorrectly -- and then had a remarkable numerical coincidence.
You calculated
RMSE = sqrt(mean(((Y-Y_hat).^2)))
which is equivalent to
RMSE = sqrt(sum(((Y-Y_hat).^2)/N_obs))
where N_obs is the number of observations. (N_obs = 14 in your case.) You got the value RMSE = 0.7847.
But the correct calculation of RMSE divides by the number of degrees of freedom, not the number of observations. The correct RMSE calculation is
RMSE = sqrt(sum(((Y-Y_hat).^2)/(N_obs-rankX)))
where rankX = 3 in your case.
So,
RMSE = sqrt(sum(((Y-Y_hat).^2)/11))
and is equal to 0.8853 (as MATLAB got).
The numerical coincidence, and complete red herring, is that this is very nearly equal to the square root of your incorrect value.
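The arithmetic above can be checked in base MATLAB with a short sketch. The data are the Y and Y_hat vectors from the question; the variable names sse, rmse_n and rmse_dof are chosen here for illustration, and rankX = 3 assumes an intercept plus the two predictors X1 and X2.

```matlab
% Fitted and observed values copied from the question.
Y_hat = [ 9.774614325191857;  9.453084986417043;  9.502166049524247; ...
          7.817755496590051;  7.031233831915310;  8.392026077578970; ...
          6.881255539731626;  6.488927374899896;  6.779374282307657; ...
          6.474790314047517; 13.842988631876649; 13.113764172190285; ...
         14.244292841981128; 12.470726075747763];
Y     = [ 8.9;  8.6;  9.167;  7.0;  7.03;  7.27;  7.43; ...
          7.27; 7.37; 7.03;  15.03; 13.17; 13.37; 13.63];

N_obs = numel(Y);   % number of observations (14)
rankX = 3;          % assumed: intercept + X1 + X2

sse = sum((Y - Y_hat).^2);                 % sum of squared errors

rmse_n   = sqrt(sse / N_obs);              % divides by number of observations
rmse_dof = sqrt(sse / (N_obs - rankX));    % divides by residual degrees of freedom

fprintf('RMSE, divide by N_obs:          %.4f\n', rmse_n);
fprintf('RMSE, divide by N_obs - rankX:  %.4f\n', rmse_dof);
fprintf('sqrt of the first value:        %.4f\n', sqrt(rmse_n));
```

The first two values should come out near 0.7847 and 0.8853, matching the numbers quoted in this thread. The third line prints the square root of the first value, which lands close to the second one only by coincidence.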
You can see where (the latest version of) MATLAB does the calculation of MSE around lines 1436-1440 of the file LinearModel.
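The same comparison can be made directly against a fitted model object. This is a minimal sketch, assuming the Statistics and Machine Learning Toolbox is installed and using the X1, X2 and Y values posted in the comments (trailing display digits dropped). The names res, rmse_n and rmse_dof are illustrative, not part of the toolbox.

```matlab
% Predictors and response as posted in the thread.
X1 = 1.0e+02 * [4.7941; 4.8308; 5.0431; 4.0598; 3.1797; 4.6083; 3.7955; ...
                3.2996; 3.4310; 3.6353; 8.8968; 8.3442; 8.8391; 5.6002];
X2 = [33.98; 32.45; 32.31; 26.31; 24.23; 27.99; 22.49; ...
      21.55; 22.65; 20.91; 45.51; 43.13; 47.44; 44.90];
Y  = [ 8.9;  8.6;  9.167;  7.0;  7.03;  7.27;  7.43; ...
       7.27; 7.37; 7.03;  15.03; 13.17; 13.37; 13.63];

% fitlm adds the intercept itself, so only X1 and X2 are passed in.
mdl = fitlm([X1 X2], Y, 'linear');

res = mdl.Residuals.Raw;                  % observed minus fitted values

rmse_n   = sqrt(mean(res.^2));            % divides by NumObservations
rmse_dof = sqrt(sum(res.^2) / mdl.DFE);   % divides by error degrees of freedom

fprintf('Observations: %d, error DOF: %d\n', mdl.NumObservations, mdl.DFE);
fprintf('RMSE, divide by N:    %.4f\n', rmse_n);
fprintf('RMSE, divide by DFE:  %.4f\n', rmse_dof);
fprintf('mdl.RMSE:             %.4f\n', mdl.RMSE);
```

With 14 observations and three estimated coefficients, mdl.DFE is 11, and mdl.RMSE should agree with the degrees-of-freedom version rather than with sqrt(mean(res.^2)).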

  3 Comments

the cyclist on 9 Aug 2016
I looked at a couple of articles, including this one and this one, and they both reference two different ways of calculating the MSE (specifically mentioning regression).
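For reference, the two conventions discussed in this thread can be written side by side. Here n is the number of observations and p is the number of estimated coefficients including the intercept (n = 14 and p = 3 for the data in this thread). The symbols and subscripts are chosen here for illustration and are not taken from the linked articles.

```latex
\mathrm{RMSE}_{n} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^2},
\qquad
\mathrm{RMSE}_{n-p} = \sqrt{\frac{1}{n-p}\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^2}
```

The two differ only in the denominator, so they are related by a factor of sqrt(n/(n-p)), which is sqrt(14/11) for this data set.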
