Fitting data using polyfit, nlinfit?

I have two sets of data of the same size, (X1, Y1) and (X2, Y2). (X1, Y1) has a linear relationship with zero intercept (Y1 = slope*X1). How can I estimate the slope and R^2 value, and plot the fit along with 95% confidence interval lines? Using corrcoef, polyfit and polyval gives me a relationship of the form Y1 = slope*X1 + c, but I would like to force c = 0.
(X2, Y2) has a nonlinear relationship of the form Y2 = A - B*exp(C*X2). Likewise, I would like to determine A, B and C along with the R^2 value, and plot the fit with 95% confidence interval lines. I know I have to use nlinfit and nlparci, but I am new to this and would need your help.
Thanks much.

 Accepted Answer

Star Strider on 1 Oct 2012
Edited: Star Strider on 1 Oct 2012
You seem to be acquainted with nlinfit and know how to use it, but apparently not with anonymous functions.
I suggest you use nlinfit for both problems. With B = [B(1) ... B(n)]' a column vector of parameters, your Y1 function becomes:
Y1 = @(B,x) B.*x;
since with only one parameter in it, you do not need to use subscripts.
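As an aside, for the zero-intercept linear fit specifically, the slope also has a closed-form least-squares solution via the backslash operator, so nlinfit is optional there. A minimal sketch, assuming X1 and Y1 are vectors holding your data:

```matlab
% Least-squares slope for Y1 = slope*X1 with the intercept forced to zero.
% X1(:) and Y1(:) reshape the data into column vectors; mldivide (\) then
% solves the overdetermined system X1*slope = Y1 in the least-squares sense.
slope = X1(:) \ Y1(:);

% R^2 for the zero-intercept model, computed from the residuals.
Y1_fit = slope .* X1(:);
R2 = 1 - sum((Y1(:) - Y1_fit).^2) / sum((Y1(:) - mean(Y1(:))).^2);
```

This gives the same slope that nlinfit would return for the one-parameter model, without iteration.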
Your Y2 function becomes:
Y2 = @(B,x) B(1)-B(2).*exp(B(3).*x);
These should work as your modelfun functions in nlinfit.

15 Comments

Can you comment on how to define and use the modelfun in nlinfit and then estimate the values of the parameters B, from known values of X,Y.
Thanks a bunch!
According to the nlinfit documentation, in order to get everything you will need for nlparci and nlpredci if you want to use them later, call nlinfit this way:
For Y1:
Y1 = @(B,x) B.*x;
Y1_beta0 = rand;
[Y1_beta,Y1_R,Y1_J,Y1_CovB,Y1_MSE] = nlinfit(Y1_X, Y1_Y, Y1, Y1_beta0)
For Y2:
Y2 = @(B,x) B(1)-B(2).*exp(B(3).*x);
Y2_beta0 = rand(3,1);
[Y2_beta,Y2_R,Y2_J,Y2_CovB,Y2_MSE] = nlinfit(Y2_X, Y2_Y, Y2, Y2_beta0)
The values returned as Y1_beta and Y2_beta are the estimated parameters for Y1 and Y2 respectively. To then get the Y values the fitted parameters produce with your functions, calculate them this way:
Y1_est = Y1(Y1_beta,Y1_X)
and
Y2_est = Y2(Y2_beta,Y2_X)
You can then plot them as:
plot(Y1_X, Y1_est)
and
plot(Y2_X, Y2_est)
respectively.
There are links to nlparci and nlpredci at the end of the nlinfit documentation.
Appreciate much. When I ran your code, it gave two warnings:
Warning: The Jacobian at the solution is ill-conditioned, and some model parameters may not be estimated well (they are not identifiable). Use caution in making predictions.
Warning: Matrix is close to singular or badly scaled. Results may be inaccurate. RCOND = 9.416437e-19.
However, I normalized the data and it ran just fine. I have yet to find a good way to plot it, as the fit looks jagged: http://www.4shared.com/photo/MbZBImhO/ExponentialIncreasingFit_DELET.html
I am not sure what you mean by ‘normalized’, since scaling your data can change your parameter estimates and their related statistics.
I assumed your Y2_X data were monotonically increasing. In your situation, I suggest:
Y2_X0 = linspace(min(Y2_X),max(Y2_X));
Y2_est = Y2(Y2_beta,Y2_X0)
plot(Y2_X0, Y2_est)
This produces 100 data points for Y2_X0 from min(Y2_X) to max(Y2_X), including the limits.
In my data, Y2_X was not sorted earlier, so I sorted it now:
[NewY2_X, ind] = sort(Y2_X);
NewY2_Y = Y2_Y(ind);
[Y2_beta,Y2_R,Y2_J,Y2_CovB,Y2_MSE] = nlinfit(NewY2_X, NewY2_Y, Y2, Y2_beta0);
The warnings still come up:
Warning: Iteration limit exceeded. Returning results from final iteration.
Warning: The Jacobian at the solution is ill-conditioned, and some model parameters may not be estimated well (they are not identifiable). Use caution in making predictions.
You can increase the iteration limit by creating an options structure with statset and setting 'MaxIter',5000 (or whatever maximum number of iterations you want). There is a link to statset in the options section of the nlinfit documentation.
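A minimal sketch of that options structure, reusing the variable names from the earlier sorted-data call:

```matlab
% Raise the nonlinear-fit iteration limit to 5000.
opts = statset('MaxIter', 5000);

% Pass the options structure as the fifth argument to nlinfit.
[Y2_beta, Y2_R, Y2_J, Y2_CovB, Y2_MSE] = ...
    nlinfit(NewY2_X, NewY2_Y, Y2, Y2_beta0, opts);
```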
I do not know what your data are, so unfortunately I cannot help you with the problem of your Jacobian being ill-conditioned. That usually occurs if the X and Y values vary by significant orders of magnitude. (In your Y2 function, B(1) and B(2) depend only on Y, and B(3) depends only on X, so you might be able to scale to get a good fit. You will have to take into account the effects of scaling on the parameter values and their related statistics.) If you want to use an analytic Jacobian, see those options in statset and use this function:
Y2_J = @(B,x) [ones(size(x)), -exp(B(3).*x), -B(2).*x.*exp(B(3).*x)];
This avoids the finite-difference approximation. At the very least it will speed up your calculations.
Yes, X & Y values vary by 3 orders of magnitude. Did you mean Y2_J = @(B,x) B(1)-B(2).*exp(B(3).*x) ? I tried this with 'MaxIter' increased to 500000, but that did not help either. When I tried this Y2_J, this error message came up: The function you provided as the MODELFUN input has returned Inf or NaN values. I can ship the data, if there is a way. Thanks much
Forget the Jacobian. Apparently in R2012b supplying an analytic Jacobian to nlinfit is no longer possible. (It was possible through R2012a.) Instead, add 'RobustWgtFun','bisquare' to your statset statement. That may help.
A magnitude difference of only 1E+3 should not be enough to cause a singular Jacobian in most situations. It might help if you gave your function different starting guesses, perhaps something like:
Y2_beta0 = rand(3,1)*100;
Your function could be stuck in a local minimum, not an uncommon problem in nonlinear regression. Experiment with various initial guesses until you get a good function fit.
I have MATLAB R2012a. When I tried 'RobustWgtFun','bisquare' in the statset call, similar warnings came up. I don't quite get why the Jacobian is 0 when the data vary over a large range, and I can see clearly in Excel that there is an exponentially increasing relationship. Thanks a bunch!
Warning: Some columns of the Jacobian are effectively zero at the solution, indicating that the model is insensitive to some of its parameters. That may be because those parameters are not present in the model, or otherwise do not affect the predicted values. It may also be due to numerical underflow in the model function, which can sometimes be avoided by choosing better initial parameter values, or by rescaling or recentering. Parameter estimates may be unreliable.
Warning: Matrix is close to singular or badly scaled. Results may be inaccurate. RCOND = 1.115790e-157.
Since the Jacobian is the matrix of derivatives of the function with respect to each parameter, it means that the function is in a very flat region of the response surface.
How large is your data set? It might help if you posted it or a representative sample of it (perhaps 20 X,Y pairs) here.
I have copied the 20 values of (X2,Y2)... they are not sorted.
X2=[ 2.996 6.521 3.050 1.453 0.495 2.150 0.100 0.100 0.312 7.960 2.600 14.150 6.058 13.690 19.106 8.340 19.980 21.700 10.060 14.720 ];
Y2=[756.00 1050.00 834.00 368.00 204.00 384.00 38.95 174.00 132.00 904.00 806.70 1191.90 1228.00 994.00 1326.00 1154.00 1428.80 1264.50 976.80 1198.00 ];
I managed to get a reasonably good fit with this code (renaming the function to F2 to avoid confusing it with Y2):
F2 = @(B,x) B(1)-B(2).*exp(B(3).*x);
OptsF2 = statset('MaxIter', 5000);
beta0 = rand(3,1) .* [500; 100; -1];
[Beta,R,J,CovB,MSE] = nlinfit(X2, Y2, F2, beta0, OptsF2);
X2fit = linspace( min(X2), max(X2) );
Beta
BetaCI95 = nlparci(Beta, R, 'covar', CovB)
[F2_est FitCI95] = nlpredci(F2, X2fit, Beta, R, 'Covar',CovB);
figure(1)
plot(X2, Y2, '*b')
hold on
plot(X2fit, F2_est, '-r', 'LineWidth', 1.5)
plot(X2fit, F2_est-FitCI95+0.01, ':r', 'LineWidth', 1.5)
plot(X2fit, F2_est+FitCI95, ':r', 'LineWidth', 1.5)
hold off
legend('Data', sprintf('\\itf\\rm(\\itx\\rm) = %4.0f - %4.0f \\bf·\\rm \\ite\\rm^{%6.3f \\bf·\\rm \\itx}',Beta), '95% Confidence Intervals', 4)
xlabel('\it\bfx\rm \rightarrow')
ylabel('\bf\itf\rm\bf(\itx\rm\bf) \rightarrow')
text(1, (Beta(1)-Beta(2)), sprintf('\\leftarrow %2.0f', Beta(1)-Beta(2)))
grid
and got these results:
Beta =
1.2316e+003
1.1792e+003
-279.4183e-003
BetaCI95 =
1.1097e+003 1.3535e+003
999.1111e+000 1.3592e+003
-396.3701e-003 -162.4665e-003
So the Beta estimates are good, in the sense that the 95% confidence intervals on all of them do not include zero. Note that all I really did to get the fit was to look at the data and provide reasonable initial guesses for beta0. The beta0 values for this fit were [19; 87; -0.6] so they were not actually close to the fitted Beta estimates. They were close enough to allow nlinfit to quickly estimate an accurate solution.
Thank you for providing the code. It ran smoothly. The only change I made is in the plot commands: F2_est is a row vector, while FitCI95 is a column vector, so I transposed F2_est:
plot(X2fit, F2_est'-FitCI95+0.01, ':r', 'LineWidth', 1.5)
plot(X2fit, F2_est'+FitCI95, ':r', 'LineWidth', 1.5)
Thanks a ton for your help.
Is there a way to find the R^2 value for the fit?
Star Strider on 5 Oct 2012
Edited: Star Strider on 5 Oct 2012
R^2: Yes, if you use the NonLinearModel class, particularly NonLinearModel.fit. That will also give you a number of other statistics. I use it occasionally when I need the extra statistics it gives. There is a link to NonLinearModel at the end of the nlinfit documentation.
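If you prefer to stay with nlinfit, R^2 can also be computed by hand from the residual vector it returns. A sketch, assuming X2, Y2 and the R output from the fit above are in the workspace:

```matlab
% Coefficient of determination from the nlinfit residual vector R:
% R^2 = 1 - SSE/SST, where SSE is the residual sum of squares and
% SST is the total sum of squares about the mean of the observations.
SSE = sum(R.^2);
SST = sum((Y2 - mean(Y2)).^2);
Rsq = 1 - SSE/SST;
```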
It is my pleasure to help!


More Answers (0)
