MATLAB Answers

Curve fitting and convergence to estimate two coefficients

115 views (last 30 days)
Anand Rathnam
Anand Rathnam on 20 Sep 2021
Commented: William Rose on 13 Oct 2021 at 2:26
Hello
Question Objective:
I am looking to employ curve fiting and convergence to estimate two coefficients in my model equation. I have estimated one coefficient by lscurvefit, fminsearch, but wondering if there is any solver in Matlab to estimate two coefficients in the approach ( details below) that I prefer.
Model Equation: ( See the below code for reference)
  • Model equation is constructed using the terms L1, L2, L3, L4, L5, S and At_Af
  • Final form of the model:
At_Af(t+1) = Af*(1 - L1*S((t)+1)
Coefficients to be estimated:
  • Two coefficients - d and Af
Apporach that I wanted to employ:
  • Consider the base line data( curve) array for fitting
  • Evaluate the best fit by stepping 'n' in the equation from 0 to #( how many ever iterations) to obtain the best fit while trying to determine the two coefficients d and Af
  • Can consider d= 1*10^-10 and Af = 0.100826 as the initial guess for the regression.
Data set - attachment:
The initial data set ( baseline curve data) = At_Af (LHS of the model equation) is attached to this query in a spreadsheet
The code of the model equation is below:
L=0.00075; % height of the tissue
% The diffusion equation has been broken down for ease of tranformation
% into code
gama = 0.000167;
L2 = zeros(14,1);
L3 = zeros(100,1);
L4 = zeros(100,1);
L5 = zeros(100,1);
S= zeros(73,1);
At_Af = zeros(73,1);
t1 = 0:1:3000
d= 1*10^-10;
L1 = ((8*gama)/((pi*(1-exp(-2*gama*L)))));
format longE
t_min = t1./60
for t = t1(:).'
for n=0:1:50
L2(n+1) = exp((((2*n + 1)^2)*-d*pi*pi*t)/(4*L*L));
L3(n+1) = (((-1)^n)*2*gama)+(((2*n+1)*pi)*exp(-2*gama*L))/(2*L);
L4(n+1)= ((2*n)+1)*((4*gama*gama)+((((2*n)+1)*pi)/(2*L))^2);
L5(n+1) = ((L2(n+1)*L3(n+1))/L4(n+1));
end
S((t) +1) = sum(L5);
At_Af(t+1) = Af*(1 - L1*S((t)+1));
end
Any help would be greatly appreciated. Thanks in advance!

Accepted Answer

William Rose
William Rose on 22 Sep 2021
I was bothered by the poor performance of fmincon() on this problem. Even when we tried starting at multiple locations, it did not progress reliably o the true minimum. I figured out why: the values of the parameter vector, x=[d,Af], differ by 10 orders of magnitude (d~=1e-10, Af~=1). This large difference in scale of the parameters is a problem for fmincon(). fmincon() works much better if we scale the elements of the vector x to have the same order of magnitude. Threfore I adapted the code so that we specify d*1e10 in the main program. d*1e10 has magnitude~=1, like Af. This is the value passed to fmincon(). We divide this number by 1e10 inside myModel(), to get it back to its "correct" value. With this adjustment, we don't need the "PolyStart" version of the program. You can use fitData.m, which makes only one initial guess. You will see that for a wide rage of initial guesses, it fiinds the same best fit result - which is good. And the best fit it find is a lot better than the "best fit" we found before: the rms error is 0.00000.
I changed the allowable range for Af to [0.5,1.5], compared to the range [0,1] which you specified. I did this because myModelTest.m shows that values of Af in the range [0.5 to 1.5] are reasonable.
The modified versions of fitData.m, fitDataPolyStart.m, and myModel.m, are attached.
Console output and graphical output below, when the initial guess is d0*1e10=.5, Af0=.5:
>> fitData
Best fit at d=1.000e-10, Af=1.0000. R.m.s.error=0.00000.
fitDataPolyStart produces identical reults, after trying 30 different starting guesses.
  14 Comments
William Rose
William Rose on 13 Oct 2021 at 2:26
You're welcome, Anand. You said "Very interesting to see how excel out performs matlab solver. Did you attempt fitting it manually in excel."
I would not say excel outperforms Matlab solver, because they are very different, and they may be compared with various metrics giving different results.
I did try doing the fit manually before remembering thatExcel has an optimization capability.

Sign in to comment.

More Answers (5)

William Rose
William Rose on 20 Sep 2021
fmincon() is a multidimensional minimization routine. By multidimensional I mean it adjusts multiple unknown parameters until a specified function (such as sum of squared errors) is minimized. fmincon() is complicated, but the help page for it is good, and has a bunch of examples. You will want to write a function that implements your diffusion model. You will pass to it the vector x=[d,Af]. It will return the vector AtAf.
function AtAf = myModel(x);
%MYMODEL Function to simulate diffusion
%Input: x=[d,Af];
%Output: AtAf=vector with same length as the experimental data vector
N=99; %length of experimental data vector; replace 99 with correct value
d=x(1);
Af=x(2);
AtAf=d*ones(N,1)+Af; %replace this line with code for the diffusion model
end
Save the code above as file myModel.m.
Write a function to compute the error between the model and the experimental data:
function sse=sumsqerr(x)
%SUMSQERR Function to compute sum squared error between model and experiment
%Input: x=[d,Af];
%Output: sse=scalar
AtAfExp=load('MyExptData.txt'); %read experimental data from text file
AtAfSim=myModel(x);
sse=sum((AtAfExp-AtAfSim).^2); %sum squared error
end
Save code above as file sumsqerr.m.
Write the main program. It calls fmincon(). You pass to fmincon() the initial guess x0=[d0;Af0], and the name of the function to be minimized (sumsqerr), and the allowed bounds for d and Af.
%fitData.m
x0=[1E-10,0.100826]; %initial guess
lb=[0,0]; %lower bound for [x,Af], replace with appropriate values
ub=[1,1]; %upper bound for [x,Af]; replace with appropriate values
x=fmincon(sumsqerr,x0,[],[],[],[],lb,ub);
disp(x) %display value of x=[d,Af] that minimizes the error
Save code above as fitData.m. Run fitData.m.
That gives you the general idea. Obviousy you need to make adjustments in the code above.
  2 Comments

Sign in to comment.


William Rose
William Rose on 21 Sep 2021
Your code was a very good start. There were two fatal errors. ONe is that when you call fmincon(), you must pass a handle to the function, so you write fmincon(x0,@sumsqerr,[],[],[],[],lb,ub). I forgot to include the "@" in my example. The other problem is that function myMOdel() did not assign a value to AtAf. It assigned a value to At_Af. So I fied that. I also simplified your code and I replaced a for n=0:50 loop with vectorized statements to assign values to L2, L3, L4, L5. I checked that my vectorized versions were equivalent to the for loop values before deleting the foor loop code. Another change I made is that I read in the value for AtAfExp from a text file (see attached text file) in the main program, rather than defining it on every call to sumsqerr(). I declare AtAfExp to be global, in the main program and in sumsqerr(), so that its value will be known inside sumsqerr(). This approach may be faster, and more importantly it allows me to plot the eperimental and best fit values of AtAf one the "best fit" is found.
Now let's consider the results. The plot of the eperimental data and the "best fit' is below. It does not look good. I also show y=sqrt(t)/68, which is much better fit than the "best fit" simulation.
I think the "best fit" is not a good fit because fmincon() has gotten stuck in a local minimum that is not a global minimum. This is not uncomon in multidimensinal fitting with fmincon() and similar routines.
What can we do about it? We can manually try different values for the initial guess, x0. Or we can write code that automatically tries different starting points and finds the best fit from each starting point, and then picks the overall best fit. This is the approach I always take when doing multidimensional minimization.

William Rose
William Rose on 21 Sep 2021
Here is a new main program fitDataPolyStart.m, that tries multiple starting points, to reduce the chance of finding a solution that is not a global minimum. Use this instead of fitData.m. The other scripts and functions which I posted have not been changed.
Recal that with fitData, we obtained the "best fit"
d=4.1e-10, Af=0.560, RMSE=0.1093.
My first attempt with the PolyStart code was to try initial guessess that were spread equally across the allowed ranges which you specified. You specified 1e-15<=d<=1e-5, and 0<=Af<=1. Thefore I specified
d0=[1e-15,1e-13,1e-11,1e-9,1e-7,1e-5]; %initial guesses for d
Af0=[0,.2,.4,.6,.8,1]; %initial guesses for Af
This makes 6x6=36 initial guesses. The best of all 36 fits was
d=3.5e-10, Af=0.599, RMSE=0.0878.
That is better but still not very good. Therefore I refined by grid of starting points as follows:
d0=[1e-11,2e-11,5e-11,1e-10,2e-10,5e-10,1e-9]; %initial guesses for d
Af0=[.6,.7,.8,.9,1]; %initial guesses for Af
This makes 7x5=35 initial guesses. The best of all 35 fits was
d=1.8e-10, Af=0.775, RMSE=0.0264.
This is better. Refine the initial guesses again. I think we need a smaller d value and a bigger Af value to improve the fit.
d0=[1.5e-11,2e-11,3e-11,5e-11,7e-11,1e-10,1.5e-10,2e-10]; %initial guesses for d
Af0=[.75,.8,.85,.9,.95,1]; %initial guesses for Af
This best fit of these 56 initial guesses is at
d=1.50e-10, Af=0.850, RMSE=0.0139.
Notice that the RMSE improved by about a factor of 2.
Further guessing produces
d0=[1.0e-10,1.1e-10,1.2e-10]; %initial guesses for d
Af0=[.95,.96,.97]; %initial guesses for Af
which gives a very nice fit:
d=1.10e-10, Af=0.9600, RMSE=0.0026.
I have had good results with fmincon() for various model fitting problems in the past. However, I don't think fmincon() is doing a very good job on this fitting problem. IIt seems to get stuck, or it stops tryng to improve, too soon. fmincon() has many options. Perhpas the performance would improve with different choices for certain options.
  2 Comments
Anand Rathnam
Anand Rathnam on 21 Sep 2021
Hey Will, I think fitDataPolyStart.m file is missing. Can you please attach that when youy get a chance? Thanks a ton!

Sign in to comment.



William Rose
William Rose on 21 Sep 2021
I also wrote the attached script: myModelTest.m. This script calls test myModel to make sure it gives reasonable output, and to see how changes in d and Af affect the output. It does not do a search or a minimization. It just sets the values for d and Af and computes the corresponding AfAt(t). It does it for five different combinations of d and Af. I like to choose a "central" pair of [d,Af]. Then I do higher and lower d, with the same Af; and higher and lower Af, with the same d. That lets me see what Af does to the fit, and what d does to the fit.
The script also reads the experimental data from a file, and it computes the RMS Error of each of the five model outputs, reative to the experimental data.
I learned from this that d is like a rate constant for exponential rise or decay. When d is big, AfAt(t) quickly approaches its asymptotic value. When d is small, AfAt(t) takes a long time to approach its asymptotic value. Af is a scaling factor, and it is the asymptotic value which AfAt(t) approaches, as t gets large.
Here is the console output from the script, and the plot that it generates.
>> myModelTest
i=1, x=1.10e-10,0.900, rmsErr=0.0356
i=2, x=1.10e-10,0.960, rmsErr=0.0026
i=3, x=1.10e-10,1.000, rmsErr=0.0242
i=4, x=8.00e-11,0.960, rmsErr=0.0757
i=5, x=1.00e-09,0.960, rmsErr=0.3787
  2 Comments
William Rose
William Rose on 22 Sep 2021
If you have a new set of experimental data which you want to fit, then save the numbers in a text file, one number per file, with no header, just like file AtAfExpt.txt. If the new file name is AtAfExpt2.txt, then modify fitData.m. Change
AtAfExp=load('AtAfExpt.txt');
to
AtAfExp=load('AtAfExpt2.txt');
I made one change in fitData.m and in fitDataPolyStart.m and myModel.m: I replaced "3001" with "length(AtAfEp)". That shoud assure a correct result, even if the new data file has a different number of numbers in it. That could have been the source of the incompatible size errors. Try the attached files, after you change the file name in fitData.m.

Sign in to comment.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!