Difference between lsqlin (using IP-algorithm) and fmincon (using IP-algorithm)
4 views (last 30 days)
Show older comments
Hello, I'd like to better understand that differences between fmincon and lsqlin and why they can result in dramatically different results. Currently, I have them both set up to solve the same optimization problem and I try to match the settings as closs as possible.
I see lsqlin achieves a better loss. Long term, I don't want nonsmooth results and so while I have used fmincon SQP and a smoothing penalty to achieve better results than lsqlin (although performance greatly depends on initial guess - I suspect my problem may be non-convex!). However, I'm still surprised at just how well Lsqlin is able to perform in comparison to fmincon in terms of loss despite the settings and algorithm seeming to be quite similar - both use the Interior-point algorithm. Also, fmincon results in a very smooth result whereas lsqlin does not. What's the difference between these two? Why are their results so different?
clear; close all; clc;
h = 6.62607E-34;
kb = 1.38065E-23;
c_l = 2.997E8;
%% Generate data %%
TempsK = 200:2:800;
lambda_high = 13E-6;
lambda_low = 3E-6;
% lambda_high = 10E-6;
% lambda_low = 7E-6;
no_L = 500000;
L_test = linspace(lambda_low,lambda_high,no_L*2+1);
wavelengths = (L_test(2:2:end))';
%% Two-gaussian signal
peak_amp = 0.7;
min_em = 0.1;
lambda_peak_1 = 6E-6;
lambda_peak_2 = 9E-6;
sigma_em_1 = .5E-6;
sigma_em_2 = .5E-6;
gauss_term1 = exp(-((wavelengths-lambda_peak_1).^2)./(2.*sigma_em_1.^2));
gauss_term2 = exp(-((wavelengths-lambda_peak_2).^2)./(2.*sigma_em_2.^2));
Spectra_real = peak_amp.*(gauss_term1+gauss_term2) + min_em;
figure(11111)
plot(wavelengths,Spectra_real)
xlabel('\lambda [m]')
ylabel('É›(\lambda)')
set(gca,'FontSize',18)
ylim([0 1])
I_BB = ((2*h*c_l^2./(wavelengths.^5))./(exp(h*c_l./(wavelengths*kb*TempsK))-1))';
%% Fitting parameters
global n_solve I_BB_solve DelV_mat
DelV_mat = I_BB*Spectra_real.*(wavelengths(2)-wavelengths(1));
n_solve = 100;
L_test2 = linspace(lambda_low,lambda_high,n_solve*2+1);
lambda_solve = (L_test2(2:2:end))';
I_BB_solve = ((2*h*c_l^2./(lambda_solve.^5))./(exp(h*c_l./(lambda_solve*kb*TempsK))-1))'.*(lambda_solve(2)-lambda_solve(1));
lowb = zeros(1,n_solve);
upb = ones(1,n_solve);
Signal_real_interp = interp1(wavelengths',Spectra_real',lambda_solve');
Loss_real = loss_emiss_fit_MP(Signal_real_interp);
%% START OPTIMIZATION HERE %%
%% lsqlin Matlab
a_cons = eye(n_solve,n_solve);
b_cons = ones(n_solve,1);
options = optimoptions('lsqlin');
options.Algorithm = 'interior-point';
options.ConstraintTolerance = 1E-8;
options.OptimalityTolerance = 1E-8;
options.StepTolerance= 1E-12;
options.MaxIterations = 1E4;
emiss_solve_lsqlin = lsqlin(I_BB_solve,DelV_mat',a_cons,b_cons,[],[],lowb,upb, [], options);
Loss_LSQ = loss_emiss_fit_MP(emiss_solve_lsqlin');
SSE_Error_LSQ = norm(emiss_solve_lsqlin-Signal_real_interp');
figure(1)
plot(lambda_solve,emiss_solve_lsqlin,'b*')
hold on
plot(wavelengths,Spectra_real,'k-')
title('lsqlin')
legend('fit spectra', 'real spectra', 'location','northwest')
xlabel('\lambda [m]')
ylabel('Signal(\lambda)')
set(gca,'FontSize',18)
%% Fmincon IP Matlab
% Assume we know C
a_cons = eye(n_solve,n_solve);
b_cons = ones(n_solve,1);
options2 = optimoptions('fmincon');
options2.Algorithm = 'interior-point';
options2.ConstraintTolerance = 1E-8;
options2.OptimalityTolerance = 1E-8;
options2.StepTolerance= 1E-12;
options2.MaxIterations = 1E4;
options2.MaxFunctionEvaluations = 1E7;
Parameters_est = 0.5.*ones(1,n_solve);
spectra_solve_IP = fmincon(@loss_emiss_fit_MP, Parameters_est, a_cons,b_cons,[],[],lowb,upb, [], options2);
Loss_IP = loss_emiss_fit_MP(spectra_solve_IP);
SSE_Error_IP = norm(spectra_solve_IP-Signal_real_interp);
figure(2)
plot(lambda_solve,spectra_solve_IP,'b*')
hold on
plot(wavelengths,Spectra_real,'k-')
title('Fmincon IP')
legend('fit spectra', 'real spectra›', 'location','northwest')
xlabel('\lambda [m]')
ylabel('Signal(\lambda)')
set(gca,'FontSize',18)
%% MIDPOINT LOSS FUNCTIONS %%
function out = loss_emiss_fit_MP(P)
global n_solve I_BB_solve DelV_mat
emiss = (P(1:n_solve))';
DelV_gen = I_BB_solve*emiss;
%Non-normalized
out = sum((DelV_gen - DelV_mat).^2);
end
1 Comment
Catalytic
on 9 Sep 2023
Edited: Catalytic
on 9 Sep 2023
Your code presentation is too long and cluttered for us to see confidently that the two optimization formulations are equivalent. (The use of global variables is also a hazardous way of parametrizing your function).
If the point here is just to compare fmincon and lsqlin with the same inputs, you should just put the inputs in a .mat file attachment and run your code from that. We don't need to see how all the variables were created.
Answers (2)
Paras Gupta
on 9 Sep 2023
Hi Jonathan,
I understand you want to know the differences between the ‘lsqlin’ and ‘fmincon’ functions and why they are producing different results when configured similarly.
While ‘fmincon’ and ‘lsqlin’ can both utilize the interior-point algorithm, their differences lie in the type of optimization problems they solve, the constraints they handle, the problem formulation, and the specific algorithmic details implemented in MATLAB's Optimization Toolbox.
Since ‘lsqlin’ is designed specifically for linear least-squares problems (like the problem under consideration), it can often provide more accurate results while sacrificing smoothness. ‘fmincon’ tends to produce smoother results compared to ‘lsqlin’ because it is designed to handle general nonlinear optimization problems and may not be as efficient in solving linear least-squares problems.
You can refer to the following documentations for more information:
- https://www.mathworks.com/help/optim/ug/fmincon.html
- https://www.mathworks.com/help/optim/ug/lsqlin.html
- https://www.mathworks.com/help/optim/ug/least-squares-model-fitting-algorithms.html
- https://www.mathworks.com/help/optim/ug/constrained-nonlinear-optimization-algorithms.html
Hope it helps.
0 Comments
Matt J
on 9 Sep 2023
Edited: Matt J
on 9 Sep 2023
Your linear least squares cost function is highly ill-conditioned. Essentially, I_BB_solve is a singular matrix.
>> cond(I_BB_solve)
ans =
4.1586e+17
This means not only that the solution will be non-unique, but some solvers may have difficulty converging to one.
Furthermore, lsqlin knows how to compute an exact gradient of the cost function, because it knows the optimization problem it is solving always has the form of a linear least squares problem. However, fmincon can't know the form of the cost function gradient (unless you tell it with the SpecifyObjectiveGradient option) and so uses a finite difference approximation. Approximation errors in the gradient computation will make convergence even less reliable.
1 Comment
Matt J
on 9 Sep 2023
Edited: Matt J
on 9 Sep 2023
When I add a user-supplied gradient computation to the fmincon implementation, I find that both algorithms converge with similar resnorms:
load optimdata
upb=min(upb,1);
options = optimoptions('lsqlin');
options.Algorithm = 'interior-point';
options.ConstraintTolerance = 1E-12;
options.OptimalityTolerance = 1E-8;
options.StepTolerance= 1E-12;
options.MaxIterations = 1E7;
[x,resnorm_lsqlin]=lsqlin(I_BB_solve,DelV_mat,[],[],[],[],lowb,upb,[],options);
options2 = optimoptions('fmincon');
options2.Algorithm = 'interior-point';
options2.ConstraintTolerance = 1E-12;
options2.OptimalityTolerance = 1E-8;
options2.StepTolerance= 1E-12;
options2.MaxIterations = 1E4;
options2.MaxFunctionEvaluations = 1E7;
options2.SpecifyObjectiveGradient=true;
fun=@(x) loss_emiss_fit_MP(x,I_BB_solve,DelV_mat);
Parameters_est=I_BB_solve\DelV_mat;
[x,resnorm_fmincon]= fmincon(fun, Parameters_est, [],[],[],[],lowb,upb, [], options2);
resnorm_lsqlin,
resnorm_fmincon,
function [fval,grad]=loss_emiss_fit_MP(x,A,b)
err=A*x-b;
fval=norm(err)^2/2;
if nargout>1
grad=A.'*err;
end
end
See Also
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!