fmincon: any way to enforce linear inequality constraints at intermediate iterations?
SA-W
on 1 Mar 2023
I am solving an optimization problem with the interior-point algorithm of fmincon.
My parameters have both lower and upper bounds and linear inequality constraints.
A_ineq = [1 -1 0 0 0;
-1 2 -1 0 0;
0 -1 2 -1 0;
0 0 -1 2 -1;
0 0 0 1 -1];
b_ineq=[0 0 0 0 0];
I observed that fmincon satisfies the bounds at all iterations, of course, but not the linear inequality constraints.
However, the linear inequality constraints are crucially important in my case. If they are violated, my optimization problem cannot be solved.
Is there anything one can do to ensure that linear inequality constraints are satisfied at intermediate iterations?
6 Comments
Walter Roberson
on 2 Mar 2023
fmincon's interior-point, sqp, and trust-region-reflective algorithms satisfy bound constraints at each iteration, but active-set can violate bound constraints.
But those refer to bound constraints, not to linear constraints.
SA-W
on 2 Mar 2023
@Walter Roberson
I am aware of that. But I thought there might be some manipulations of the linear constraints such that fmincon gives them a higher priority at intermediate iterations.
SA-W
on 2 Mar 2023
Scaling the matrix A_ineq by 1e4 or any other large value really helps in my case to (in a least-squares sense) enforce the linear constraints at intermediate iterations.
Matt J
on 6 May 2023
Edited: Matt J
on 6 May 2023
I'd like to point out that it violates the theoretical assumptions of fmincon, and probably of all the Optimization Toolbox solvers, when the domain of your objective function and constraints is not an open subset of R^n. If you forbid evaluation outside the closed set defined by your inequality constraints, that is essentially the situation you are creating. It's not entirely clear what hazards this creates in practice, but it might mean that one of the Global Optimization Toolbox solvers would be more appropriate.

Accepted Answer
Matt J
on 5 May 2023
Edited: Matt J
on 23 May 2023
Within your objective function, project the current x onto the constrained region using lsqlin,
fun=@(x) myObjective(x, A_ineq,b_ineq,lb,ub);
x=fmincon(fun, x0,[],[],[],[],lb,ub,[],options)
function fval=myObjective(x, A_ineq,b_ineq,lb,ub)
C=speye(numel(x));
x=lsqlin(C,x,A_ineq,b_ineq,[],[],lb,ub); %EDIT: was quadprog by mistake
fval=...
end
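A minimal self-contained sketch of this approach, assuming a made-up quadratic objective sum((x - target).^2) in place of the real fval=... line; the target values, bounds, and start point below are hypothetical, and only the lsqlin projection pattern comes from the answer above:
% Sketch: projection inside the objective (hypothetical data)
A_ineq = [1 -1 0 0 0; -1 2 -1 0 0; 0 -1 2 -1 0; 0 0 -1 2 -1; 0 0 0 1 -1];
b_ineq = zeros(5,1);
lb = zeros(5,1);  ub = ones(5,1);
target = [0; 0.1; 0.3; 0.6; 1];   % hypothetical data the objective tries to match
x0 = [0; 0.2; 0.4; 0.6; 0.8];     % feasible start point
fun = @(x) myObjectiveSketch(x, A_ineq, b_ineq, lb, ub, target);
x = fmincon(fun, x0, [], [], [], [], lb, ub, [], optimoptions('fmincon','Display','iter'));
function fval = myObjectiveSketch(x, A_ineq, b_ineq, lb, ub, target)
% Project the trial point onto the feasible set before evaluating the objective.
opts = optimoptions('lsqlin','Display','none');
x = lsqlin(speye(numel(x)), x, A_ineq, b_ineq, [], [], lb, ub, [], opts);
fval = sum((x - target).^2);      % placeholder for the real objective
end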
62 Comments
SA-W
on 6 May 2023
If I understand this snippet correctly, you want to invoke fmincon without passing any constraints and bounds and, at every iteration of fmincon, let quadprog project the x such that all constraints are satisfied?
If so I have two questions:
If constraints were satisfied, quadprog would not change the x, right?
I could also use this strategy with lsqnonlin, right?
Matt J
on 6 May 2023
Edited: Matt J
on 6 May 2023
I changed my mind about passing bounds. You should put those in. Yes, I guess you could do it with lsqnonlin as well.
I'm not sure about the performance of this strategy, as compared to my other answer. It violates differentiability assumptions. Also, because you will be using finite difference derivative approximation, quadprog will need to be called N times per iteration, so it wouldn't be a good strategy for high-dimensional problems.
SA-W
on 7 May 2023
And why would you not pass the linear constraints to fmincon but only to quadprog?
Matt J
on 7 May 2023
It is a quadratic minimization problem,
min_{z in C} 1/2*||z - x||^2
where C is the set defined by the constraints.
SA-W
on 7 May 2023
I know. But why should I not pass the linear constraints to fmincon? I do not see the connection with the subsequent call to quadprog.
Matt J
on 7 May 2023
Edited: Matt J
on 7 May 2023
Ah, well there's no reason fmincon should need to worry about enforcing linear inequality constraints if every test point is going to be projected into a feasible candidate anyway. The objective function is incapable of evaluating anything outside C.
Come to think of it, though, it would probably be wise to apply a penalty on distance from the feasible set as well to remove ambiguity in the solution.
function fval=myObjective(x, A_ineq,b_ineq,lb,ub, weight )
C=speye(numel(x));
[x,dist]=quadprog(C,x,A_ineq,b_ineq,[],[],lb,ub);
fval=...
fval=fval+weight*dist
end
SA-W
on 23 May 2023
I tried your suggested strategy, but it is not working as expected:
function fval=myObjective(x, A_ineq,b_ineq,lb,ub, weight )
%1st iteration, i.e. x is the start vector
x =
0.004000000000000
0.001000000000000
0
0.001000000000000
0.004000000000000
0.009000000000000
0.016000000000000
0.025000000000000
0.036000000000000
0
0.020250000000000
0.063245553203368
0.083495553203368
0.081000000000000
0.144245553203368
0.089442719099992
0.109692719099992
0.170442719099992
all(A_ineq*x) % == 1, i.e. start vector satisfies constraints
C=speye(numel(x));
[x,dist]=quadprog(C,x,A_ineq,b_ineq,[],[],lb,ub);
x =
1.0e-03 *
0.178567411241192
0.050029327861103
0
0.000014275626508
0.000030351196081
0.000053745685964
0.000087040217878
0.000137579529376
0.000245546982784
0
0.000003261476952
0.000013844333853
0.000023204570239
0.000023729908401
0.000054189214599
0.000019086149826
0.000029710389433
0.000063797849365
% not possible to work with the new x, which is several magnitudes lower
end
Calling quadprog with the start vector, which satisfies all constraints and bounds, results in a new vector x where nearly all components are several orders of magnitude lower.
I attached the matrix A_ineq; b_ineq = 0, lb = 0, ub = 2. Basically, A_ineq encodes that parameters are less than or greater than their neighboring parameters.
Ideally, if constraints were satisfied, I would expect quadprog to not change x significantly.
Does it make sense to you how the projected x looks in my case?
SA-W
on 23 May 2023
Update:
The issue seems to be not related to the linear constraints but to the bounds.
C=speye(numel(x));
[x,dist]=quadprog(C,x,[],[],[],[],lb,ub);
x =
1.0e-03 *
0.001101980588430
0.160659227282455
0
0.160659227282455
0.001101980588430
0.000065909263473
0.000037138227992
0.000023769741153
0.000016507062926
0
0.000029344843098
0.000009396094623
0.000007117296309
0.000007336573768
0.000004119807053
0.000006644060468
0.000005417528065
0.000003486590875
lb =
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
ub =
1
1
0
1
1
1
1
1
1
0
2
2
2
2
2
2
2
2
Calling quadprog without bounds and linear constraints, I get the vector that I pass multiplied by (-1).
Do you have any idea as to why the bounds are causing this issue?
Torsten
on 23 May 2023
Edited: Torsten
on 23 May 2023
You want to find a vector x that minimizes
1/2*(x-x_passed).'*(x-x_passed)
under the constraints
A_ineq*x <= b_ineq
where x_passed is the vector you get from fmincon.
Thus f in the list of inputs to quadprog must be -x_passed, not x_passed:
[x,dist]=quadprog(C,-x,A_ineq,b_ineq,[],[],lb,ub);
instead of
[x,dist]=quadprog(C,x,A_ineq,b_ineq,[],[],lb,ub);
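As a quick sanity check of this point (with a made-up three-variable feasible set), quadprog called with H = I and f = -x_passed and lsqlin called with C = I, d = x_passed compute the same projection:
% Projection of x_passed onto {x : A*x <= b, lb <= x <= ub}, done two ways
A  = [1 -1 0; -1 2 -1; 0 1 -1];
b  = zeros(3,1);
lb = zeros(3,1);  ub = ones(3,1);
x_passed = [0.9; 0.2; 0.7];                               % hypothetical (infeasible) point from fmincon
xq = quadprog(eye(3), -x_passed, A, b, [], [], lb, ub);   % minimizes 1/2*x'*x - x_passed'*x
xl = lsqlin(eye(3), x_passed, A, b, [], [], lb, ub);      % minimizes 1/2*||x - x_passed||^2
disp(norm(xq - xl))                                       % should be ~0 (up to solver tolerance)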
Matt J
on 23 May 2023
@SA-W Sorry, I made a mistake. Instead of quadprog, I meant for you to use lsqlin:
fun=@(x) myObjective(x, A_ineq,b_ineq,lb,ub);
x=fmincon(fun, x0,[],[],[],[],lb,ub,[],options)
function fval=myObjective(x, A_ineq,b_ineq,lb,ub)
C=speye(numel(x));
x=lsqlin(C,x,A_ineq,b_ineq,[],[],lb,ub);
fval=...
end
But I want to reiterate that I think this whole thing you are pursuing is probably very inefficient. You should just abort the objective function with inf, as I suggest in my other answer.
SA-W
on 23 May 2023
Thanks to both of you. Makes sense to pass -x to quadprog.
@Matt J I have an analytical expression for the Jacobian. Hence, quadprog or lsqlin has to be called only once per iteration of lsqnonlin/fmincon. That said, I think the overall strategy is not so inefficient.
Anyway, why would you use lsqlin instead of quadprog?
Matt J
on 23 May 2023
Edited: Matt J
on 24 May 2023
@SA-W In the code I originally posted, quadprog was being called with the input argument syntax of lsqlin, so that probably explains why it didn't work. lsqlin is more specialized than quadprog, so I tend to think it is better to use that when it is applicable.
Regarding your analytical gradient, I am not sure it is valid anymore. If your original objective function was f(x) with gradient g(x), the new one with the projection onto the linear constraints is f(P(x)), where P is my notation for the projection operator. The new gradient would be nablaP.'*g(P(x)), but I don't immediately see any easy way to compute nabla P.
EDIT: Possibly, you can take nablaP = (I - pinv(A)*A).' as the linear projection operator (transposed) onto the active linear constraints.
SA-W
on 23 May 2023
"The new gradient would be *g(P(x)), but I don't immediately see any easy way to compute nabla P "
I see what you mean. If I were using finite differencing at fmincon/lsqnonlin level, I would not have to worry about that, right?
" EDIT: Possibly, you can take nabla P as the linear projection operator (transposed) onto the active linear constraints. "
Does this help in calculating nabla P?
Matt J
on 23 May 2023
I see what you mean. If I were using finite differencing at fmincon/lsqnonlin level, I would not have to worry about that, right?
Right, but it would slow you down a lot.
Does this help in calculating nabla P?
Yes, projecting onto the active constraints is a simple matrix multiplication.
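To make "simple matrix multiplication" concrete, here is a sketch with made-up numbers, using the operator from the edited comment above:
% Sketch: project a gradient onto the null space of the active constraints
A_active = [1 -1 0 0 0;              % hypothetical active rows (A_active*x == b_active at the projected point)
            0  0 1 -1 0];
g = [0.3; -0.2; 0.5; 0.1; -0.4];     % hypothetical unprojected gradient g(P(x))
nablaP = (eye(5) - pinv(A_active)*A_active).';   % operator from the comment above
g_proj = nablaP.' * g;               % gradient of the composed objective f(P(x))
disp(A_active * g_proj)              % ~0: no component across the active constraints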
SA-W
on 23 May 2023
"Yes, projecting onto the active constraints is a simple matrix multiplication."
Could you please explain a little bit how that could be implemented?
SA-W
on 23 May 2023
Do the expressions look the same for inequalities instead of equalities, i.e. if the set is given by {x | A*x <= b} instead of {x | A*x = b}?
Matt J
on 23 May 2023
Edited: Matt J
on 23 May 2023
No, but the case {x | A*x <= b} doesn't apply here. You will be projecting onto the active constraints. Active constraints are the ones satisfied with equality at the solution lsqlin gives you.
SA-W
on 23 May 2023
Say A*x > 0 at some components with x given by fmincon. After invoking lsqlin, I observed that A*x=0 at those components. Is this the expected behavior?
SA-W
on 23 May 2023
"I don't know what you mean by "with x given by fmincon". In the approach we are talking about, fmincon can does not act independently of lsqlin. You are calling lsqlin inside the objective function, which is in turn called iteratively by fmincon."
Yes, that's clear. Within the objective function, say some constraints are violated, i.e. A_ineq*x > 0 for some components. Then I call lsqlin to project x onto the feasible set and what I observe then is that A_ineq*x = 0 at those components, i.e. those constraints are active now.
Does it make sense that the constraints are active after calling lsqlin?
Torsten
on 23 May 2023
They can be active or inactive after calling "lsqlin". The purpose of the call to "lsqlin" is to make the x from "fmincon" feasible, and feasible means: a_i*x = b_i or a_i*x < b_i.
SA-W
on 24 May 2023
Suppose a_i*x > b_i before calling lsqlin. After calling lsqlin, is this constraint active or inactive, or is there no general statement?
Matt J
on 24 May 2023
Edited: Matt J
on 24 May 2023
The general statement is this - an inequality constraint a_i*x <= b_i can either be active, inactive, or violated at a given point x. In ideal mathematical terms, these cases are as follows:
a_i*x < b_i (inactive)
a_i*x = b_i (active)
a_i*x > b_i (violated)
An equality constraint will always be active, when satisfied. Otherwise, it is violated.
The optimization strategy we are pursuing here requires that you find which constraints are active at the projected point generated by lsqlin. As you do this, you have to be mindful that you are working with a finite precision computer, and so you have to modify the ideal notion of what it means for a constraint to be active. You must instead consider a constraint active if,
abs( a_i*x - b_i ) < tol
where tol>=0 is a tolerance parameter to be chosen by you.
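A small sketch of that classification with made-up A, b, and x (tol is the user-chosen tolerance):
% Classify each inequality a_i*x <= b_i at a point x
A   = [1 -1 0; -1 2 -1; 0 1 -1];     % hypothetical constraints
b   = zeros(3,1);
x   = [0.2; 0.2; 0.5];               % hypothetical point
tol = 1e-6;
r = A*x - b;                         % signed residuals
active   = abs(r) < tol;             % |a_i*x - b_i| < tol
violated = r >= tol;
inactive = r <= -tol;
disp([r, active, inactive, violated])   % one row per constraint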
SA-W
on 24 May 2023
The new gradient would be nablaP.'*g(P(x)), but I don't immediately see any easy way to compute nabla P.
EDIT: Possibly, you can take nabla P as the linear projection operator (transposed) onto the active linear constraints.
So, in the equation
nablaP =(I-pinv(A)*A).';
the matrix A contains only rows associated with active constraints?
Torsten
on 24 May 2023
Suppose a_i*x > b_i before calling lsqlin. After calling lsqlin, is this constraint active or inactive, or is there no general statement?
No general statement is possible in my opinion.
SA-W
on 13 Feb 2025
Edited: SA-W
on 13 Feb 2025
You suggested to project onto the active constraints using lsqlin in every call to the objective function, using
x = lsqlin(C,x,A_ineq,b_ineq,[],[],lb,ub);
The projection onto the active set and its gradient are then given as
P(x) = lsqlin(C,x,A_ineq,b_ineq,[],[],lb,ub),   nablaP = (I - pinv(A_active)*A_active).'
In the objective function, we can then return
grad = nablaP.' * g(P(x))
where g(x) is my notation for the gradient my objective function would return without any lsqlin projection.
First of all, I do not think the equation for nabla P is correct if we pass the bounds (lb and ub) to lsqlin. Consider the toy problem in 1d (lb = 0, ub = 1), where the "projection function" is already not differentiable. Correct me if I am wrong, but when passing the bounds to lsqlin, there is no hope for an analytical nabla P.
But that's ok, I can formulate my problem without bounds. More important for me is the case where some components of x are fixed values (i.e. x_i = c_i for some i). For this, let's partition x = [x_c; x_f] and A = [A_c, A_f], where x_c refers to the fixed (constrained) values, x_f to the free values, and A is the linear inequality matrix. Then the projection problem reduces to finding the free components:
min_{z_f} 1/2*||z_f - x_f||^2   subject to   A_f*z_f <= b - A_c*x_c
with the projected point z = [x_c; z_f] and
nablaP_f = (I - pinv(A_f_active)*A_f_active).'
So the overall nabla P being used for the gradient of the objective, grad = nablaP.' * g(P(x)), has a 2x2 block structure
nablaP = [ 1                              0
           -pinv(A_f_active)*A_c_active   I - pinv(A_f_active)*A_f_active ]
(i) Is this mathematical derivation correct? It looks a bit suspicious to me to have this zero block at the top right, but intuitively it makes sense (the fixed values do not change if the free values are changed...)
(ii) Below is a minimal example demonstrating the projection step and the extraction of the active constraints at the projected solution. Does this look reasonable?
The reason why I show this is that the gradient of my objective function (probably the nabla P part) is no longer valid, which I verify against a FD approximation. Without the projection, the gradient check works. So I might have a logical mistake in the code below, maybe also in the way I am extracting the active constraints and constructing the final nabla P.

% Points and mock values x
pt = [3 5 7 9 11]; % Shape 5x1
x = [0 0.046506608245001 0.096706757662298 0.153809034551104 0.205280568411077];
% Compute linear constraints
k = 4; % cubic spline f
B = spapi(k, pt, eye(numel(pt))); % function basis
Bd = fnder(B, 1); % first derivative basis
Bdd = fnder(B, 2); % second derivative basis
% Implement [f >= 0; fd >= 0, fdd >=0] in Aineq x <= b
A = [-B.coefs'; -Bd.coefs'; -Bdd.coefs']; % Shape 12x5
b = zeros(size(A, 1), 1); % Shape 12x1
% Assume x = [x_c, x_f] where x_c = x(1) = 0
xc = 0; % Shape 1x1
xf = x(2:end); % Shape 4x1
Ac = A(:, 1); % Shape 12x1
Af = A(:, 2:end); % Shape 12x4
% Project x_f onto {z_f : A_f*z_f <= b - A_c*x_c} (here A_c*x_c = 0 since x_c = 0)
C = eye(numel(xf));
zf = lsqlin(C, xf, Af, b);
% Find active constraints at projected solution
idx_active = find(abs(Af*zf) < 1e-6); % idx = [1, 12], i.e., two ctr are active
Af_active = Af(idx_active,:);
nablaP = eye(size(Af_active, 2)) - pinv(Af_active) * Af_active;
% Concatenated solution z and gradient nabla P
z = vertcat(0, zf);
nablaP = blkdiag(0, nablaP); % Create 2x2 block shape
nablaP(2:end, 1) = -pinv(Af_active) * Ac(idx_active); % Set (2,1) block
nablaP(1, 1) = 1.0; % Set (1,1) block
Matt J
on 13 Feb 2025
Edited: Matt J
on 13 Feb 2025
Correct me if I am wrong, but when passing the bounds to lsqlin, there is no hope for an analytical nabla P.
I don't see why the presence of bounds would invalidate anything, as long as you are treating the bounds the same way you are treating the other inequality constraints. Remember, bounds are a special case of linear inequality constraints: eye(N)*x <= ub and -eye(N)*x <= -lb. So, you also need to look at which bounds are active when you form the projection operator pinv(A).
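A short sketch of that bookkeeping (made-up data), stacking the bounds as extra inequality rows before selecting the active ones:
% Treat bounds as extra rows:  x <= ub  and  -x <= -lb
Aineq = [1 -1 0; -1 2 -1; 0 1 -1];   % hypothetical linear constraints
bineq = zeros(3,1);
lb = zeros(3,1);  ub = ones(3,1);
zp = [0; 0.4; 1];                    % hypothetical projected point returned by lsqlin
N = numel(zp);
Aall = [Aineq;  eye(N); -eye(N)];
ball = [bineq;  ub;     -lb];
idx_active = abs(Aall*zp - ball) < 1e-6;
A_act = Aall(idx_active, :);         % rows to use when forming pinv(A_act)*A_act
disp(find(idx_active).')             % here: the active upper bound on x(3) and lower bound on x(1)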
More important for me is the case where some components of x are fixed values.
You haven't explained how you are implementing that. Ideally, fixed and a priori known values wouldn't be among the x(i) at all - because they're not unknowns. In that case, they wouldn't have any role to play in the projection.
If you are including them in the unknowns, it would have to mean that you have set lb(i)=ub(i)=c(i). If that's the case, the only change that should be needed is to treat these as bound constraints that are always active, as mentioned above.
The reason why I show this is that the gradient of my objective function (probably the nabla P part) is no longer valid, which I verify against a FD approximation. Without the projection, the gradient check works.
You haven't shown us the numbers, but there is no reason to expect agreement with an FD check anymore, or at least not everywhere. The function is now only piecewise differentiable. At points where the set of active constraints change (such as at the boundary of the linearly constrained region) there will be a discontinuity in the derivative.
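A tiny 1-D illustration of that kink (hypothetical objective, not the spline problem): with the projection P(x) = max(x,0) onto x >= 0, the composed objective is flat on one side of the boundary and curved on the other, so a central finite difference straddling x = 0 matches neither one-sided slope:
% 1-D sketch: f(P(x)) with P(x) = max(x,0)
f = @(x) (max(x,0) - 1).^2;             % hypothetical objective composed with the projection
h = 1e-6;
fd_central  = (f(h) - f(-h)) / (2*h);   % ~ -1
slope_left  = (f(0) - f(-h)) / h;       % 0   (objective is constant for x < 0)
slope_right = (f(h) - f(0)) / h;        % ~ -2
disp([fd_central, slope_left, slope_right])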
SA-W
on 13 Feb 2025
Edited: SA-W
on 13 Feb 2025
You haven't shown us the numbers, but there is no reason to expect agreement with an FD check anymore, or at least not everywhere. The function is now only piecewise differentiable.
This makes sense. I did not keep that in mind. However, I managed to create a toy problem which does not converge to the expected solution.
Specifically, I am creating reference data (xq --> yq = xq.^2) and want to fit a quadratic spline to this data. The objective function is ||f(xq) - yq||^2, where f(xq) represents the sampled spline values. Since the spline is quadratic, it can perfectly fit the data, which is reproduced by the code.

Then, I implemented the linear constraints f''>=0 and the projection as discussed herein and fmincon converged to a wrong solution:

You can reproduce the output with this code. Do you see a mistake in the implementation?
% Generate data
xq = linspace(3, 11, 30);
yq = xq.*xq;
% Quadratic spline f and basis for df/dy
k = 3;
x = [3 5 7 9 11];
B = spapi(k, x, eye(numel(x)));
% Linear constraints
Bdd = fnder(B, 2);
Aineq = -Bdd.coefs'; % Shape 3x5
bineq = zeros(size(Aineq, 1), 1); % Shape 3x1
% Function handle and solver options
objectiveFunc = @(y) Objective_func(y, k, x, Aineq, bineq, xq, yq);
options = optimoptions("fmincon", ...
"FiniteDifferenceStepSize", 1e-6, ...
"SpecifyObjectiveGradient", true);
% Optimizer (no constraints)
lb = []; ub = []; A = []; b = [];
x0 = -x .*x; % x0 activates the constraint
sol = fmincon(objectiveFunc, x0, lb, ub, A, b, [], [], [], options);
% Plot data and solution
f = spapi(k, x, sol);
fnplt(f); hold on;
plot(xq, yq, 'r+', 'DisplayName', 'data');
legend show;
function [fval, grad] = Objective_func(y, k, x, Aineq, bineq, xq, yq)
% y are the current parameters
% Project y onto active set
C = eye(numel(y));
options = optimset('Display', 'off');
y = lsqlin(C, y(:), Aineq, bineq, [], [], [], [], [], options);
% Find active ctr
idx_active = find(abs(Aineq * y(:)) < 1e-6);
Aineq_active = Aineq(idx_active,:); %#ok<FNDSB>
nablaP = eye(numel(y)) - pinv(Aineq_active) * Aineq_active;
% Compute residual and sum of squares
f = spapi(k, x, y);
residual = fnval(f, xq) - yq;
fval = sum(residual.*residual);
% Compute jacobian and gradient
if nargout > 1
B = spapi(k, x, eye(numel(x)));
jacobianT = fnval(B, xq);
grad = 2 * jacobianT * residual(:);
grad = nablaP' * grad;
end
end
Matt J
on 13 Feb 2025
For a quadratic spline, you need k=2:
% Generate data
xq = linspace(3, 11, 30);
yq = xq.*xq;
% Quadratic spline f and basis for df/dy
k = 2;
x = [3 5 7 9 11];
B = spapi(k, x, eye(numel(x)));
% Linear constraints
Bdd = fnder(B, 2);
Aineq = -Bdd.coefs'; % Shape 3x5
bineq = zeros(size(Aineq, 1), 1); % Shape 5x1
% Function handle and solver options
objectiveFunc = @(y) Objective_func(y, k, x, Aineq, bineq, xq, yq);
options = optimoptions("fmincon", ...
"FiniteDifferenceStepSize", 1e-6, ...
"SpecifyObjectiveGradient", true);
% Optimizer (no constraints)
lb = []; ub = []; A = []; b = [];
x0 = -x .*x; % x0 activates the constraint
sol = fmincon(objectiveFunc, x0, lb, ub, A, b, [], [], [], options);
Local minimum found that satisfies the constraints.
Optimization completed because the objective function is non-decreasing in
feasible directions, to within the value of the optimality tolerance,
and constraints are satisfied to within the value of the constraint tolerance.
% Plot data and solution
f = spapi(k, x, sol);
fnplt(f); hold on;
plot(xq, yq, 'r+', 'DisplayName', 'data');
legend show;

SA-W
on 13 Feb 2025
Edited: SA-W
on 13 Feb 2025
For a quadratic spline, you need k=2:
The curve you plotted is a piecewise linear function. For the spapi function, k refers to the order of the spline, which is the degree + 1. So we need k=3 for a quadratic spline (degree 2).
So for k=3, can you fit the data as I showed in the first picture of my previous post? As I said, when the projection terms are included, fmincon does not converge to the solution.
Matt J
on 14 Feb 2025
Edited: Matt J
on 28 Feb 2025
% Generate data
xdata = linspace(3, 11, 30); ydata = xdata.^2;
% Quadratic spline f and basis for df/dy
k = 3;
xcp = [3 5 7 9 11]; %control point x-locations
fcn = @(z) fnval(spapi(k, xcp, z),xdata);
tic
S=func2mat(fcn,xcp,'doSparse',false);
toc
Elapsed time is 0.055264 seconds.
dS=diff(S,1,1);
% Linear constraints
Aineq = -dS;
bineq = zeros(height(dS), 1);
% Function handle and solver options
objectiveFunc = @(z) Objective_func(z,Aineq, bineq, ydata,S);
options = optimoptions("fmincon", ...
"FiniteDifferenceStepSize", 1e-6, ...
"SpecifyObjectiveGradient", true);
% Optimizer (no constraints)
lb = []; ub = []; A = []; b = [];
z0 = -xcp(:); %
violation = any(Aineq*z0>bineq) %Initial guess z0 violates the constraint
violation = logical
1
z = fmincon(objectiveFunc, z0, A, b, [], [],lb,ub, [], options);
Local minimum found that satisfies the constraints.
Optimization completed because the objective function is non-decreasing in
feasible directions, to within the value of the optimality tolerance,
and constraints are satisfied to within the value of the constraint tolerance.
% Plot data and solution
close all
xFit=linspace( xdata(1),xdata(end),1e4);
yFit=fnval(spapi(k, xcp, z), xFit );
plot(xFit,yFit,'-b',xdata,ydata, 'ro'); hold off
legend('Fitted Spline Samples','Data',Location='northwest'); hold off

function [fval, grad] = Objective_func(z,Aineq, bineq, ydata,S)
% y are the current parameters
z=z(:); ydata=ydata(:); I = speye(numel(z));
% Project z onto constraining set
options = optimoptions('lsqlin','Display', 'none');
zp = lsqlin(I, z, Aineq, bineq, [], [], [], [], [], options);
errData=S*zp-ydata; %data error term (depends only on zp)
errReg=zp-z; %penalty term on constraint violation
% Compute objective
fval=norm(errData)^2/2 + norm(errReg)^2/2;
% Compute jacobian and gradient
if nargout > 1
idx_active = abs(Aineq * zp - bineq) < 1e-6;
A = Aineq(idx_active,:);
JacP = I - pinv(A) * A;
grad=(S*JacP).'*errData + (JacP-I).'*errReg;
end
end
SA-W
on 14 Feb 2025
Edited: SA-W
on 14 Feb 2025
Thanks for the working example.
fcn = @(z) fnval(spapi(k, xcp, z),xdata);
In my real problem, the evaluation points (xdata) are different for every fmincon iteration. So I would probably need to set up (fcn) and (Aineq) anew inside the objective function.
Anyway, the minimal example code I provided is closer to my real code base, also in the logic of computing the constraints as (Aineq = -Bdd.coefs'). So it is important for me to understand why the code below converges to the wrong solution.
What needs to be changed below? The gradient at the solution is zero, so I suspect something must be wrong with the gradient, but I cannot see a mistake.
% Generate data
xq = linspace(3, 11, 30);
yq = xq.*xq;
% Quadratic spline f and basis for df/dy
k = 3;
x = [3 5 7 9 11];
B = spapi(k, x, eye(numel(x)));
% Linear constraints
Bdd = fnder(B, 2);
Aineq = -Bdd.coefs'; % Shape 3x5
bineq = zeros(size(Aineq, 1), 1); % Shape 3x1
% Function handle and solver options
objectiveFunc = @(y) Objective_func(y, k, x, Aineq, bineq, xq, yq);
options = optimoptions("fmincon", ...
"FiniteDifferenceStepSize", 1e-6, ...
"SpecifyObjectiveGradient", true);
% Optimizer (no constraints)
lb = []; ub = []; A = []; b = [];
x0 = -x .*x; % x0 activates the constraint
sol = fmincon(objectiveFunc, x0, lb, ub, A, b, [], [], [], options);
Local minimum found that satisfies the constraints.
Optimization completed because the objective function is non-decreasing in
feasible directions, to within the value of the optimality tolerance,
and constraints are satisfied to within the value of the constraint tolerance.
% Plot data and solution
f = spapi(k, x, sol);
fnplt(f); hold on;
plot(xq, yq, 'r+', 'DisplayName', 'data');
legend show;

function [fval, grad] = Objective_func(y, k, x, Aineq, bineq, xq, yq)
% y are the current parameters
% Project y onto active set
C = eye(numel(y));
options = optimset('Display', 'off');
y = lsqlin(C, y(:), Aineq, bineq, [], [], [], [], [], options);
% Find active ctr
idx_active = find(abs(Aineq * y(:)) < 1e-6);
Aineq_active = Aineq(idx_active,:); %#ok<FNDSB>
nablaP = eye(numel(y)) - pinv(Aineq_active) * Aineq_active;
% Compute residual and sum of squares
f = spapi(k, x, y);
residual = fnval(f, xq) - yq;
fval = sum(residual.*residual);
% Compute jacobian and gradient
if nargout > 1
B = spapi(k, x, eye(numel(x)));
jacobianT = fnval(B, xq);
grad = 2 * jacobianT * residual(:);
grad = nablaP' * grad;
end
end
Matt J
on 14 Feb 2025
Edited: Matt J
on 14 Feb 2025
The main problem seems to be that you are not adding a penalty term on the projection residual, as I advised in this earlier comment. This creates lots of local minima. Your code does converge as is when a better initial point is provided, e.g., x0 =x.^2-1. By adding a penalty term, as below, it also converges from x0=-x.^2:
% Generate data
xq = linspace(3, 11, 30);
yq = xq.*xq;
% Quadratic spline f and basis for df/dy
k = 3;
x = [3 5 7 9 11];
B = spapi(k, x, eye(numel(x)));
% Linear constraints
Bdd = fnder(B, 2);
Aineq = -Bdd.coefs'; % Shape 3x5
bineq = zeros(size(Aineq, 1), 1); % Shape 3x1
% Function handle and solver options
objectiveFunc = @(y) Objective_func(y, k, x, Aineq, bineq, xq, yq);
options = optimoptions("fmincon", ...
"FiniteDifferenceStepSize", 1e-6, ...
"SpecifyObjectiveGradient", true);
% Optimizer (no constraints)
lb = []; ub = []; A = []; b = [];
x0 = -x .*x; % x0 activates the constraint
sol = fmincon(objectiveFunc, x0, lb, ub, A, b, [], [], [], options);
Local minimum found that satisfies the constraints.
Optimization completed because the objective function is non-decreasing in
feasible directions, to within the value of the optimality tolerance,
and constraints are satisfied to within the value of the constraint tolerance.
% Plot data and solution
f = spapi(k, x, sol);
fnplt(f); hold on;
plot(xq, yq, 'r+', 'DisplayName', 'data');
legend show;

function [fval, grad] = Objective_func(z, k, x, Aineq, bineq, xq, yq)
% y are the current parameters
z=z(:); xq=xq(:); yq=yq(:); I = eye(numel(z));
% Project z onto linearly constrained region
options = optimset('Display', 'off');
zp = lsqlin(I, z, Aineq, bineq, [], [], [], [], [], options);
% Compute norms of residuals
f = spapi(k, x, zp);
dataResidual = fnval(f, xq) - yq;
projectionResidual=zp-z;
fval = norm(dataResidual).^2/2 + norm(projectionResidual)^2/2;
% Compute jacobian and gradient
if nargout > 1
% Find active ctr
idx_active = abs(Aineq * zp - bineq) < 1e-6;
A = Aineq(idx_active,:);
JacP = I- pinv(A) * A;
B = spapi(k, x, eye(numel(x)));
jacobianT = fnval(B, xq);
grad = (jacobianT.' * JacP).'*dataResidual(:) + (JacP-I).'*projectionResidual;
end
end
SA-W
on 14 Feb 2025
The main problem seems to be that you are not adding a penalty term on the projection residual, as I advised in this earlier comment. This creates lots of local minima.
Ah, I was not aware that neglecting the constraint violation would lead to multiple local minima, since the problem without the projection is quadratic.
Below, I added the constraint x(1) = 0 and implemented it by setting lb(1) = ub(1) = 0 and passing the bounds to both fmincon and the lsqlin solver. Keeping all parameters (also the fixed ones) is easier in my interfaces. You said ...
So, you also need to look at which bounds are active when you form the projection operator pinv(A).
In this example, lb(1) and ub(1) are active per design. But how can we include this when forming JacP?
% Find active ctr
idx_active = abs(Aineq * zp - bineq) < 1e-6;
A = Aineq(idx_active,:); % Active linear constraints, How to include active lb(1) and ub(1) ?
JacP = I- pinv(A) * A;
One idea I have (based on your comment) is to add the unit vectors ([-1 0 0 0 0] and [1 0 0 0 0]) to Aineq ...
lb1 = [-1 0 0 0 0]; ub1 = [1 0 0 0 0];
Aineq = [lb1; ub1; Aineq];
bineq = [0; 0; bineq];
% Find active ctr
idx_active = abs(Aineq * zp - bineq) < 1e-6; % Now always includes indices "1" and "2"
A = Aineq(idx_active,:);
JacP = I- pinv(A) * A;
but I feel this is not the most efficient way to do this.
Code with x(1) = lb(1) = ub(1) = 0:
% Generate data
xq = linspace(3, 11, 30);
yq = xq.*xq;
% Quadratic spline f and basis for df/dy
k = 3;
x = [3 5 7 9 11];
B = spapi(k, x, eye(numel(x)));
% Linear constraints
Bdd = fnder(B, 2);
Aineq = -Bdd.coefs'; % Shape 3x5
bineq = zeros(size(Aineq, 1), 1); % Shape 3x1
% Bound constraints to fix x(1) = 0
lb = -Inf(numel(x), 1);
ub = +Inf(numel(x), 1);
lb(1) = 0.0;
ub(1) = 0.0;
% Function handle and solver options
objectiveFunc = @(y) Objective_func(y, k, x, lb, ub, Aineq, bineq, xq, yq);
options = optimoptions("fmincon", ...
"FiniteDifferenceStepSize", 1e-6, ...
"SpecifyObjectiveGradient", true);
% Call optimizer
x0 = -x .*x; % x0 activates the constraint
sol = fmincon(objectiveFunc, x0, [], [], [], [], lb, ub, [], options);
% Plot data and solution
f = spapi(k, x, sol);
fnplt(f); hold on;
plot(xq, yq, 'r+', 'DisplayName', 'data');
legend show;
function [fval, grad] = Objective_func(z, k, x, lb, ub, Aineq, bineq, xq, yq)
% y are the current parameters
z=z(:); xq=xq(:); yq=yq(:); I = eye(numel(z));
% Project z onto linearly constrained region
options = optimset('Display', 'off');
zp = lsqlin(I, z, Aineq, bineq, [], [], lb, ub, [], options);
% Compute norms of residuals
f = spapi(k, x, zp);
dataResidual = fnval(f, xq) - yq;
projectionResidual=zp-z;
fval = norm(dataResidual).^2/2 + norm(projectionResidual)^2/2;
% Compute jacobian and gradient
if nargout > 1
% Find active ctr
idx_active = abs(Aineq * zp - bineq) < 1e-6;
A = Aineq(idx_active,:);
JacP = I- pinv(A) * A;
B = spapi(k, x, eye(numel(x)));
jacobianT = fnval(B, xq);
grad = (jacobianT.' * JacP).'*dataResidual(:) + (JacP-I).'*projectionResidual;
end
end
Matt J
on 14 Feb 2025
Edited: Matt J
on 14 Feb 2025
One idea I have (based on your comment) is to add the unit vectors ([-1 0 0 0 0] and [1 0 0 0 0]) to Aineq ...
That's more or less what I would do. If the dimension is large, you might be able to improve efficiency using sparse operations:
function A = getActiveConstraints(zp,Aineq,bineq,Aeq,beq,lb,ub)
arguments
zp (:,1);
Aineq,bineq,Aeq,beq;
lb (:,1);
ub (:,1);
end
I=speye(numel(zp));
A=cell(4,1);
if ~isempty(Aineq)
idx=abs(Aineq * zp - bineq) < 1e-6;
A{1}=Aineq(idx,:);
end
% if ~isempty(Aeq)
% idx=abs(Aeq * zp - beq) < 1e-6;
% A{2}=Aeq(idx,:);
% end
A{2}=Aeq; %If lsqlin was successful, equality constraints will always be active
if ~isempty(ub)
idx=abs(zp - ub) < 1e-6;
A{3}=full(I(idx,:));
end
if ~isempty(lb)
idx=abs(zp - lb) < 1e-6;
A{4}=full(-I(idx,:));
end
A=vertcat(A{:});
end
SA-W
on 14 Feb 2025
Edited: SA-W
on 14 Feb 2025
Thank you. Maybe a problem-specific detail:
if ~isempty(Aineq)
idx=abs(Aineq * zp - bineq) < 1e-6;
A{1}=Aineq(idx,:);
end
How should the tolerance (here 1e-6) be set to identify active constraints? In general, my splines have a small co-domain (say the interpolation values are in [0, 1]). Hence, the entries of Aineq, which encodes constraints such as convexity or positivity, are also rather small.
Example: Say Aineq has two rows and Aineq*zp produces [1e-5, 1e-7]. With a tolerance of 1e-6, we would only identify the second constraint as active. How does neglecting constraints that are close to our specified tolerance affect nablaP?
Regardless of the scale of the parameters, I think it is not so easy to find a threshold which separates active constraints from non-active ones at the projected point.
Matt J
on 14 Feb 2025
Edited: Matt J
on 28 Feb 2025
Hence, the entries of Aineq, which encodes constraints such as convexity or positivity, are also rather small.
Remember that the matrix rows defining a constraint are unique only up to a positive scalar multiple. Accordingly, I usually normalize my constraints so that the rows always have norm = 1, e.g.,
s=vecnorm(Aineq,2,2);
Aineq=Aineq./s;
bineq=bineq./s;
This is the same as saying that the hyperplanes that bound your linearly constrained region are represented with unit normals. Once normalized, though, the selection of a threshold tolerance is a subjective thing. You will just have to experiment with what gives good results.
Matt J
on 14 Feb 2025
This is the same as saying that the hyperplanes that bound your linearly constrained region are represented with unit normals.
And consequently, constraints normalized this way will mean that the condition,
abs(Aineq(i,:) * zp - bineq(i))==d
implies that zp is exactly a distance d in L2-norm from the constraining hyperplane Aineq(i,:)*x =bineq(i). I don't know if that helps you choose your tolerance, but maybe it will.
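A quick numeric check of that interpretation, with a made-up constraint row and point: after normalizing the row, the residual equals the Euclidean distance from zp to the constraining hyperplane.
% Distance interpretation of a normalized constraint row
a  = [2 -1 0];  bi = 1;                   % hypothetical row a*x <= bi
s  = norm(a);  a = a/s;  bi = bi/s;       % normalize so that norm(a) == 1
zp = [1; 0.5; 0.2];                       % hypothetical point
d  = abs(a*zp - bi);                      % residual of the normalized constraint
x_on_plane = zp - (a*zp - bi)*a.';        % orthogonal projection of zp onto {x : a*x = bi}
disp([d, norm(zp - x_on_plane)])          % the two numbers agree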
SA-W
on 15 Feb 2025
I don't know if that helps you choose your tolerance, but maybe it will.
Not sure how it might help, but normalizing the rows of Aineq is in any case a good idea if different constraint sets (e.g. f>=0 and f''>=0) are assembled into Aineq, so that the different sets are weighted equally. Does it make sense to formulate it like that?
Another hyperparameter is the weighting factor associated with the projection residual:
zp = lsqlin(I, z, Aineq, bineq, [], [], [], [], [], options);
projRes = zp-z;
fval = fval + weight * norm(projRes).^2
Usually, this weight is chosen on a case-by-case basis and has the purpose of balancing/weighting the different contributions to the residuals.
In our case, the value we assign to "weight" does not really matter, as the solution is characterized by "projRes = 0". So it may affect the path taken to the solution, but it should not affect the solution itself, right?
Matt J
on 15 Feb 2025
but it should not affect the solution itself, right?
I don't think it should, no.
SA-W
on 17 Feb 2025
Before calculating pinv(A)*A, is it necessary that A has unique rows? For instance: when I implement x(1) = lb(1) = ub(1) = 0 and f >= 0, then I may end up with two rows representing the same unit vector.
Walter Roberson
on 17 Feb 2025
The thing about pinv() is that it works even with singular matrices.
A = [1 0 0 0
0 1 0 0
1 0 0 0
0 0 0 1]
A = 4×4
1 0 0 0
0 1 0 0
1 0 0 0
0 0 0 1
pinv(A)*A
ans = 4×4
1.0000 0 0 0
0 1.0000 0 0
0 0 0 0
0 0 0 1.0000
SA-W
on 17 Feb 2025
In this code, I create a quadratic spline with linear data and implement f'' >= 0, so all constraints are active. My expectation was that such a point would (numerically) not be modified by lsqlin. However, the difference (zp - z) is on the order of the scale of my data (approx. 1e-3). As a consequence, lsqlin will pull me away from the ground-truth solution.
How would you account for the scale of the data here?
I tried multiplying Aineq by a large number before passing it to lsqlin, which reduced the difference (zp - z), but I am not sure if this generalizes well.
format long;
% Spline
k = 3;
x = [3 5 7 9 11];
z = 1e-3 * (x-3);
disp(z(:)');
0 0.002000000000000 0.004000000000000 0.006000000000000 0.008000000000000
B = spapi(k, x, eye(numel(x)));
% Linear constraints
Bdd = fnder(B, 2);
Aineq = -Bdd.coefs'; % Shape 3x5
Aineq = Aineq ./ vecnorm(Aineq, 2, 2);
bineq = zeros(size(Aineq, 1), 1); % Shape 5x1
% Project
lb = zeros(5, 1);
ub = Inf(5, 1); ub(1) = lb(1);
options = optimset('Display', 'off');
zp = lsqlin(eye(numel(x)), z, Aineq, bineq, [], [], lb, ub, [], options);
disp(zp(:)');
0 0.001968432719120 0.003955808406666 0.005963348117333 0.008057533188573
Matt J
on 17 Feb 2025
It looks like it's just the tolerance settings,
format long;
% Spline
k = 3;
x = [3 5 7 9 11];
z = 1e-3 * (x-3);
disp(z(:)');
0 0.002000000000000 0.004000000000000 0.006000000000000 0.008000000000000
B = spapi(k, x, eye(numel(x)));
% Linear constraints
Bdd = fnder(B, 2);
Aineq = -Bdd.coefs'; % Shape 3x5
Aineq = Aineq ./ vecnorm(Aineq, 2, 2);
bineq = zeros(size(Aineq, 1), 1); % Shape 5x1
% Project
lb = zeros(5, 1);
ub = Inf(5, 1); ub(1) = lb(1);
tol=1e-16;
options = optimoptions('lsqlin','Display', 'off',...
'OptimalityTol',tol,'FunctionTol',tol,'StepTol',tol);
zp = lsqlin(eye(numel(x)), z, Aineq, bineq, [], [], lb, ub, [], options);
disp(z(:)-zp(:));
1.0e-08 *
0
0.380189064865671
0.527687552768358
0.429635448398469
-0.681117628964500
SA-W
on 18 Feb 2025
Maybe I can also ask you for your opinion about why I want to make this projection locally in the objective function rather than just passing the linear constraints directly to fmincon once. What I want to do is optimize the spline interpolation values subject to the linear constraints. However, the challenge in my method is that I do not have a good guess for the upper bound x(end) -- the last interpolation point -- when starting the optimization. A good value for x(end) depends on the final interpolation values, which is why I want to dynamically adjust x(end) during the optimization.
In every iteration, I can compute an indicator which tells me what a good x(end) would be for the next iteration. Example (numel(x) = 3): in iteration 0, I start with x = [0, 1, 2]; in iteration 2, x = [0 0.5 1] would be appropriate, and so on. In other words, the interpolation points may change across iterations (so does Aineq), but the number of parameters can be kept constant by shrinking/stretching the interpolation domain. I may not change the discretization in every iteration, but only if required to ensure that all interpolation values are sampled in the objective function. Suppose I initialized in iteration 0 with a linear spline (y = [0 1 2]) and the spline was evaluated at points xq in [0, 2]. Then in iteration 1, y may be quadratic (y = [0 1 4]) and xq may lie in [0, 1] only, such that x = [0 0.5 1] is a better interpolation domain for the next iteration and the initial x = [0, 1, 2] is not appropriate anymore, since no meaningful value was computed for the gradients at points x > 1.
I know that disturbing the optimizer while it is searching for the optimum is not wise, and I am not even sure if what I described is possible at all. Maybe I am barking up the wrong tree and you have a different suggestion to handle this. This local projection at least gives the flexibility to work with different Aineq's, which I could not do by passing Aineq just once at the fmincon interface.
Matt J
on 18 Feb 2025
Edited: Matt J
on 18 Feb 2025
Maybe I can also ask you for your opinion about why I want to make this projection locally in the objective function rather than just passing the linear constraints directly to fmincon once
Because if you do that, fmincon will not confine itself to the constrained region at all iterations. This all began because you said that you could not tolerate that. It was the title of your post.
If that is no longer the issue, it probably would be better as a new post.
SA-W
on 26 Feb 2025
I dont think it should, no.
I did some more tests on the projection minimal example code, in particular solving the optimization problem with different weights assigned to the projection residual. For the given start vector, I was only able to identify the ground truth with an insanely large weight of 1e12. Also, regardless of the weight, most of the solver steps were cg steps (and no direct steps), thereby slowing down the identification.
Do you think it is possible to design the penalty term such that the method is more robust against such examples?
% Reference spline
x = linspace(3, 36, 12);
y = [0.000000000000000
0.000081435666435
0.000651485331480
0.002051866299466
0.004821212148785
0.011026195105394
0.018337300345558
0.025720823074519
0.033104345803481
0.040487868532442
0.047971929673952
0.056059221290756];
k = 4;
fref = spapi(k, x, y);
% Generate data
xq = linspace(3, 36, 10);
yq = fnval(fref, xq);
% Linear constraints
B = spapi(k, x, eye(numel(x)));
Bdd = fnder(B, 2);
Aineq = [-B.coefs'; -Bdd.coefs'];
bineq = zeros(size(Aineq, 1), 1);
% Bound constraints to fix x(1) = 0
lb = -Inf(numel(x), 1);
ub = +Inf(numel(x), 1);
lb(1) = 0.0;
ub(1) = 0.0;
% Solver options
options = optimoptions("fmincon", ...
"FiniteDifferenceStepSize", 1e-6, ...
"SpecifyObjectiveGradient", true, ...
"StepTolerance", 1e-12, ...
"Display", "off");
% Start point
x0 = -(x - x(1));
% Weights for constraint violation: weight*||zp - z||^2
weights = [1e0, 1e4, 1e6, 1e8, 1e10, 1e12];
figure;
% Loop through weights, solve problem, and plot solution
for w = 1:numel(weights)
% Function handle
objectiveFunc = @(y) Objective_func(y, k, x, lb, ub, Aineq, bineq, weights(w), xq, yq);
% Call optimizer
[sol, ~, exitFlag, output, ~, ~, ~] = fmincon(objectiveFunc, x0, [], [], [], [], lb, ub, [], options);
fprintf('output for weight %g\n\n', weights(w));
disp(output);
% Plot data and solution
subplot(2, 3, w);
f = spapi(k, x, sol);
fnplt(f); hold on;
plot(xq, yq, 'r+');
title(sprintf('weight = %g', weights(w)))
hold off
end
output for weight 1
iterations: 1
funcCount: 2
constrviolation: 0
stepsize: 67.4833
algorithm: 'interior-point'
firstorderopt: 1.8790e-15
cgiterations: 0
message: 'Local minimum found that satisfies the constraints....'
bestfeasible: [1x1 struct]
output for weight 10000
iterations: 2
funcCount: 7
constrviolation: 0
stepsize: 64.0192
algorithm: 'interior-point'
firstorderopt: 2.9576e-17
cgiterations: 1
message: 'Local minimum found that satisfies the constraints....'
bestfeasible: [1x1 struct]
output for weight 1e+06
iterations: 7
funcCount: 53
constrviolation: 0
stepsize: 8.9050e-08
algorithm: 'interior-point'
firstorderopt: 2.9576e-17
cgiterations: 53
message: 'Local minimum found that satisfies the constraints....'
bestfeasible: [1x1 struct]
output for weight 1e+08
iterations: 29
funcCount: 188
constrviolation: 0
stepsize: 0.0263
algorithm: 'interior-point'
firstorderopt: 3.2633e-14
cgiterations: 213
message: 'Local minimum found that satisfies the constraints....'
bestfeasible: [1x1 struct]
output for weight 1e+10
iterations: 66
funcCount: 427
constrviolation: 0
stepsize: 1.6005e-06
algorithm: 'interior-point'
firstorderopt: 7.5808e-07
cgiterations: 766
message: 'Local minimum found that satisfies the constraints....'
bestfeasible: [1x1 struct]
output for weight 1e+12
iterations: 61
funcCount: 403
constrviolation: 0
stepsize: 1.2615e-06
algorithm: 'interior-point'
firstorderopt: 5.5037e-07
cgiterations: 363
message: 'Local minimum found that satisfies the constraints....'
bestfeasible: [1x1 struct]

function [fval, grad] = Objective_func(z, k, x, lb, ub, Aineq, bineq, weight, xq, yq)
% y are the current parameters
z=z(:); xq=xq(:); yq=yq(:); I = eye(numel(z));
% Project z onto linearly constrained region
tol = 1e-12;
options = optimoptions('lsqlin', 'Algorithm', 'active-set', 'Display', 'off', ...
FunctionTolerance=tol, StepTolerance=tol, OptimalityTolerance=tol);
zp = lsqlin(I, z, Aineq, bineq, [], [], lb, ub, z, options);
% Compute norms of residuals
f = spapi(k, x, zp);
dataResidual = fnval(f, xq) - yq;
projectionResidual=zp-z;
fval = norm(dataResidual).^2/2 + weight * norm(projectionResidual)^2/2;
% Compute jacobian and gradient
if nargout > 1
% Find active ctr
lb1 = [-1 0 0 0 0 0 0 0 0 0 0 0];
ub1 = [1 0 0 0 0 0 0 0 0 0 0 0];
Aineq = [lb1; ub1; Aineq];
bineq = [0; 0; bineq];
idx_active = abs(Aineq * zp - bineq) < 1e-6;
A = Aineq(idx_active,:);
JacP = I - pinv(A) * A;
B = spapi(k, x, eye(numel(x)));
jacobianT = fnval(B, xq);
grad = (jacobianT.' * JacP).'*dataResidual(:) + weight * (JacP-I).'*projectionResidual;
end
end
Matt J
on 28 Feb 2025
You forgot to pass the Aineq, bineq constraints to fmincon:
load data
% Weights for constraint violation: weight*||zp - z||^2
weights = 1;
% Loop through weights, solve problem, and plot solution
for w = 1:numel(weights)
% Function handle
objectiveFunc = @(y) Objective_func(y, k, x, lb, ub, Aineq, bineq, weights(w), xq, yq);
% Call optimizer
[sol, ~, exitFlag, output, ~, ~, ~] = fmincon(objectiveFunc, x0, Aineq, bineq, [], [], lb, ub, [], options);
fprintf('output for weight %g\n\n', weights(w));
disp(output);
% Plot data and solution
subplot(2, 3, w);
f = spapi(k, x, sol);
fnplt(f); hold on;
plot(xq, yq, 'r+');
title(sprintf('weight = %g', weights(w)))
hold off
end
output for weight 1
iterations: 26
funcCount: 86
constrviolation: 4.2161e-20
stepsize: 1.9288e-12
algorithm: 'interior-point'
firstorderopt: 0.0026
cgiterations: 42
message: 'Local minimum possible. Constraints satisfied....'
bestfeasible: [1x1 struct]

function [fval, grad] = Objective_func(z, k, x, lb, ub, Aineq, bineq, weight, xq, yq)
% y are the current parameters
z=z(:); xq=xq(:); yq=yq(:); I = eye(numel(z));
% Project z onto linearly constrained region
tol = 1e-12;
options = optimoptions('lsqlin', 'Algorithm', 'active-set', 'Display', 'off', ...
FunctionTolerance=tol, StepTolerance=tol, OptimalityTolerance=tol);
[zp,~,~,exitflag] = lsqlin(I, z, Aineq, bineq, [], [], lb, ub, z, options);
if exitflag<=0
keyboard
end
% Compute norms of residuals
f = spapi(k, x, zp);
dataResidual = fnval(f, xq) - yq;
projectionResidual=zp-z;
fval = norm(dataResidual).^2/2 + weight * norm(projectionResidual)^2/2;
% Compute jacobian and gradient
if nargout > 1
% Find active ctr
lb1 = [-1 0 0 0 0 0 0 0 0 0 0 0];
ub1 = [1 0 0 0 0 0 0 0 0 0 0 0];
Aineq = [lb1; ub1; Aineq];
bineq = [0; 0; bineq];
idx_active = abs(Aineq * zp - bineq) < 1e-6;
A = Aineq(idx_active,:);
JacP = I - pinv(A) * A;
B = spapi(k, x, eye(numel(x)));
jacobianT = fnval(B, xq);
grad = (jacobianT.' * JacP).'*dataResidual(:) + weight * (JacP-I).'*projectionResidual;
end
end
SA-W
on 1 Mar 2025
But still, the first-order optimality is relatively large (0.0026) and many cg iterations were necessary. I would have expected that this toy problem gives the reliable exitFlag=1, indicating that a minimum with grad=0 was found.
So is it to be expected that the additional projection onto the linearly constrained region can cause the optimizer to not find the minimum, even for simple problems such as this one?
Matt J
on 1 Mar 2025
I suspect that the reason first order optimality doesn't reach a lower value is that the objective is not differentiable at the solution. Here is a simplified idea in 2D of what the objective function surface looks like. The non-differentiable points are at the edges of the linearly constrained region (and elsewhere). In this example, that region is the simple box bounded by [-1, 1, -1, 1]. In your case, the solution will always be at the edge of the constrained region because you are constraining x(1)=0.
weight=1;
p=@(r) clip(r,-1,1); %project onto feasible set - a box
f=@(x,y) norm(p([x,y])-1.5).^2 + weight*norm(p([x,y])-[x,y]).^2;
fsurf(f,[-1, 1, -1, 1]*3); view(-150,35)
Warning: Function behaves unexpectedly on array inputs. To improve performance, properly vectorize your function to return an output with the same size and shape as the input arguments.

Matt J
on 2 Mar 2025
One remedy appears to be to slightly relax your bound constraints e.g. lb(1)=0, ub(1)=1e-8 so that the solution can be approached from the interior,
clear; close all; clc,
% Reference spline
x = linspace(3, 36, 12);
y = [0.000000000000000
0.000081435666435
0.000651485331480
0.002051866299466
0.004821212148785
0.011026195105394
0.018337300345558
0.025720823074519
0.033104345803481
0.040487868532442
0.047971929673952
0.056059221290756];
k = 4;
fref = spapi(k, x, y);
% Generate data
xq = linspace(3, 36, 10);
yq = fnval(fref, xq);
% Linear constraints
N=numel(x);
B = spapi(k, x, eye(N));
Bdd = fnder(B, 2);
Aineq = [-B.coefs'; -Bdd.coefs'];
bineq = zeros(size(Aineq, 1), 1);
%Aeq=[1,zeros(1,N-1)]; beq=0;
Aeq=[];beq=[];
% Bound constraints to fix x(1) = 0
lb = -Inf(N, 1); ub = +Inf(N, 1); lb(1) = 0.0; ub(1) = 1e-8;
%lb=[]; ub=[];
% Solver options
options = optimoptions("fmincon", "Algorithm","interior-point",...
"FiniteDifferenceStepSize", 1e-6, ...
"SpecifyObjectiveGradient", true, "OptimalityTolerance",1e-12,'FunctionTolerance',1e-6,...
"StepTolerance", 1e-12, ...
"Display", "off");
% Start point
x0=-(x-x(1));
% Weights for constraint violation: weight*||zp - z||^2
weights = [1,10,100,1000,1e4,1e6];
figure;
% Loop through weights, solve problem, and plot solution
for w = 1:numel(weights)
% Function handle
objectiveFunc = @(y) Objective_func(y, k, x, Aineq, bineq, Aeq,beq, lb, ub, weights(w), xq, yq);
% Call optimizer
[sol, ~, exitFlag, output, ~, ~, ~] = fmincon(objectiveFunc, x0, Aineq, bineq, Aeq,beq, lb, ub, [], options);
fprintf('output for weight %g\n\n', weights(w));
disp(output);
% Plot data and solution
subplot(2, 3, w);
f = spapi(k, x, sol);
fnplt(f); hold on;
plot(xq, yq, 'r+');
title(sprintf('weight = %g', weights(w)))
hold off
end
output for weight 1
iterations: 49
funcCount: 58
constrviolation: 0
stepsize: 9.9081e-17
algorithm: 'interior-point'
firstorderopt: 5.3626e-10
cgiterations: 0
message: 'Local minimum possible. Constraints satisfied....'
bestfeasible: [1x1 struct]
output for weight 10
iterations: 87
funcCount: 98
constrviolation: 0
stepsize: 4.4570e-17
algorithm: 'interior-point'
firstorderopt: 1.1038e-10
cgiterations: 1
message: 'Local minimum possible. Constraints satisfied....'
bestfeasible: [1x1 struct]
output for weight 100
iterations: 112
funcCount: 121
constrviolation: 0
stepsize: 1.0470e-16
algorithm: 'interior-point'
firstorderopt: 4.2409e-10
cgiterations: 0
message: 'Local minimum possible. Constraints satisfied....'
bestfeasible: [1x1 struct]
output for weight 1000
iterations: 121
funcCount: 151
constrviolation: 0
stepsize: 3.4116e-17
algorithm: 'interior-point'
firstorderopt: 8.2982e-11
cgiterations: 8
message: 'Local minimum possible. Constraints satisfied....'
bestfeasible: [1x1 struct]
output for weight 10000
iterations: 122
funcCount: 152
constrviolation: 0
stepsize: 1.6683e-12
algorithm: 'interior-point'
firstorderopt: 9.0796e-10
cgiterations: 34
message: 'Local minimum possible. Constraints satisfied....'
bestfeasible: [1x1 struct]
output for weight 1e+06
iterations: 167
funcCount: 273
constrviolation: 0
stepsize: 4.3343e-13
algorithm: 'interior-point'
firstorderopt: 3.2239e-07
cgiterations: 123
message: 'Local minimum possible. Constraints satisfied....'
bestfeasible: [1x1 struct]

function [fval, grad] = Objective_func(z, k, x, Aineq, bineq, Aeq,beq, lb, ub, weight, xq, yq)
% y are the current parameters
z=z(:); xq=xq(:); yq=yq(:); I = eye(numel(z));
% Project z onto linearly constrained region
tol = 1e-12;
options = optimoptions('lsqlin','Display', 'off', ...
FunctionTolerance=tol, StepTolerance=tol, OptimalityTolerance=tol);
[zp,~,~,exitflag] = lsqlin(I, z, Aineq, bineq, Aeq,beq, lb, ub, z, options);
% if exitflag<=0
% keyboard
% end
% Compute norms of residuals
f = spapi(k, x, zp);
dataResidual = fnval(f, xq) - yq;
projectionResidual=zp-z;
fval = norm(dataResidual).^2/2 + weight * norm(projectionResidual)^2/2;
% Compute jacobian and gradient
if nargout > 1
% Find active ctr
A = getActiveConstraints(zp,Aineq,bineq,Aeq,beq,lb,ub);
JacP = I - pinv(A) * A;
B = spapi(k, x, eye(numel(x)));
jacobianT = fnval(B, xq);
grad = (jacobianT.' * JacP).'*dataResidual(:) + weight * (JacP-I).'*projectionResidual;
end
end
function A = getActiveConstraints(zp,Aineq,bineq,Aeq,beq,lb,ub)
arguments
zp (:,1);
Aineq,bineq,Aeq,beq;
lb (:,1);
ub (:,1);
end
I=speye(numel(zp));
A=cell(4,1);
if ~isempty(Aineq)
idx=abs(Aineq * zp - bineq) < 1e-6;
A{1}=Aineq(idx,:);
end
% if ~isempty(Aeq)
% idx=abs(Aeq * zp - beq) < 1e-6;
% A{2}=Aeq(idx,:);
% end
A{2}=Aeq; %If lsqlin was successful, equality constraints will always be active
if ~isempty(ub)
idx=abs(zp - ub) < 1e-6;
A{3}=full(I(idx,:));
end
if ~isempty(lb)
idx=abs(zp - lb) < 1e-6;
A{4}=full(-I(idx,:));
end
A=vertcat(A{:});
end
SA-W
on 2 Mar 2025
Although it may have worked here by relaxing ub(1) to 1e-8 (instead of ub(1) = lb(1)), in general, if I expect many constraints to be active (for instance, also the linear constraints f'' = 0), then this strategy may not be the best, since the solver may get stuck at non-differentiable points.
So the strategy in your other answer, namely to return Inf when constraints are violated, is more robust, right?
Matt J
on 3 Mar 2025
Edited: Matt J
on 3 Mar 2025
It appears below that, as long as the interior-point method is used, you can have active constraints at the solution. Here we impose x(1)=0 using Aeq,beq instead of bounds.
However, it also appears that you must exclude equality constraints as arguments from fmincon,
fmincon(objectiveFunc, x0, Aineq,bineq, [],[], lb,ub, [], options)
(but still enforce them through lsqlin). I'm a bit foggy on why this is. It's obviously very specific to the interior-point algorithm, because things work very reliably with it, and less reliably when you switch to another algorithm, e.g. sqp.
clear; close all; clc,
% (Data, spline, and basis setup as in the previous example: x, y, k, fref, xq, yq, N, B, Bdd)
Aineq = [-B.coefs'; -Bdd.coefs'];
bineq = zeros(size(Aineq, 1), 1);
Aeq=[1,zeros(1,N-1)]; beq=0;
%Aeq=[];beq=[];
% Bound constraints to fix x(1) = 0
lb = -Inf(N, 1); ub = +Inf(N, 1); lb(1) = -1; ub(1) = +1;
%lb=[]; ub=[];
% Solver options
options = optimoptions("fmincon", "Algorithm","interior-point",...
"FiniteDifferenceStepSize", 1e-6, ...
"SpecifyObjectiveGradient", true, "OptimalityTolerance",1e-12,'FunctionTolerance',1e-12,...
"StepTolerance", 1e-12, ...
"Display", "off");
% Start point
x0=-(x-x(1));
% Weights for constraint violation: weight*||zp - z||^2
weights = [1,10,100,1000,1e4,1e6];
figure;
% Loop through weights, solve problem, and plot solution
for w = 1:numel(weights)
% Function handle
objectiveFunc = @(y) Objective_func(y, k, x, Aineq, bineq, Aeq,beq, lb, ub, weights(w), xq, yq);
% Call optimizer
[sol, ~, exitFlag, output, ~, ~, ~] = fmincon(objectiveFunc, x0, Aineq,bineq, [],[], lb,ub, [], options);
fprintf('output for weight %g\n\n', weights(w));
disp(output);
% Plot data and solution
subplot(2, 3, w);
f = spapi(k, x, sol);
fnplt(f); hold on;
plot(xq, yq, 'r+');
title(sprintf('weight = %g', weights(w)))
hold off
end
output for weight 1
iterations: 40
funcCount: 45
constrviolation: 0
stepsize: 1.0095e-16
algorithm: 'interior-point'
firstorderopt: 1.2801e-08
cgiterations: 0
message: 'Local minimum possible. Constraints satisfied....'
bestfeasible: [1x1 struct]
output for weight 10
iterations: 93
funcCount: 98
constrviolation: 0
stepsize: 1.1408e-16
algorithm: 'interior-point'
firstorderopt: 2.0523e-09
cgiterations: 0
message: 'Local minimum possible. Constraints satisfied....'
bestfeasible: [1x1 struct]
output for weight 100
iterations: 112
funcCount: 117
constrviolation: 0
stepsize: 5.7230e-17
algorithm: 'interior-point'
firstorderopt: 7.0198e-09
cgiterations: 0
message: 'Local minimum possible. Constraints satisfied....'
bestfeasible: [1x1 struct]
output for weight 1000
iterations: 119
funcCount: 131
constrviolation: 0
stepsize: 8.1106e-17
algorithm: 'interior-point'
firstorderopt: 9.3932e-11
cgiterations: 9
message: 'Local minimum possible. Constraints satisfied....'
bestfeasible: [1x1 struct]
output for weight 10000
iterations: 117
funcCount: 126
constrviolation: 0
stepsize: 5.3344e-17
algorithm: 'interior-point'
firstorderopt: 1.6398e-07
cgiterations: 10
message: 'Local minimum possible. Constraints satisfied....'
bestfeasible: [1x1 struct]
output for weight 1e+06
iterations: 150
funcCount: 197
constrviolation: 0
stepsize: 1.2570e-12
algorithm: 'interior-point'
firstorderopt: 1.0282e-04
cgiterations: 59
message: 'Local minimum possible. Constraints satisfied....'
bestfeasible: [1x1 struct]

function [fval, grad] = Objective_func(z, k, x, Aineq, bineq, Aeq,beq, lb, ub, weight, xq, yq)
% y are the current parameters
z=z(:); xq=xq(:); yq=yq(:); I = eye(numel(z));
% Project z onto linearly constrained region
tol = 1e-12;
options = optimoptions('lsqlin','Display', 'off', ...
FunctionTolerance=tol, StepTolerance=tol, OptimalityTolerance=tol);
[zp,~,~,exitflag] = lsqlin(I, z, Aineq, bineq, Aeq,beq, lb, ub, z, options);
% if exitflag<=0
% keyboard
% end
% Compute norms of residuals
f = spapi(k, x, zp);
dataResidual = fnval(f, xq) - yq;
projectionResidual=zp-z;
fval = norm(dataResidual).^2/2 + weight * norm(projectionResidual)^2/2;
% Compute jacobian and gradient
if nargout > 1
% Find active ctr
A = getActiveConstraints(zp,Aineq,bineq,Aeq,beq,lb,ub);
JacP = I - pinv(A) * A;
B = spapi(k, x, eye(numel(x)));
jacobianT = fnval(B, xq);
grad = (jacobianT.' * JacP).'*dataResidual(:) + weight * (JacP-I).'*projectionResidual;
end
end
function A = getActiveConstraints(zp,Aineq,bineq,Aeq,beq,lb,ub)
arguments
zp (:,1);
Aineq,bineq,Aeq,beq;
lb (:,1);
ub (:,1);
end
I=speye(numel(zp));
A=cell(4,1);
if ~isempty(Aineq)
idx=abs(Aineq * zp - bineq) < 1e-6;
A{1}=Aineq(idx,:);
end
% if ~isempty(Aeq)
% idx=abs(Aeq * zp - beq) < 1e-6;
% A{2}=Aeq(idx,:);
% end
A{2}=Aeq; %If lsqlin was successful, equality constraints will always be active
if ~isempty(ub)
idx=abs(zp - ub) < 1e-6;
A{3}=full(I(idx,:));
end
if ~isempty(lb)
idx=abs(zp - lb) < 1e-6;
A{4}=full(-I(idx,:));
end
A=vertcat(A{:});
end
More Answers (2)
Shubham
on 5 May 2023
Hi SA-W,
Yes, there are a few options you can try to ensure that the linear inequality constraints are satisfied at intermediate iterations of the interior-point algorithm in fmincon:
- Tighten the tolerances: By tightening the tolerances in the fmincon options, you can force the algorithm to take smaller steps and converge more slowly, but with higher accuracy. This may help to ensure that the linear inequality constraints are satisfied at all intermediate iterations.
- Use a barrier function: You can try using a barrier function to penalize violations of the linear inequality constraints. This can be done by adding a term to the objective function that grows very large as the constraints are violated. This will encourage the algorithm to stay within the feasible region defined by the constraints.
- Use a penalty function: Similar to a barrier function, a penalty function can be used to penalize violations of the linear inequality constraints. However, instead of growing very large, the penalty function grows linearly with the degree of violation. This can be a more computationally efficient approach than a barrier function.
- Use a combination of methods: You can try using a combination of the above methods to ensure that the linear inequality constraints are satisfied at all intermediate iterations. For example, you could tighten the tolerances and use a penalty function or barrier function to further enforce the constraints.
It's important to note that these methods may increase the computational cost of the optimization problem, so it's important to balance the accuracy requirements with the available resources.
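As a rough illustration of the penalty idea from the list above (not the asker's actual objective; the base objective, data, and weight mu are made up), a quadratic penalty on the violation max(A_ineq*x - b_ineq, 0) can be added to the function handed to fmincon:
% Sketch: quadratic penalty on violations of A_ineq*x <= b_ineq (hypothetical objective and weight)
A_ineq = [1 -1 0 0 0; -1 2 -1 0 0; 0 -1 2 -1 0; 0 0 -1 2 -1; 0 0 0 1 -1];
b_ineq = zeros(5,1);
mu = 1e3;                                           % penalty weight, to be tuned
baseObj   = @(x) sum((x - [0;0.1;0.3;0.6;1]).^2);   % hypothetical objective
violation = @(x) max(A_ineq*x - b_ineq, 0);         % elementwise constraint violation
penObj    = @(x) baseObj(x) + mu*sum(violation(x).^2);
x0 = [0; 0.2; 0.4; 0.6; 0.8];
x  = fmincon(penObj, x0, A_ineq, b_ineq, [], [], zeros(5,1), ones(5,1));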
Matt J
on 5 May 2023
Within your objective function, check if the linear inequalities are satisfied. If they are not, abort the function and return Inf. Otherwise, compute the output value as usual.
fun=@(x) myObjective(x, A_ineq,b_ineq);
x=fmincon(fun, x0,A_ineq,b_ineq,[],[],lb,ub,[],options)
function fval=myObjective(x, A_ineq,b_ineq)
if ~all(A_ineq*x<=b_ineq)
fval=Inf; return
end
fval=...
end