fmincon user-supplied hessian inconsistency

I'm using fmincon to optimize a scalar function with this abridged set of options:
options = optimset('Display', 'iter-detailed', 'MaxFunEvals', 10000, ...
    'MaxIter', 1000, 'TolX', 1e-18, 'TolFun', 1e-18, 'GradObj', 'on', ...
    'Algorithm', 'trust-region-reflective', 'Hessian', ...
    'user-supplied');
The following is the call to fmincon, the objective function, and the constraint function. Note that all are nested functions, which allows extra parameters to be passed, and that unnecessary details have been left out.
[X,fval, exitflag, output, lambda, grad, hessian] = fmincon(@nestedfcn, min, [], [], [], [], lb, [], @cfcn, options);
function [c, ceq] = cfcn(min)
    for i = 1:length(min)
        if i ~= 9
            c(i) = -25000 + abs(min(i));
        else
            c(i) = -75000 + abs(min(i));
        end
    end
    ceq = [];
end
function [y, grady, H] = nestedfcn(min)
    a = F ./ min;
    [u,sig,vol] = truss2d(nnod,nel,e,a,conn,x,bc,f);
    y = vol;
    for i = 1:length(min)
        grady(i) = -F(i)*L / (sig(i)^2);
        for j = 1:length(min)
            if i == j
                H(i,j) = F(i)*L / (sig(i)^3);
            else
                H(i,j) = 0;
            end
        end
    end
end
The answer does not come out as it is supposed to and the Hessian is, at every point, a diagonal matrix, while fmincon outputs a non-diagonal matrix. For brevity, I've left out the specific matrices, but if it helps to arrive at a solution, I can post those as well (I think the particular values are irrelevant at the moment as I think it is an implementation issue).

Answers (1)

Matt J on 11 Jul 2013
Edited: Matt J on 11 Jul 2013
Well, I for one can't see why FMINCON would output a non-diagonal Hessian from what you've posted. The reason for that would have to be in details that you've omitted. Note, however, that the matrix returned by FMINCON will not be the Hessian purely of your objective function. As explained in the documentation, it will be the Hessian of the entire Lagrangian (i.e., it includes the Hessian of your constraints). So, if there are any other constraints you've omitted in an effort to simplify presentation, that could be an explanation.
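For reference, the Hessian of the Lagrangian mentioned above has the standard form below for inequality constraints c_i(x) <= 0 and equality constraints ceq_j(x) = 0. The notation is chosen here for illustration and is not taken from the post:

```latex
\nabla^2_{xx} L(x,\lambda)
  = \nabla^2 f(x)
  + \sum_i \lambda_i \, \nabla^2 c_i(x)
  + \sum_j \lambda_j \, \nabla^2 ceq_j(x)
```

When every constraint is linear or a simple bound, the constraint terms vanish and the expression reduces to the Hessian of the objective alone.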
Other than that, I see one red flag issue. The use of expressions abs(min(i)) in cfcn makes your constraints non-differentiable, violating smoothness assumptions of FMINCON. If you weren't aware of differentiability requirements, I assume you could have violated them in truss2d() as well. This could explain why you're not converging to the required point, especially (but not necessarily) if some of the desired min(i) lie near zero.
If the lb(i) you haven't shown are all >=0 then you can replace abs(min(i)) with just min(i) and that would solve the issue. A better solution though would be to rewrite the constraints you've shown as ub,lb bounds. For example
c(i) = -25000 + abs(min(i));
is equivalent to -25000<=min(i)<=25000 and you can use ub(i),lb(i) to express this constraint instead, merging with any lb that you already have.
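A minimal sketch of that conversion follows. It assumes 10 design variables (the length of the vectors quoted later in the thread). The names limit, lbExisting and x0 are placeholders introduced here, not names from the original code.

```matlab
% Sketch: express |x(i)| <= limit(i) as simple bounds instead of a
% nonlinear constraint function.
n = 10;                        % assumption: number of design variables
limit = 25000 * ones(1, n);
limit(9) = 75000;              % element 9 has the wider limit, as in cfcn

ub = limit;                    % x(i) <=  limit(i)
lb = -limit;                   % x(i) >= -limit(i)

% Merge with any lower bounds already in use.
% lbExisting is a hypothetical placeholder (here: no extra bounds).
lbExisting = -Inf(1, n);
lb = max(lb, lbExisting);      % keep the tighter of the two lower bounds

% The nonlinear constraint argument then becomes empty, for example:
% [X, fval] = fmincon(@nestedfcn, x0, [], [], [], [], lb, ub, [], options);
```

With only bounds left, the problem no longer needs a nonlinear constraint function at all, which also avoids the abs() non-differentiability.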

4 Comments

Thank you for your help. I now understand that I was in error regarding the diagonal nature of the Hessian. Additionally, for
'Algorithm', 'trust-region-reflective',
this cannot be used with the given constraints (or so I've been told by MATLAB), but
'Algorithm', 'active-set',
is used instead, which does not use an analytical Hessian. So, it really didn't matter what Hessian I supplied because it was never used and my question was a silly one. Regardless, I went ahead and followed through on your suggestion because I felt it was a good one.
Unfortunately, after changing the constraints to ensure differentiability, the answer remains the same. The solver is effectively treating the inequality constraints as equality constraints, and once they have been met, the code terminates. For example, the optimal min vector as outputted is:
min = [25000 25000 -25000 -25000 25000 25000 25000 -25000 75000 -25000]
when it should be,
min = [25000 25000 -25000 -25000 -0.007 25000 25000 -25000 37500 -25000].
So it seems that the code does not optimize any further once all of the constraints have been met, even though the function has yet to be minimized. This holds even if I use this output as the initial guess for an additional optimization run, i.e., the output is unchanged from the input.
I suppose that is a different question altogether. Once again, thanks for your help. I might have to post a different question since the original question is no longer the topic of this post.
Well, there are a few tests I can recommend. You have two candidate solutions, one of which you think is the correct one and one of which fmincon thinks is the correct one.
minPhil = [25000 25000 -25000 -25000 -0.007 25000 25000 -25000 37500 -25000]
minFmincon = [25000 25000 -25000 -25000 25000 25000 25000 -25000 75000 -25000]
You should evaluate the objective function at both candidates (and make sure they both satisfy constraints to within TolCon). If minFmincon has the lower value, you will know that fmincon is correct, and you have an error either in your expectations or in the code you have provided to it. Otherwise, it means that minFmincon is a sub-optimal local minimum, and you simply chose a bad initial guess.
The second test you should do is to run fmincon again, this time with minPhil as the initial guess. If fmincon terminates without changing minPhil (much), you will verify that minPhil is at least a local minimum. Otherwise, your code or expectations are again in doubt.
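The first test above could be sketched roughly as follows. The bounds mirror cfcn, tolCon = 1e-6 is an assumed tolerance, and objfun is only a PLACEHOLDER so that the script runs on its own. Its values mean nothing for the truss problem, so it needs to be swapped for the real objective (nestedfcn) before drawing any conclusion.

```matlab
% Candidate solutions quoted in the thread.
minPhil    = [25000 25000 -25000 -25000 -0.007 25000 25000 -25000 37500 -25000];
minFmincon = [25000 25000 -25000 -25000 25000 25000 25000 -25000 75000 -25000];

% Bounds equivalent to cfcn, and an assumed feasibility tolerance.
ub = 25000 * ones(1, 10);
ub(9) = 75000;
lb = -ub;
tolCon = 1e-6;                 % assumption: substitute the TolCon actually used

% Largest bound violation of a point (0 means no bound is violated).
violation = @(x) max([lb - x, x - ub, 0]);
feasPhil    = violation(minPhil)    <= tolCon;
feasFmincon = violation(minFmincon) <= tolCon;

% PLACEHOLDER objective, not the truss volume. Replace with the real one.
objfun = @(x) sum(x.^2);
fPhil    = objfun(minPhil);
fFmincon = objfun(minFmincon);

fprintf('minPhil:    feasible = %d, f = %g\n', feasPhil, fPhil);
fprintf('minFmincon: feasible = %d, f = %g\n', feasFmincon, fFmincon);

% Second test (needs the real objective and the Optimization Toolbox):
% X2 = fmincon(@nestedfcn, minPhil, [], [], [], [], lb, ub, [], options);
% then compare X2 with minPhil.
```

Whichever feasible candidate gives the lower value of the real objective is the better of the two. The placeholder is there only to show the mechanics.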
Thanks again. I have done what you had suggested with no success in narrowing down the problem. The problem I am solving is well documented with regards to the solution and I am using the traditional initial guess.
When running fmincon using minFmincon as the initial guess, there is little to no change in the output, which may indicate a local minimum. However, under the Kuhn-Tucker (KKT) conditions (<http://www.mathworks.com/help/optim/ug/first-order-optimality-measure.html#brh0y76>), the gradient of the Lagrangian should be equal to zero at a solution. As you pointed out earlier, that gradient is either supplied by the user or calculated using finite differences, but my final outputted gradient is not equal to zero. This is curious, unless the gradient outputted by fmincon is not the gradient of the Lagrangian and so does not take the constraints into account.
Regardless of the non-zero gradient, my guess is that I have an issue with updating intermediate parameters used within the nestedfcn, i.e., the inputs to truss2d.
Thank you for all of your help; you have definitely clarified many of my issues and errors. While the problem is not solved yet, I don't believe that the remaining error is in the implementation of the built-in MATLAB functions (thanks to you), and I will rethink my intermediate updates and rewrite accordingly. Thanks again.
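One way to check the user-supplied derivatives in isolation is to compare them with central finite differences. The sketch below does that for a stand-in objective f(x) = sum(F .* L ./ x), which is only an ASSUMPTION about the form of the volume; the thread does not show truss2d, and F, L and x0 here are made-up illustrative values. If the objective did have this form, the diagonal Hessian entries would be 2*F(i)*L/x(i)^3, so the factor of 2 is worth checking against the posted H.

```matlab
% ASSUMPTION: stand-in objective. F, L and x0 are illustrative values only.
F  = [1000; -2000; 1500];
L  = 360;
x0 = [20000; -15000; 10000];

f     = @(x) sum(F .* L ./ x);
gradf = @(x) -F .* L ./ x.^2;              % analytic gradient
hessf = @(x) diag(2 * F .* L ./ x.^3);     % analytic Hessian (diagonal)

n   = numel(x0);
gFD = zeros(n, 1);
HFD = zeros(n, n);
for k = 1:n
    h = 1e-4 * abs(x0(k));                 % step scaled to the variable
    e = zeros(n, 1);
    e(k) = h;
    gFD(k)    = (f(x0 + e) - f(x0 - e)) / (2 * h);          % central difference of f
    HFD(:, k) = (gradf(x0 + e) - gradf(x0 - e)) / (2 * h);  % central difference of gradient
end

gradErr = norm(gFD - gradf(x0)) / norm(gradf(x0));
hessErr = norm(HFD - hessf(x0), 'fro') / norm(hessf(x0), 'fro');
fprintf('relative gradient error: %g\n', gradErr);
fprintf('relative Hessian error:  %g\n', hessErr);
```

The same loop can be pointed at the real objective by replacing f, gradf and hessf with wrappers around nestedfcn. The 'DerivativeCheck' option of optimset offers a built-in comparison of the supplied gradient against finite differences.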
Matt J on 17 Jul 2013
Edited: Matt J on 17 Jul 2013
"but my final outputted gradient is not equal to zero."
I'm not sure whether the gradient output is the gradient of the objective function or of the Lagrangian. You can experiment with a few simple problems to check. Either way, even at an unconstrained minimum, the gradient wouldn't be exactly zero. There are other stopping criteria besides the first order optimality measure that FMINCON uses to decide whether to stop iterating. There's TolX, TolFun, etc... It certainly won't wait until you land smack dab on top of an optimal solution.
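As one such simple problem, here is a toy example (not from the thread) that can be worked by hand: minimize f(x) = (x + 1)^2 subject to x >= 0. It illustrates why the objective gradient need not vanish at a solution that sits on a bound, while the Lagrangian gradient does. My reading of the fmincon documentation is that the grad output is the gradient of the objective, but that is worth confirming on a problem like this one.

```matlab
% Toy problem: minimize f(x) = (x + 1)^2 subject to x >= 0.
% The constrained minimizer is x = 0, which lies on the lower bound.
lb    = 0;
xStar = 0;

gradObj = 2 * (xStar + 1);     % objective gradient at the solution

% KKT stationarity with the bound written as c(x) = lb - x <= 0:
%   gradObj + lambdaLower * dc/dx = gradObj - lambdaLower = 0
lambdaLower    = gradObj;      % multiplier that makes the Lagrangian stationary
gradLagrangian = gradObj - lambdaLower;

fprintf('objective gradient:  %g\n', gradObj);
fprintf('Lagrangian gradient: %g\n', gradLagrangian);
```

In the thread's solution, most components sit on their bounds, so nonzero entries in the objective gradient at those components would be consistent with a KKT point.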


Asked on 11 Jul 2013
