different results from fmincon with nonlcon
Hi, my problem is that an optimization with fmincon (see below) is not stable, i.e. different runs deliver different results. The optimization problem is described below:
Many thanks in advance for your help!
Best Wishes, Alex
The function to be optimized:
fun = @(x) x*meanOrRegressed   % linear objective: weighted sum of mean returns
The non-linear constraint (sum of all positive weights not above 1.3, and sum of all negative weights not below -0.3):
function [c,ceq,gradc,gradceq] = nonlcon_g(x,wMax,wMin)
% Nonlinear constraints: sum of positive weights <= wMax (here 1.3),
% sum of negative weights >= -wMin (here -0.3). The last asset is excluded.
nrAssets = size(x,1);
weightSumMax = sum(max(0,x(1:nrAssets-1)));
c(1) = weightSumMax - wMax;
weightSumMin = sum(min(0,x(1:nrAssets-1)));
c(2) = -weightSumMin - wMin;
% Subgradients; note sign(x) is undefined (returns 0) at x = 0
gradc1 = 1/2*(sign(x(1:nrAssets-1)) + ones(nrAssets-1,1));
gradc1 = vertcat(gradc1,0);
gradc2 = -1/2*(sign(x(1:nrAssets-1)) - ones(nrAssets-1,1));  % minus sign: c(2) negates the sum
gradc2 = vertcat(gradc2,0);
gradc = horzcat(gradc1,gradc2);   % nrAssets-by-2, as fmincon expects
ceq = [];
gradceq = [];
end
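For readers who want to check the constraint logic outside MATLAB, here is a rough Python translation (function and variable names, and the sample x, are my own). It evaluates the two inequality constraints and their subgradients; note that sign() makes the gradient undefined at exactly zero:

```python
import numpy as np

def nonlcon_g(x, w_max=1.3, w_min=0.3):
    """Illustrative Python translation of the MATLAB constraint function above."""
    xi = x[:-1]  # the last asset is excluded, as in the original code
    c = np.array([
        np.sum(np.maximum(0.0, xi)) - w_max,    # sum of positive weights <= 1.3
        -np.sum(np.minimum(0.0, xi)) - w_min,   # sum of negative weights >= -0.3
    ])
    # subgradients; sign() is 0 at xi == 0, where the true gradient is undefined
    g1 = np.append(0.5 * (np.sign(xi) + 1.0), 0.0)
    g2 = np.append(-0.5 * (np.sign(xi) - 1.0), 0.0)  # minus sign: c[1] negates the sum
    return c, np.column_stack([g1, g2])

x = np.array([0.5, -0.2, 0.4, 0.3, -0.1, 0.1])
c, grad = nonlcon_g(x)
print(c)  # both entries <= 0, so this sample x is feasible
```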
[weights,optimizedValue,exitflag] = fmincon(fun,initialWeights',A,b,Aeq,beq,lowerBound,upperBound,@(x)nonlcon_g(x,1.3,0.3),options)
A = [-1 -1 -1 -1 -1  0
      1  1  1  1  1  0
      0  0  0  0  0 -1
      0  0  0  0  0  1];
b = [0.3 1.3 0.3 1.3];
Aeq = [1 1 1 1 1 1];
beq = 1;
lowerBound = [-0.3 -0.3 -0.3 -0.3 -0.3 -0.3];
upperBound = [1.3 1.3 1.3 1.3 1.3 1.3];
meanOrRegressed = [ 2.349096891796729e-004
                   -7.582259013820250e-005
                    1.190461785891006e-003
                    2.529756213317396e-003
                    1.066862350689632e-003
                    5.133561643835617e-005];
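As a quick sanity check, the linear constraints and bounds can be verified against a candidate weight vector before the nonlinear ones enter the picture. A Python sketch (the test point x is arbitrary, not a solution):

```python
import numpy as np

A = np.array([[-1, -1, -1, -1, -1,  0],
              [ 1,  1,  1,  1,  1,  0],
              [ 0,  0,  0,  0,  0, -1],
              [ 0,  0,  0,  0,  0,  1]], dtype=float)
b = np.array([0.3, 1.3, 0.3, 1.3])
mu = np.array([2.349096891796729e-4, -7.582259013820250e-5, 1.190461785891006e-3,
               2.529756213317396e-3, 1.066862350689632e-3, 5.133561643835617e-5])

x = np.array([0.2, 0.2, 0.2, 0.2, 0.1, 0.1])   # arbitrary feasible test point
assert np.all(A @ x <= b)                      # linear inequalities A*x <= b
assert abs(x.sum() - 1.0) < 1e-12              # budget constraint Aeq*x = beq
assert np.all((-0.3 <= x) & (x <= 1.3))        # bound constraints
print(mu @ x)  # the linear objective value at x
```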
options (only the fields that were set; all others were []):
         MaxFunEvals: 60000
             MaxIter: 40000
              TolFun: 1.000000000000000e-010
         FinDiffType: 'forward'
           Algorithm: 'active-set'
          GradConstr: 'on'
          MaxSQPIter: 200000
              TolCon: 1.000000000000000e-008
         UseParallel: 'always'
Matt J
on 28 Mar 2014
The nonlinear constraints are not differentiable, so that could well be the cause of the instability.
Alexander
on 28 Mar 2014
The 2nd order derivatives have to be continuous as well...
It's also possible there's a problem in the objective function, which you haven't shown.
Aside from that, I think it might be a better idea to solve the problem combinatorially, rather than using complicated nonlinear constraints. Apply fmincon repeatedly in a loop over all 63 possible partitions of x into 2 subsets. In each pass of the loop, use lb,ub constraints to force one subset to be positive and one subset to be negative throughout the optimization. Use linear constraints A*x<=b to force the sum of the positive subset to be <=1.3 and the sum of the negative subset to be >=-0.3. When the loop finishes, look at all 63 solutions and take the one with the minimum objective function value.
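To make the combinatorial idea concrete, here is a sketch in Python (itertools-based; all names are illustrative) of the 63-pattern enumeration and the per-pattern bounds it implies:

```python
import itertools
import numpy as np

n = 6  # number of assets
# Each pattern assigns every asset to the positive (+1) or negative (-1) subset.
# The all-negative pattern is dropped: those weights could never sum to 1.
patterns = [np.array(p) for p in itertools.product([1, -1], repeat=n)
            if any(s == 1 for s in p)]
print(len(patterns))  # 2^6 - 1 = 63 subproblems

# For one pattern, build the lb/ub vectors that force the assignment:
p = patterns[0]                 # here: every asset in the positive subset
lb = np.where(p > 0, 0.0, -0.3)
ub = np.where(p > 0, 1.3, 0.0)
# Within each subproblem the sum caps become *linear* constraints:
# sum over the positive subset <= 1.3, sum over the negative subset >= -0.3
```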
Alexander
on 28 Mar 2014
You can find the (simple) objective function above.
I notice now that the objective function is linear. This means that if you follow the combinatorial approach I outlined, the problem decomposes into 63 linear programming sub-problems which should be very rapidly solvable using LINPROG. That seems a lot more robust to me than trying to solve the problem non-linearly. Remember that a linear program is easy to minimize globally, unlike general nonlinear programs.
Because the dimension of your problem is small, it might even be faster than LINPROG to use LCON2VERT to solve the subproblems. This would allow you to find the vertices of the linearly constrained regions and evaluate the objective function there, exhaustively searching for the global min.
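The LP decomposition can be sketched in Python, with scipy.optimize.linprog standing in for MATLAB's LINPROG (the loop structure and variable names are my own, and the returns are the ones posted above):

```python
import itertools
import numpy as np
from scipy.optimize import linprog

mu = np.array([2.349096891796729e-4, -7.582259013820250e-5, 1.190461785891006e-3,
               2.529756213317396e-3, 1.066862350689632e-3, 5.133561643835617e-5])
n = len(mu)

best_val, best_x = np.inf, None
for pat in itertools.product([1, -1], repeat=n):
    p = np.array(pat)
    if np.all(p == -1):
        continue  # an all-negative subset cannot satisfy sum(x) == 1
    # bounds force the chosen sign pattern on each weight
    bounds = [(0.0, 1.3) if s > 0 else (-0.3, 0.0) for s in p]
    # sum of the positive subset <= 1.3; sum of the negative subset >= -0.3
    A_ub = np.vstack([np.where(p > 0, 1.0, 0.0),
                      np.where(p < 0, -1.0, 0.0)])
    b_ub = np.array([1.3, 0.3])
    res = linprog(mu, A_ub=A_ub, b_ub=b_ub,
                  A_eq=np.ones((1, n)), b_eq=[1.0],
                  bounds=bounds, method='highs')
    if res.success and res.fun < best_val:
        best_val, best_x = res.fun, res.x

print(best_val, best_x)
```

Each of the 63 subproblems is a tiny LP, so the whole sweep is essentially instantaneous, and the minimum over the subproblems is the global minimum of the original problem.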
Alexander
on 4 Apr 2014
I have one other idea, but we're really getting to the point where we need to see this "instability" quantified. How big are the differences in the solutions? How big are the differences in their objective function values? Are the differences not simply accounted for by TolX, TolFun, TolCon, etc.? Are you always running with the same starting x0? Even if all solutions are supposed to be global, note that they need not be unique. For example, if all mean asset returns are equal, any set of feasible weights is globally optimal. I assume this is less of a problem as the returns get more distinctive from each other, but it's not clear just how distinctive they need to be for a given TolFun, TolX, TolCon.
You should set the 'Display' option to 'iter' and show us the results of each iteration. You should also call fmincon with all 7 of its output arguments
[x,fval,exitflag,output,lambda,grad,hessian]= fmincon(...)
and showing us those as well. And, you should show us your latest implementation of nonlcon_g.
Having said all that, my one other suspicion of why things are unstable is that your objective function and constraints are all locally linear, assuming no x(i) falls into your small "eps" region. When everything is linear, your Hessian (of the Lagrangian) is singular almost everywhere. In fact, it is the zero matrix. Newton-like methods like active-set will do badly with such singularities. You might try the 'interior-point' algorithm instead and see if it makes a difference. It would also be interesting to see if the instability persists when you switch from a linear objective to a nonlinear one.
Alexander
on 5 Apr 2014
I don't see where you report the exitflag. It's possible, with your very small tolerances, that the code simply runs until the maximum number of iterations is reached, and fails to converge. The differences in output look rather small and inconsequential to me. On the other hand, if you use the same x0 all the time, it is unclear to me why fmincon doesn't behave deterministically, giving the same output all the time...
I would run your code if you also posted a "ready-to-run" script that sets up all the arguments (e.g., options, A, b, Aeq, beq) and passes them to fmincon. It might also be a good idea to post a .mat file with the 7 output arguments you see, so that people can compare them to the output they get.
Alexander
on 6 Apr 2014
Alexander
on 8 Apr 2014
Hi Alex,
I've run your code, and also made some modifications of my own, which I've attached.
I do not see the instability that you describe when I run your version. I ran 10 consecutive trials and verified that the answers were identical across all of them. However, I vaguely wonder if you might get slightly different results because of UseParallel being turned on. Parallel processing routines can cause things to be executed in a somewhat random order. In theory the order of parallel tasks shouldn't matter, but in practice, for parallel reduction operations, there can be slight order-dependencies in the results due to floating point precision issues. That's just a guess, though -- I'm not even sure there are any reduction operations when fmincon runs in parallel mode. In any case, I don't really see that UseParallel would help you in terms of speed for such a small computation as this. I think it might even hurt you.
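The floating-point order-dependence alluded to here is easy to demonstrate in any IEEE-754 language; a minimal Python example (the numbers are chosen purely for illustration):

```python
# Adding the same three numbers in two different orders gives different
# results: 1.0 is smaller than half an ulp of 1e20, so whether it is
# absorbed before or after the large terms cancel changes the answer.
a, b, c = 1e20, 1.0, -1e20
left = (a + b) + c    # 1.0 is lost before the large terms cancel -> 0.0
right = (a + c) + b   # the large terms cancel first              -> 1.0
print(left, right)
```

A parallel reduction that sums partial results in a nondeterministic order can therefore produce run-to-run differences in the last bits, even though each run is "correct" to machine precision.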
I do see limitations on the accuracy of the result. However, because you are not solving the original problem, but rather an epsilon-approximation of it, you can only expect to get approximate results. There is also an influence on accuracy of doing finite difference derivative computations.
In my version, I turned on the 'GradConstr' option and also switched back to the 'active-set' method. I also introduced a refinement step. It basically takes the solution of the epsilon-problem and uses that to decide which weights should be positive and which weights should be negative. It then constrains the search to that specific pattern of positives/negatives (similar to what I suggested earlier that you do combinatorically). This allows you to get rid of the nonlinear constraints, whereupon the solution becomes highly well-conditioned and accurate.
Finally, I corrected a bug in your implementations of max_smooth/min_smooth. You set nrAssets=length(x)-1 for some reason, when it really needs to be nrAssets=length(x).
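The refinement step described above can be sketched as follows (Python; the epsilon-solution vector is hypothetical, and the final re-solve itself is omitted):

```python
import numpy as np

# hypothetical output of the epsilon-smoothed fmincon run (not real data)
x_eps = np.array([0.02, 1.18, -0.01, -0.19, 0.0, 0.0])

# read off the sign pattern, then constrain the final solve to that pattern
pattern = np.where(x_eps >= 0, 1, -1)
lb = np.where(pattern > 0, 0.0, -0.3)   # positive subset: weights in [0, 1.3]
ub = np.where(pattern > 0, 1.3, 0.0)    # negative subset: weights in [-0.3, 0]
# With the pattern fixed, the sum caps become ordinary linear constraints and
# the nonsmooth nonlinear constraint function can be dropped entirely.
print(lb, ub)
```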
Alexander
on 23 Apr 2014