Best choice of fmincon algorithm for high-cost 1D convex problem with "stochastic" gradient information?

I am working on finding the melting point of a molecular dynamics simulation model by recasting it as a minimization problem. I have developed a high-cost stochastic blackbox function that I want to minimize.
  • Behind the blackbox, I'm running a molecular dynamics simulation of a solid-liquid interface at a specific input temperature T and detecting how the fraction of liquid vs solid changes in time. The MD simulation runs until the system completely freezes, completely melts, or a time limit is reached.
  • If the simulation freezes or melts, the blackbox function returns the value f(T) = 1/t_pc, where t_pc is the time it takes for the system to melt or freeze (pc = phase change).
  • The blackbox function also returns a "gradient", equal to g(T) = s/t_pc, where s = -1 if the system freezes or s = +1 if the system melts.
  • This isn't quite an exact gradient for the function, as there is some stochasticity in t_pc, but I want to provide fmincon with information about the direction to take for the next step.
  • The time it takes to melt or freeze also increases as you approach the melting point, so s/t_pc is a natural choice of an approximate gradient and 1/t_pc is a natural choice for the blackbox function value. As t_pc goes to infinity, f(T) goes to 0.
  • If the simulation reaches the time limit without melting or freezing, I declare that the melting point is reached and return f = 0 and g = 0. I have fmincon 'FunctionTolerance' set to 0 so that the algorithm stops at the melting point.
  • Running this simulation is quite costly and takes ~30 minutes on 32 high-performance compute cores for each iteration, so I want to reduce the number of function evaluations as much as possible.
  • The problem is only 1-dimensional (the input variable is the temperature) and should be convex, although it is only "stochastically convex" near the melting point (the exact time to freeze or melt depends on the particular simulation trajectory, which is chosen randomly from an ensemble). Still, the direction of the gradient is always correct, so it is useful to keep this information.
My question is: which of the five fmincon algorithms is most suitable for my case? Or perhaps there is another MATLAB function better suited than fmincon? I don't have a strong grasp of how efficient each algorithm would be for my problem, even after reading through the documentation. Again, I really want to reduce the number of function evaluations as much as possible, but also use the gradient-direction information that I have.
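For concreteness, here is a minimal sketch of my current setup; blackboxMD, T0, Tlo, and Thi are placeholder names for my actual simulation wrapper, starting temperature, and temperature bounds:

```matlab
% Sketch of the current setup: a single-variable fmincon call that
% supplies the approximate "gradient" returned by the MD blackbox.
function T_melt = findMeltingPoint(T0, Tlo, Thi)
    opts = optimoptions('fmincon', ...
        'SpecifyObjectiveGradient', true, ...  % use the s/t_pc estimate
        'FunctionTolerance', 0, ...            % stop only when f reaches 0
        'Display', 'iter');
    T_melt = fmincon(@objective, T0, [], [], [], [], Tlo, Thi, [], opts);
end

function [f, g] = objective(T)
    % blackboxMD (hypothetical wrapper) runs the MD simulation and reports
    % the time to phase change and froze = +1 (froze) or -1 (melted).
    [t_pc, froze] = blackboxMD(T);
    if isinf(t_pc)              % time limit hit: at the melting point
        f = 0;  g = 0;
    else
        f = 1/t_pc;             % small near the melting point
        g = -froze/t_pc;        % negative when too cold, positive when too hot
    end
end
```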

Answers (1)

John D'Errico
John D'Errico on 3 Dec 2021
Edited: John D'Errico on 3 Dec 2021
A 1-d problem, so ONE unknown? Don't use fmincon at all!
And if you wish to supply the gradient, then you cannot supply only an approximate gradient. That will just cause problems. Fmincon does not have an option where you can give it hints.
You MIGHT consider fminbnd, a tool designed to solve 1-d problems.
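A minimal sketch of such a call (the objective wrapper and temperature bounds are placeholders):

```matlab
% fminbnd: derivative-free 1-d bounded minimization using golden-section
% search plus parabolic interpolation. It never needs (or accepts)
% gradient information, only a function handle and bounds.
opts = optimset('TolX', 0.5, 'Display', 'iter');  % stop once bracketed to 0.5 K
[T_best, f_best] = fminbnd(@(T) blackboxValue(T), Tlo, Thi, opts);
```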
And of course, since you have a stochastic problem, in the sense that it will not even be differentiable near the solution, fmincon is again a terrible choice of tool. fmincon assumes differentiability. It will also assume your objective returns a consistent value, so if you evaluate the function a second time at the same point, it should return the same value. So even fminbnd will have issues with a stochastic objective.
I would probably just write my own code, where I use the last few objective evaluations to form a local polynomial approximation to the problem. At the same time, you can use that modeling to determine an approximation for the level of noise to be expected. And that will tell the algorithm when to quit, i.e., when further searching will not yield a better approximation to the solution.
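A rough sketch of that idea (illustrative only; assumes at least three stored evaluations in Ts and fs):

```matlab
% Fit a parabola to the most recent evaluations and step to its vertex;
% the fit residuals give a rough estimate of the noise level, which can
% drive the stopping rule.
function [T_next, noise_est] = quadStep(Ts, fs)
    k = min(numel(Ts), 5);                 % use the last few points
    T = Ts(end-k+1:end);  f = fs(end-k+1:end);
    p = polyfit(T, f, 2);                  % local quadratic model
    T_next = -p(2)/(2*p(1));               % vertex of the parabola
    resid = f - polyval(p, T);
    noise_est = std(resid);                % residual scatter ~ noise level
end
```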
Finally, could you use such a tool to include prior information on the approximate gradient as a hint in the search? Well, yes, you could. Now the problem will start to look like a Kalman filter. Estimate the coefficients of a local quadratic polynomial, given some prior information about the derivative of said polynomial. Then find the minimum of the polynomial as your next iteration. Since you also have information about the uncertainty in your coefficients from the Kalman filter, that will tell you how much you can trust that next step, and tell you when to stop looking. This is code you would write yourself, of course.
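A rough sketch of one such Kalman-style update, assuming a quadratic model f(T) ~ a*T^2 + b*T + d and assumed measurement variances Rf and Rg for the value and derivative:

```matlab
% Treat the quadratic coefficients c = [a; b; d] as the state and fold
% in each new (value, derivative) pair as two noisy measurements.
function [c, P] = quadKalmanUpdate(c, P, T, f_obs, g_obs, Rf, Rg)
    H = [T^2, T, 1;        % value observation:      f  = a*T^2 + b*T + d
         2*T, 1, 0];       % derivative observation: f' = 2*a*T + b
    z = [f_obs; g_obs];
    R = diag([Rf, Rg]);
    S = H*P*H' + R;        % innovation covariance
    K = P*H'/S;            % Kalman gain
    c = c + K*(z - H*c);   % updated coefficient estimate
    P = (eye(3) - K*H)*P;  % updated uncertainty: tells you when to stop
end
% Next iterate: minimum of the current quadratic, T_next = -c(2)/(2*c(1)).
```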
  4 Comments
Hayden Scheiber
Hayden Scheiber on 3 Dec 2021
Edited: Hayden Scheiber on 3 Dec 2021
Thank you for your responses John, they have been helpful.
I would classify my problem as only "slightly stochastic". With repeated simulations, it seems to reproduce the same evaluation for a given input temperature with only a small error. In the thermodynamic limit, my function would become fully deterministic. The stochastic derivatives are close enough to deterministic that, in preliminary tests so far, I can get a good answer using the default deterministic fmincon algorithm in most cases. I still need to check just how noisy it actually is.
I realize that using non-exact derivatives will cost me compared to a case with exact derivatives. However, I think it remains to be seen whether my slightly stochastic derivatives speed up convergence compared to not using them (direct search). It all depends on how much error is really in these blackbox function evaluations, which I have not yet quantified. I think I'll test this with repeated calls at the same temperature to build up a distribution of outcomes.
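Something like the following sketch (blackboxValue and T_test are placeholders; n kept small since each run costs ~30 minutes on 32 cores):

```matlab
% Noise check: repeat the blackbox at a fixed temperature and look at
% the spread of outcomes across independent trajectories.
n = 5;
vals = zeros(n, 1);
for i = 1:n
    vals(i) = blackboxValue(T_test);   % new random trajectory each call
end
fprintf('mean f = %.3g, std f = %.3g (relative %.1f%%)\n', ...
        mean(vals), std(vals), 100*std(vals)/mean(vals));
```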
Regarding your third paragraph, that's simply not true. For example, Bayesian optimization is very well suited for stochastic objectives with a high evaluation cost, and apparently someone has recently developed a Bayesian optimization algorithm for use with noisy derivatives, although this feature is unfortunately not implemented in MATLAB's bayesopt. In any case, I think I'm just going to try a few solvers, including bayesopt, fminbnd, fmincon with different algorithms, and patternsearch, to explore the problem more deeply.
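For bayesopt, I'm thinking of something along these lines (a sketch; the bounds are placeholders, and it requires the Statistics and Machine Learning Toolbox):

```matlab
% Flagging the objective as non-deterministic lets the underlying
% Gaussian-process model account for simulation noise.
Tvar = optimizableVariable('T', [Tlo, Thi]);
results = bayesopt(@(x) blackboxValue(x.T), Tvar, ...
    'IsObjectiveDeterministic', false, ...
    'AcquisitionFunctionName', 'expected-improvement-plus', ...
    'MaxObjectiveEvaluations', 20);
T_best = results.XAtMinObjective.T;
```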
My main question here was which of the fmincon algorithms is best suited for a 1D convex problem like mine where derivatives are available (ignoring the issue of stochasticity).
Thanks for the suggestion about response surface optimization; it sounds similar to surrogate optimization, which is basically how Bayesian optimization functions.
John D'Errico
John D'Errico on 3 Dec 2021
Edited: John D'Errico on 3 Dec 2021
While there will certainly be tools in existence, they are not provided in the Optimization Toolbox. I lack experience with tools like bayesopt to help you in that respect, since I don't have that toolbox.


Release

R2021a
