Reusing scores during genetic algorithm optimization

I'm using GA to optimize a very expensive objective function. Since this is deterministic there are a lot of repeat runs using exactly the same input producing exactly the same output, especially in the mid-late generations where the population is quite homogeneous.
Is there an option to tell GA to maintain a (partial) database of fitness scores for already tested inputs? e.g. store the best 100 individuals and when a new individual is being evaluated, first compare it to the database to see if the score is already known.

 Accepted Answer

I recommend that you use patternsearch instead of ga. You can set the Cache option to have patternsearch skip recalculating points that it already visited.
patternsearch has nearly the same calling syntax as ga, so it is easy to adapt your code. The only difference, other than option settings, is that ga takes N, the number of dimensions, as the second argument, while patternsearch uses a starting point. To choose a random starting point within finite bounds, you can try
x0 = lb + rand(size(lb)).*(ub - lb);
But, if you really want to use ga, you can program your objective function to check a history of recent points. Use a nested function to build the history along the lines of this example. Then check whether a point in the recent history is close enough to an individual, and use the previously calculated value of the objective, or maybe an interpolated value from some nearby points. This will take some programming, but I believe it can be done.
It is usually better to use patternsearch, though. It is faster, more reliable, and easier to tune.
Alan Weiss
MATLAB mathematical toolbox documentation

1 Comment

Thanks, I'll check out pattern search.
I already have a workaround hack (I just save a table of inputs and resulting scores to a mat file. The time to load it is insignificant), but it seemed like a common enough problem that I'm a bit surprised there's nothing built-in. I'm lazy, I like using stuff that other people wrote before doing it myself :D

Sign in to comment.

More Answers (1)

I have not noticed such a thing documented.
You can do it on your end, in the objective function, by using persistent variables (amongst other possibilities.)
If the persistent variables are the empty array, [], then the variables need to be initialized.
Upon entry to the function, check if you have a record of those particular inputs. If you do, return the recorded output.
Otherwise do the calculation and then store the inputs and outputs together in the persistent variables.
Depending how long your input vector is, you might want to use a "map" object to do the storage. With short vectors, ismember() might be okay; with medium vectors, you could use a branching table of cell arrays. Unless, that is, you are lucky enough that your inputs can be mapped into non-negative integers, in which case you could do that conversion and index an array. (If you have bounded ranges, then dividing the range up into substates of equal length can make it easy to do an approximate conversion that would at least allow you to reduce the space you searched over.)

Asked:

on 20 May 2013

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!