How does one verify the results of simulation from genetic algorithm

Recently I am using genetic algorithm to optimize a 2 by 2 controller.The optimization runs well and I manage to obtain the result.
However from the simulation result it is shown that the mean fitness value does not change much during the simulation as in it is almost a straight line for the whole simulation.I tried to change a few options like crossover,mutation,selection and even increase the population and the range of the initial population but there is no change on how the mean fitness proceed during the simulation.I attached a diagram here for illustration.I read a few examples on MATLAB and usually the way mean fitness proceed is almost like the best fitness value where there is a slight curve in a decreasing manner as we proceed from one generation to another generation.So my questions are below
1)Is it normal for the mean fitness value of my optimization process to behave in such a manner?If it is an indication of bad result,how can I proceed to improve the result of the simulation?
2)How does one confirm or verify that the results from the optimization via GA (genetic algorithm) is good?
Please kindly give me some guidance on this.Thanks.

2 Comments

Yes, this behaviour is possible (though may not be desirable). It could be that the population has prematurely converged to a sub-optimal solution. So it could be a "good" solution, just not the best one.
What changes to the mutation, crossover, and selection operators did you make? What population sizes have you been using? Because you can modify these parameters to help avoid or delay premature convergence (i.e. increase the mutation in a population).
On multiple runs of the GA, do you always get the same result (does the population always converge to the same solution)? The documentation states that the initial population is randomly generated using a uniform random number generator (doesn't say which one), but it could be that the same set of random numbers is being used for the initial populations at each run. (I suppose you could verify this if you can print out the initial populations.) It may be useful (and it can't hurt!) to seed the random number generators via rng. Use rng('shuffle') which seeds the random number generator based on the current time.
Hi Geoff.Nice to talk to you again.
The changes that I made on the options of the genetic algorithm are shown in the program that I attached (calculation_error2.m)
On multiple runs (5 runs to be exact) I am getting almost the same reading for the best fitness and the mean fitness value.I also initiate rng ('shuffle') as recommended by you.The value for best fitness also stay roughly the same during the simulation as shown on the command window from the diagnostic information.
How can I modify the parameters to avoid premature convergence and get the best solution?Is there any scientific way to achieve that rather than trial and error which is time consuming and not efficient?

Sign in to comment.

 Accepted Answer

Hi Kah,
Five runs without much change could mean either that your code has found the optimal solution or it is just getting stuck.
I noticed that you are using the following for the crossover
options = gaoptimset(options,'CrossoverFraction',0.8,...
'CrossoverFcn',@crossovertwopoint);
According to the documentation (see crossover options), this crossover function works as follows
Two point (@crossovertwopoint) selects two random integers m and n between 1 and Number of variables. The function selects
Vector entries numbered less than or equal to m from the first parent Vector entries numbered from m+1 to n, inclusive, from the second parent Vector entries numbered greater than n from the first parent. The algorithm then concatenates these genes to form a single gene.
For example, if p1 and p2 are the parents [a b c d e f g h] and [1 2 3 4 5 6 7 8] respectively, and the crossover points are 3 and 6, the function returns the following child [a b c 4 5 6 g h].
So while it does produce a child that is different from either parent, i.e. the child has a subset of the variables from both parents, it does not change the variables at all - they still have the same values as the parent variables. This means that the variable values will only change due to a mutation.
I suggest you try the crossoverintermediate, crossoverarithmetic, or crossoverheuristic. All three indicate that the child variable will be some combination of the two from the parents - so it won't necessarily be the same. The first two should ensure (if you leave the default Ratio as is) that the resulting child variable is still within the interval defined for that variable. The crossoverheuristic may not do that unless you set the Ratio to be one or less. It has an interesting algorithm whereby the child variable will be closer to the variable of the parent with the better fitness. So you could try this one first with a ratio of 0.95 say. The default ratio is 1.2 so you could leave as is, though your new child variable value may fall outside of the interval (unless the GA code can be told to enforce intervals).
_____________________________
I also noticed that the intervals for your 11 variables are very narrow. If you look at the difference between the lower and upper bound for each one, we see them as
1.123
3.38
0.999
0.68177
2.476
1.049
0.3739
2.193
0.3287
1.0671
0.691
For the seventh and ninth variables, the difference between their lower and upper bounds are 0.3739 and 0.3287 respectively. I don't know what either variable represents, but I just wonder what would happen if you fix both to something and then reduce your problem to a nine variable optimization problem, would that lead to a better solution?
Try changing the crossover operator to the heuristic one - leave the default ratio of 1.2 and see what happens. Continue to decrease it on subsequent runs to 0.95, 0.85,...,0.45 (maybe no lower than this as then the parent with the worse fitness will have its variables weighted more). If you see no change in the convergence of your solution, then try reducing your problem from 11 variables to 10, to 9 for those with the small intervals.
_____________________________
By the way, how long does it take one run of the GA (so population 3000, 100 iterations/generations)?

6 Comments

Hello Geoff
1)I tried to apply @crossoverintermediate,@crossoverarithmetic and @crossoverheuristic and at the same time manipulate the ratio as recommended but I still get back roughly the same value for the best fitness value and the mean fitness value. I think the optimal points are already achieved.I switched to use genetic algorithm to conduct the optimization because previously I was using "fminsearch" and the optimized points are no difference than the initial point that I inserted to initiate the optimization.
2)This must be an eleven variables problem as they represent the parameters of the controller that I am trying to optimize now.The range provided for the optimization are the parameter's lowest value and highest value at different operating regime.(I am investigating 4 different operating regime)
3)Sorry I dont know how to check the how long does it take one run of GA.How to do that?
Currently I am running more simulations and varying ratio of the @crossoverheuristic and then finally apply the optimized controller on my nonlinear plant.Hope I can get a better response compare to the default controller.
Kah - for Sorry I dont know how to check the how long does it take one run of GA.How to do that? Just use the tic and toc functions to time a single run. I was wondering if you could give a rough estimate - two minutes, ten, or thirty?
Hi Geoff.It takes only 20 seconds to finish the optimization process.
Hi Kah - so you could run 100s of simulations then without it taking too long. Since you always seem to be getting the same solution, I suppose you could try to "penalize" that one in the fitness function so that other optimal solutions may be encouraged to dominate. In the fitness function, you could write some code that would check to see if the 11 variables are close (to within some tolerance) of those of the usual solution. If true, then just increase (double?) the fitness function output.
Hi Geoff.
I don't understand what you mean above.Do you mind give me some example to illustrate what you mean?
Today I tried to run multiple simulations by manipulating the rate and ratio for @mutationuniform and @crossoverheuristics respectively and finally I manage to obtain optimized controller that give me good result compared to the default controller. I am running 10 simulations for each case as shown below.
Simulation 1 - @mutationuniform at rate of 0.3 1)@crossoverheuristics at ratio of 0.45 (10 simulations) 2)@crossoverheuristics at ratio of 0.85 (10 simulations) 3)@crossoverheuristics at ratio of 0.95 (10 simulations)
Simulation 2 - @mutationuniform at rate of 0.5 1)@crossoverheuristics at ratio of 0.45 (10 simulations) 2)@crossoverheuristics at ratio of 0.85 (10 simulations) 3)@crossoverheuristics at ratio of 0.95 (10 simulations)
I manage to get an optimized parameters that will improve the performance of the controller in the performance criterion that I am researching.The optimized controller is obtained when the rate for @mutationuniform is 0.5 and the ratio for @crossoverheuristics is 0.95 at 4th simulation.
The value for best fitness value and the mean fitness value are still quite close though.
:)
Awesome that you got some good results, Kah!
For the "penalty", I had meant something along the lines of: suppose that x1,x2,x3,…,x11 are the optimized values for each of the 11 values. If you pass these 11 variables into the fitness function, the scalar result is y (say). And you know all of this from your previous runs/simulations. Now suppose that a parent or child has 11 variables that are considered near-identical to x1,x2,…,x11 i.e. if x is the input vector to the fitness function, then within this function do
tolerance = 0.000001;
if abs(x1-x(1))<tolerance && ...
abs(x2-x(2))<tolerance && ...
%etc.
abs(x11-x(11))<tolerance
% the input variables are considered near-identical to the
% optimal solution found on previous runs, so just double
% the fitness function value
outFitnessVal = 2*y;
end
So now, this parent or child has a fitness function value that is double what it should be, and so it may be discouraged from "reproducing" and so another solution may become optimal.
Since your results improved with varying the mutation rate and the crossover ratio, you can now ignore this. Though it would be interesting to see what happens if you run 100s of simulations.
As for The value for best fitness value and the mean fitness value are still quite close though, I think that this is expected - over time, the population will converge to the optimal solution, so the average should start to edge closer to this optimal solution.

Sign in to comment.

More Answers (0)

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!