Why Genetic Programming result is not consistent?
2 views (last 30 days)
Show older comments
Benjamin
on 16 Jan 2020
Commented: Walter Roberson
on 17 Jan 2020
Hello
Dear experts
I am working on finding a relationship on some input parameters (X1, ... , X5) and an output parameter (Y). I am using Genetic Programming Toolbox in MATLAB and the provided example and just changed the input/output data to build my own equation. The problem is that every time I run the program it finds different parameters as "the best" model with different values for fitness and mse!
I don't know if that's how GP works, but is there any way that I make the code be in a way that considers all possible functions ('+','*',...) and find the best model?
I will attach my program here.
Thank you in advance for your help :)
4 Comments
Stephen23
on 16 Jan 2020
Edited: Stephen23
on 16 Jan 2020
@Benjamin: sorry, my mistake for confusing Genetic Programming with Genetic Algorithm ... a momentary synaptic lapse (hangs head in shame).
While FEX is a great resource of third-party code, you have to be very selective about choosing which submissions are worthwhile using or not: that might be a well-written toolbox... or it might not be. In any case, contact its author if there is something that you want to know that is not answered by reading its documentation.
Accepted Answer
Walter Roberson
on 16 Jan 2020
Genetic programming is always random.
And NO there is no way to get it to consider all combinations of operations while still using genetic programming: it is part of the definition of genetic programming.
3 Comments
Walter Roberson
on 16 Jan 2020
If you were to discover through the searches that,
X1 - X3/log(X4)
---------------
X2 * (X4^(X3 - log(X1)*X2)) - X3^(X2 - exp(X1))
-----------------
X4 / (log(X3) - X2)
gives a marginally better fit for Y... then chances are that you should be throwing that all out and starting over again.
Remember, that if you iterate through a bunch of models, then one of them is going to come out with the lowest residue. That doesn't mean that the one with the lowest residue one has any relationship at all to the true physics of the situation; there are a lot of mathematical accidents.
Walter Roberson
on 17 Jan 2020
I just spent a while fitting using the model
a1*X1^a2 + a3*X2^a4 + a5*X3^a6 + a7*X4^a8 + a8*X5^a9
At the moment the best residue I have found for that is near
+5.89527e-12 * X1^(14.0668) - 2.43172e+06 * X2^(-6.07609) - 141.217 * X3^(4.97211) + 1.36918e+07 * X4^(-16.9352) + 0.00796812 * X5^(0.000446426)
but that is a barely different residue from (for example)
+5.59686e-11 * X1^(15.8437) - 333705 * X2^(-5.46948) - 64.2897 * X3^(5.41647) + 1.36918e+07 * X4^(-11.4162) + 0.00838126 * X5^(0.115229)
Mathematical accidents. Cobbling together a formula that generates a marginally better residue is useless when you have no reason to believe that the formula has relevance to the real world.
More Answers (0)
See Also
Categories
Find more on Genetic Algorithm in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!