Why Genetic Programming result is not consistent?

2 views (last 30 days)
Hello
Dear experts
I am working on finding a relationship on some input parameters (X1, ... , X5) and an output parameter (Y). I am using Genetic Programming Toolbox in MATLAB and the provided example and just changed the input/output data to build my own equation. The problem is that every time I run the program it finds different parameters as "the best" model with different values for fitness and mse!
I don't know if that's how GP works, but is there any way that I make the code be in a way that considers all possible functions ('+','*',...) and find the best model?
I will attach my program here.
Thank you in advance for your help :)
  4 Comments
Stephen23
Stephen23 on 16 Jan 2020
Edited: Stephen23 on 16 Jan 2020
@Benjamin: sorry, my mistake for confusing Genetic Programming with Genetic Algorithm ... a momentary synaptic lapse (hangs head in shame).
While FEX is a great resource of third-party code, you have to be very selective about choosing which submissions are worthwhile using or not: that might be a well-written toolbox... or it might not be. In any case, contact its author if there is something that you want to know that is not answered by reading its documentation.
Benjamin
Benjamin on 16 Jan 2020
Oh no need to apologize Stephen :) I appreciate you response.
It is just interesting to me that such helpful Toolbox that MATLAB provides doesn't have a comprehensive documentation, and even not many ask and answers on Matlab community!

Sign in to comment.

Accepted Answer

Walter Roberson
Walter Roberson on 16 Jan 2020
Genetic programming is always random.
And NO there is no way to get it to consider all combinations of operations while still using genetic programming: it is part of the definition of genetic programming.
  3 Comments
Walter Roberson
Walter Roberson on 16 Jan 2020
If you were to discover through the searches that,
X1 - X3/log(X4)
---------------
X2 * (X4^(X3 - log(X1)*X2)) - X3^(X2 - exp(X1))
-----------------
X4 / (log(X3) - X2)
gives a marginally better fit for Y... then chances are that you should be throwing that all out and starting over again.
Remember, that if you iterate through a bunch of models, then one of them is going to come out with the lowest residue. That doesn't mean that the one with the lowest residue one has any relationship at all to the true physics of the situation; there are a lot of mathematical accidents.
Walter Roberson
Walter Roberson on 17 Jan 2020
I just spent a while fitting using the model
a1*X1^a2 + a3*X2^a4 + a5*X3^a6 + a7*X4^a8 + a8*X5^a9
At the moment the best residue I have found for that is near
+5.89527e-12 * X1^(14.0668) - 2.43172e+06 * X2^(-6.07609) - 141.217 * X3^(4.97211) + 1.36918e+07 * X4^(-16.9352) + 0.00796812 * X5^(0.000446426)
but that is a barely different residue from (for example)
+5.59686e-11 * X1^(15.8437) - 333705 * X2^(-5.46948) - 64.2897 * X3^(5.41647) + 1.36918e+07 * X4^(-11.4162) + 0.00838126 * X5^(0.115229)
Mathematical accidents. Cobbling together a formula that generates a marginally better residue is useless when you have no reason to believe that the formula has relevance to the real world.

Sign in to comment.

More Answers (0)

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!