I am using the built-in genetic algorithm (ga) from the Global Optimization Toolbox, and run time is a serious constraint.
The problem in question is an integer (binary) problem with linear constraints.
Each solution is a vector of roughly 400x1 ones and zeros, and the population size is 500. All other relevant genetic algorithm options are left at their defaults.
The main run-time bottleneck is the fitness function. It does not receive the population in vectorized form, because that conflicts with using a parallel pool, so it is called once per individual. Each call performs numerous operations on a 3D array of roughly 300000x4x10. These operations are vectorized as much as possible to take advantage of MATLAB's speed.
I start a parallel processing pool before calling the algorithm as follows:
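c = parcluster('local');          % default local cluster profile
numberOfCores = 6;
c.NumWorkers = numberOfCores;     % allow this many workers on the profile
parpool(c, numberOfCores);        % start the pool before calling ga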
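For context, here is a minimal sketch of how a setup like the one described might be passed to ga. It is not my actual code: the toy fitness function, the constraint matrices A and b, and the small MaxGenerations value are placeholders chosen only so the example runs quickly. It assumes a release recent enough to support optimoptions for ga.

```matlab
% Sketch only: toyFitness, A, b and MaxGenerations are hypothetical placeholders.
nVars = 400;                              % length of the binary solution vector

% Reuse an open pool if there is one; otherwise start the default pool.
pool = gcp('nocreate');
if isempty(pool)
    pool = parpool;
end

opts = optimoptions('ga', ...
    'PopulationSize', 500, ...
    'UseParallel',    true, ...           % evaluate individuals on pool workers
    'UseVectorized',  false, ...          % fitness receives one individual per call
    'MaxGenerations', 3, ...              % kept tiny so the sketch finishes quickly
    'Display',        'off');

toyFitness = @(x) -sum(x);                % placeholder: maximize the number of ones
A = ones(1, nVars);                       % placeholder linear constraint A*x' <= b
b = 10;
lb = zeros(1, nVars);                     % bounds 0..1 plus integer constraints
ub = ones(1, nVars);                      % make every variable binary
intcon = 1:nVars;

x = ga(toyFitness, nVars, A, b, [], [], lb, ub, [], intcon, opts);
```

With UseParallel set to true and UseVectorized set to false, each individual is evaluated in its own fitness call on a worker, which matches the non-vectorized arrangement described above.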
With 6 cores, each generation takes roughly five minutes to complete (when running as a compiled DLL called from .NET). I set up an AWS server with 20 cores hoping to reduce the time per generation, but the time per generation was roughly the same or slightly worse. (MATLAB confirmed that the 20-worker pool was created properly.)
Does anyone know why that would be the case? Is there a way to estimate whether there is a "sweet spot" for the number of cores in a parallel pool, given the overhead associated with parallelizing the operations?
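To make the "sweet spot" question concrete, this is the kind of measurement I have in mind: time a batch of fitness evaluations with different worker limits and compare against a serial run. It is only a sketch under stated assumptions. The data array is a smaller random stand-in for my real 3D array, stubFitness is a hypothetical placeholder for the real fitness function, and the batch size and worker counts are arbitrary.

```matlab
% Sketch only: data, stubFitness, popSize and workerCounts are hypothetical.
data = rand(30000, 4, 10);                        % smaller stand-in for the real array
stubFitness = @(x) sum(data(:)) * (1 + mean(x));  % placeholder work per individual

nVars = 400;
popSize = 60;                                     % small batch so the test is quick
population = double(rand(popSize, nVars) > 0.5);  % random binary individuals

pool = gcp;                                       % current pool (opens the default if none)
workerCounts = [0 1 2 4];                         % 0 runs the loop serially on the client
workerCounts = workerCounts(workerCounts <= pool.NumWorkers);

elapsed = zeros(size(workerCounts));
for k = 1:numel(workerCounts)
    M = workerCounts(k);
    scores = zeros(popSize, 1);
    tic;
    parfor (i = 1:popSize, M)                     % at most M workers for this loop
        scores(i) = stubFitness(population(i, :));
    end
    elapsed(k) = toc;                             % includes data transfer to workers
end

speedup = elapsed(1) ./ elapsed;                  % relative to the first (serial) run
disp(table(workerCounts(:), elapsed(:), speedup(:), ...
    'VariableNames', {'Workers', 'Seconds', 'Speedup'}));
```

Because stubFitness captures data, the array is sent to the workers along with the function, so the timings include that transfer cost as well as the computation. The first parallel run may also include one-time warm-up overhead, so repeating the measurement would give steadier numbers.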