But only the mean gets worse. The best score improves with every generation, and surely that is all that matters?
Random combinations with nonlinear responses can easily throw up huge values that distort the mean without affecting the overall progress of the search.
For example, imagine the following with a population of 7 (lower scores are better):
Gen 1 scores: 100, 100, 150, 140, 111, 100, 119
best score: 100
mean score: 117.14
Gen 2 scores: 87, 66, 167, 1000000000000, 55, 98, 206
best score: 55
mean score: 1.4286 x 10^11
The really bad individual would have a very low chance of passing its genes on to the next generation, so it does little harm to the search.
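If you want to check the numbers above yourself, here is a minimal Python sketch. It assumes a minimization problem (lower score is better), as in the example. The `summarize` helper is a name I made up for illustration, and the median is my own addition, included only as one statistic that a single outlier does not drag around the way it drags the mean.

```python
from statistics import mean, median

# Scores copied from the two example generations (lower is better).
gen1 = [100, 100, 150, 140, 111, 100, 119]
gen2 = [87, 66, 167, 1_000_000_000_000, 55, 98, 206]


def summarize(scores):
    """Return (best, mean, median) for a minimization problem.

    Hypothetical helper for illustration only.
    """
    return min(scores), mean(scores), median(scores)


for name, scores in (("Gen 1", gen1), ("Gen 2", gen2)):
    best, avg, mid = summarize(scores)
    print(f"{name}: best={best}, mean={avg:.5g}, median={mid}")
```

The best score drops from 100 to 55 while the mean jumps by about nine orders of magnitude, all because of one individual. So tracking the best score (or something outlier-resistant like the median) alongside the mean gives a fairer picture of progress.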