I am new to Simulink and reinforcement learning (RL). I am trying to simulate a very simple scenario to test DDPG before implementing my more complex system. The agent is placed randomly around (0,0), and the goal is to move to (500,500) or somewhere near it.
But it doesn't work for me. The action output (2x1) should be continuous in the range [-2, 2]. During the first few episodes, the output oscillates between the maximum and the minimum, and then it stays at the minimum for the rest of the episodes.
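To make the scenario concrete, here is a minimal Python sketch of the setup described above. It is not the attached Simulink model. The names `reset`, `step`, `spread`, and `tolerance` are hypothetical, and three things are assumptions made only for illustration: that the action is a per-step displacement, that the reward is the negative distance to the goal, and that "nearby" means within a fixed tolerance.

```python
import math
import random

GOAL = (500.0, 500.0)   # target position from the description
ACTION_LIMIT = 2.0      # each action component is limited to [-2, 2]


def clip(value, low, high):
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))


def reset(spread=10.0, rng=random):
    """Place the agent randomly around (0, 0).

    `spread` is an assumed half-width of the start region.
    """
    return (rng.uniform(-spread, spread), rng.uniform(-spread, spread))


def step(position, action, tolerance=5.0):
    """Apply a 2x1 action and return (new_position, reward, done).

    Assumptions: the action is a displacement per step, the reward is the
    negative Euclidean distance to the goal, and the episode ends once the
    agent is within `tolerance` of the goal.
    """
    dx = clip(action[0], -ACTION_LIMIT, ACTION_LIMIT)
    dy = clip(action[1], -ACTION_LIMIT, ACTION_LIMIT)
    new_position = (position[0] + dx, position[1] + dy)
    distance = math.hypot(GOAL[0] - new_position[0], GOAL[1] - new_position[1])
    reward = -distance
    done = distance <= tolerance
    return new_position, reward, done
```

Under these assumptions the agent needs at least a few hundred steps per episode to reach the goal, since each step moves it by at most 2 along each axis.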
I changed the deep network settings as well as the RL options, but the problem persists. I also changed the output range to (-inf, inf) with saturation, but the result is the same. Simulating for a few thousand episodes did not help either.
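For readers comparing the two ways of bounding the action mentioned above, the following Python sketch contrasts a hard saturation of an unbounded output with a tanh output scaled to the limit. It only illustrates the general idea and is an assumption about how such bounding is typically done, not a description of the attached model. The function names `saturate` and `tanh_scaled` are hypothetical.

```python
import math

ACTION_LIMIT = 2.0  # action components are meant to stay in [-2, 2]


def saturate(raw):
    """Hard saturation: clip an unbounded actor output to [-2, 2]."""
    return max(-ACTION_LIMIT, min(ACTION_LIMIT, raw))


def tanh_scaled(raw):
    """Smooth bounding: squash with tanh, then scale to [-2, 2]."""
    return ACTION_LIMIT * math.tanh(raw)


# Compare both mappings for a few raw (pre-bounding) actor outputs.
for raw in (-50.0, -3.0, 0.0, 3.0):
    print(raw, saturate(raw), round(tanh_scaled(raw), 4))
```

In both variants, a raw output with a large negative magnitude maps to the lower limit, so an action that stays pinned at the minimum is consistent with the actor's raw output having drifted far outside the useful range.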
My code is attached.