Training problem in DDPG agent

Sam Chen on 2 Mar 2020
Answered: Benjamin Feaster on 2 Mar 2020
I have a problem training a DDPG agent: the Episode Q0 became NaN at episode 658. I saved every agent during training and checked the parameters with 'getCritic' and 'getActor'. It seems that all the weights in the neural networks became NaN between agent657 and agent658. I can't figure out what happened during training.

Answers (1)

Benjamin Feaster on 2 Mar 2020
I had this problem once as well when training an AC agent. This can happen when an equation (probably in your step function) tries to calculate one of the following indeterminate forms:
zero/zero, zero*infinity, infinity/infinity, infinity-infinity.
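To make the list above concrete, here is a minimal sketch of those four operations in MATLAB. The variable names vals and x are only for illustration and are not from the original question.

```matlab
% Each of the four indeterminate forms evaluates to NaN in IEEE 754 arithmetic.
vals = [0/0, 0*Inf, Inf/Inf, Inf - Inf];
disp(isnan(vals))   % every element is NaN

% NaN propagates through later arithmetic, so a single bad value spreads.
x = [1 2 NaN];
disp(sum(x))        % the sum is NaN as well
```

Because NaN propagates like this, a single NaN observation or reward reaching a gradient update is enough to explain every weight turning NaN between two consecutive saved agents.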
Try troubleshooting with something like this at the end of your step function:
if any(isnan(NextObs), 'all')          % true if any element of NextObs is NaN
    [row, col] = find(isnan(NextObs))  % no semicolon, so the positions are displayed
end
Note that this also works when NextObs is a vector. It outputs the row and column positions of every NaN value to the command line. You can then determine which NextObs element each position corresponds to and find where in your code that value is calculated.
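A stricter variant of the same check, as a rough sketch: the helper below (checkFinite is a hypothetical name, not a toolbox function) also catches Inf, which often shows up a step or two before the NaN, and stops execution with an error instead of only printing.

```matlab
function checkFinite(x, name)
% checkFinite  Hypothetical helper: raise an error if x contains NaN or Inf.
%   x    - numeric array, e.g. NextObs or Reward from a step function
%   name - label used in the error message
    bad = ~isfinite(x);            % true for NaN, Inf, and -Inf
    if any(bad(:))
        idx = find(bad);           % linear indices of every offending entry
        error('checkFinite:nonFinite', ...
              '%s has non-finite values at linear indices: %s', ...
              name, mat2str(idx(:).'));
    end
end
```

Assuming your step function returns variables named NextObs and Reward, you could call checkFinite(NextObs, 'NextObs') and checkFinite(Reward, 'Reward') just before it returns, so that training halts at the first bad step rather than hundreds of episodes later.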
Without looking at the code I can only give limited advice. Also make sure you have a "fallback" value when calculating your NextObs if your implementation requires it:
if something == 1
    NextObs = 2;   % your regular calculations you have implemented already
else
    NextObs = -1;  % return a number instead, to represent that the NextObs value doesn't apply for this step
end
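As one possible instance of that fallback pattern, here is a sketch of a guarded division. The names err, ref, tol and ratio are made up for illustration, and the tolerance and fallback value are assumptions you would need to choose for your own environment.

```matlab
% Hypothetical step-function fragment: err/ref would be 0/0 when both are zero.
err = 0;
ref = 0;
tol = 1e-9;                 % assumed tolerance; tune for your problem

if abs(ref) > tol
    ratio = err / ref;      % regular calculation
else
    ratio = 0;              % fallback value instead of NaN from 0/0
end
NextObs = ratio;
```

Whatever fallback value you pick, it should stay inside the range your observation specification expects, so the agent never sees a value far outside its normal inputs.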
I referenced two posts:
Roger Stafford's answer on this post, and Steven Lord's answer on this post. Hope this helps!

Release

R2019b
