Error using GPU in MATLAB R2020a for Reinforcement Learning

I keep running into this error when using 'UseDevice',"gpu" in rlRepresentationOptions. The issue appears after the simulation has been running for a random period of time. I have tried this with multiple built-in examples and with both DDPG and TD3 agents. Could someone tell me whether I am doing something wrong, or whether this is a bug? (A sketch of how the options are set follows the error trace below.)
Error using rl.env.AbstractEnv/simWithPolicy (line 70)
An error occurred while simulating "IntegratedFlyingRobot" with the agent "agent".
Error in rl.task.SeriesTrainTask/runImpl (line 33)
[varargout{1},varargout{2}] = simWithPolicy(this.Env,this.Agent,simOpts);
Error in rl.task.Task/run (line 21)
[varargout{1:nargout}] = runImpl(this);
Error in rl.task.TaskSpec/internal_run (line 159)
[varargout{1:nargout}] = run(task);
Error in rl.task.TaskSpec/runDirect (line 163)
[this.Outputs{1:getNumOutputs(this)}] = internal_run(this);
Error in rl.task.TaskSpec/runScalarTask (line 187)
runDirect(this);
Error in rl.task.TaskSpec/run (line 69)
runScalarTask(task);
Error in rl.train.SeriesTrainer/run (line 24)
run(seriestaskspec);
Error in rl.train.TrainingManager/train (line 291)
run(trainer);
Error in rl.train.TrainingManager/run (line 160)
train(this);
Error in rl.agent.AbstractAgent/train (line 54)
TrainingStatistics = run(trainMgr);
Caused by:
Error using rl.env.SimulinkEnvWithAgent>localHandleSimoutErrors (line 689)
Invalid input argument type or size such as observation, reward, isdone or loggedSignals.
Error using rl.env.SimulinkEnvWithAgent>localHandleSimoutErrors (line 689)
Unable to compute gradient from representation.
Error using rl.env.SimulinkEnvWithAgent>localHandleSimoutErrors (line 689)
Unable to evaluate the loss function. Check the loss function and ensure it runs successfully.
Error using rl.env.SimulinkEnvWithAgent>localHandleSimoutErrors (line 689)
Input data dimensions must match the dimensions specified in the corresponding observation and action info specifications.
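The representations are configured along these lines (a minimal sketch, not my exact script; the learn rates are illustrative):

% Requesting the GPU in the representation options is what triggers the error.
criticOpts = rlRepresentationOptions('LearnRate',1e-3,'UseDevice',"gpu");
actorOpts  = rlRepresentationOptions('LearnRate',1e-4,'UseDevice',"gpu");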

  4 Comments

Daniel Egan on 7 Apr 2020
I am also having the same problem: TD3 and DDPG agents, R2020a, training on a GPU (GTX 1080 Ti). The problem occurs even when I feed the RL Agent block values from Constant blocks rather than from my dynamic model. See the picture below for the setup.
The model will successfully run through roughly five episodes before this same error pops up.
Anh Tran on 8 Apr 2020
We have identified this as a bug in DDPG and TD3 GPU training in R2020a. We are working on a patch to resolve this issue. As a workaround, please use the CPU for DDPG and TD3.
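That is, leave the device option at its CPU default for now (a sketch; the variable name is illustrative):

% Temporary workaround: keep both representations on the CPU
% ("cpu" is the default value of 'UseDevice').
repOpts = rlRepresentationOptions('UseDevice',"cpu");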
Stav Bar-Sheshet on 21 May 2020
I found a workaround to use until this bug is fixed: you can still train the actor on the GPU while keeping the critic on the CPU. With this configuration you can also still use the parallel pool to gather multiple experiences faster.
This reduced my training time compared with training both the actor and the critic on the CPU.
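A minimal sketch of this split configuration for a DDPG-style agent (the specs, layer sizes, and signal names below are illustrative placeholders, not from my model):

% All sizes and names are illustrative. The point is the two different
% 'UseDevice' settings: GPU for the actor, CPU for the critic.
obsInfo = rlNumericSpec([3 1]);
actInfo = rlNumericSpec([1 1]);

% Simple actor network mapping observation -> action.
actorNet = [
    imageInputLayer([3 1 1],'Normalization','none','Name','observation')
    fullyConnectedLayer(64,'Name','fc')
    reluLayer('Name','relu')
    fullyConnectedLayer(1,'Name','action')];

% Q-value critic with separate observation and action input paths.
obsPath = [
    imageInputLayer([3 1 1],'Normalization','none','Name','observation')
    fullyConnectedLayer(64,'Name','fcObs')];
actPath = [
    imageInputLayer([1 1 1],'Normalization','none','Name','action')
    fullyConnectedLayer(64,'Name','fcAct')];
common = [
    additionLayer(2,'Name','add')
    reluLayer('Name','reluQ')
    fullyConnectedLayer(1,'Name','QValue')];
criticNet = layerGraph(obsPath);
criticNet = addLayers(criticNet,actPath);
criticNet = addLayers(criticNet,common);
criticNet = connectLayers(criticNet,'fcObs','add/in1');
criticNet = connectLayers(criticNet,'fcAct','add/in2');

% The key part: actor on the GPU, critic on the CPU.
actorOpts  = rlRepresentationOptions('LearnRate',1e-4,'UseDevice',"gpu");
criticOpts = rlRepresentationOptions('LearnRate',1e-3,'UseDevice',"cpu");

actor  = rlDeterministicActorRepresentation(actorNet,obsInfo,actInfo, ...
    'Observation',{'observation'},'Action',{'action'},actorOpts);
critic = rlQValueRepresentation(criticNet,obsInfo,actInfo, ...
    'Observation',{'observation'},'Action',{'action'},criticOpts);

agent = rlDDPGAgent(actor,critic,rlDDPGAgentOptions('SampleTime',0.1));

Training then proceeds as usual, and experience gathering on parallel workers can still be enabled through rlTrainingOptions('UseParallel',true).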

