MATLAB Answers

How to TRAIN further a previously trained agent?

Hi,
My agent was programmed to stop after reaching an average reward of X. How do I load and extend the training further?
I did enable saving of the experiences and it has created the agent file
Rajesh


Accepted Answer

Rajesh Siraskar on 11 Dec 2019
Hi Sourav, I figured it out after reading the documentation more carefully!
I also need to set the ResetExperienceBufferBeforeTraining flag to false if I want to reuse previously saved experiences.
This is my working code snippet. I must say this is a great feature and I really missed knowing about it!
USE_PRE_TRAINED_MODEL = true; % Set to true to continue training a pre-trained agent
% Set agent option parameter: keep the saved experience buffer when resuming
agentOpts.ResetExperienceBufferBeforeTraining = not(USE_PRE_TRAINED_MODEL);
if USE_PRE_TRAINED_MODEL
    % Load the pre-trained agent (with its saved experiences)
    % fprintf is used so the message is actually printed
    fprintf('- Continue training pre-trained model: %s\n', PRE_TRAINED_MODEL_FILE);
    load(PRE_TRAINED_MODEL_FILE,'saved_agent');
    agent = saved_agent;
else
    % Create a fresh new agent
    agent = rlDDPGAgent(actor, critic, agentOpts);
end
% Train the agent
trainingStats = train(agent, env, trainOpts);

  3 Comments

Adrian Kaessens on 12 Dec 2019
In the case of DDPG, does this restart the noise from its initial level, or does it continue from where it left off in the training process?
Rajesh Siraskar on 8 Jan 2020
Good question, Adrian. I have noticed that the noise parameters depend on the training code and the parameters that you use when you restart training.
For example, let's say you had a variance of 0.3 and a decay rate of 1e-5. After training, the added noise would obviously have decayed. Now let's say you saved this agent and reuse it.
When you retrain, if your settings are the same (0.3 and 1e-5), then I believe the training resets the noise parameters, so it starts afresh with these noise model parameters and decays all over again.
Anh Tran on 21 Feb 2020
Rajesh is correct. Currently the noise model resets when you train again. We are looking into how you can truly 'resume' training. As a workaround, you can set the noise variance option to a lower value than that of your previous training session.
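To pick that lower value, one option is to estimate where the variance ended up in the previous session. The sketch below assumes the per-step decay rule Variance = Variance*(1 - VarianceDecayRate) described for the DDPG noise options; 0.3 and 1e-5 are the example values from this thread, while stepsTaken and minVariance are hypothetical placeholders. It also assumes the loaded agent exposes its options through agent.AgentOptions with the Variance property name used in the releases discussed here, so please check this against your version.

```matlab
% Estimate the noise variance reached at the end of the previous session.
initialVariance = 0.3;    % example value from this thread
decayRate       = 1e-5;   % example value from this thread
stepsTaken      = 50000;  % HYPOTHETICAL: total agent steps in the previous session
minVariance     = 1e-3;   % HYPOTHETICAL floor so the agent keeps exploring

% Assumed decay rule: variance is multiplied by (1 - decayRate) every step
resumedVariance = initialVariance * (1 - decayRate)^stepsTaken;
resumedVariance = max(resumedVariance, minVariance);
fprintf('Suggested starting variance: %.4f\n', resumedVariance);

% Apply it only if a loaded agent is present in the workspace
% (assumes the agent exposes AgentOptions.NoiseOptions.Variance)
if exist('agent', 'var')
    agent.AgentOptions.NoiseOptions.Variance = resumedVariance;
end
```

The floor keeps the variance above zero, in line with the advice that the agent should always explore a little.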


More Answers (2)

Sourav Bairagya on 10 Dec 2019
In this case, you can resume your training with the previous experience buffer as a starting point.
You have to set the 'SaveExperienceBufferWithAgent' agent option to 'true'.
For some agents, such as those with large experience buffers and image-based observations, the memory required for saving their experience buffer is large. In these cases, you must ensure that there is enough memory available for the saved agents.
For more information, see this link:

  2 Comments

Rajesh Siraskar on 10 Dec 2019
Hi Sourav,
I did go through the documentation and I did enable SaveExperienceBufferWithAgent.
But how do I next load this agent object and start the training from there on? I can use this line of code below to load the object, but how do I proceed next to begin adding to this experience?
load(RL_MODEL_FILE,'saved_agent');
Code that enabled SaveExperienceBufferWithAgent:
agentOpts = rlDDPGAgentOptions(...
    'SampleTime', Ts, ...
    'TargetSmoothFactor', 1e-3, ...
    'DiscountFactor', GAMMA, ...
    'MiniBatchSize', BATCH_SIZE, ...
    'ExperienceBufferLength', 1e6, ...
    'SaveExperienceBufferWithAgent', true);
mr robot on 30 Jan 2020
How large is "large" for an experience buffer of 1e6?
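A rough back-of-the-envelope estimate can help here. The sketch below assumes each stored experience holds an observation, an action, a reward, the next observation, and an is-done flag, all kept as double-precision values; the toolbox's actual internal storage may differ, and numObs and numAct are hypothetical sizes to replace with your own.

```matlab
% Rough lower-bound estimate of experience buffer memory.
bufferLength  = 1e6;  % ExperienceBufferLength used in this thread
numObs        = 4;    % HYPOTHETICAL number of observation elements
numAct        = 1;    % HYPOTHETICAL number of action elements
bytesPerValue = 8;    % ASSUMPTION: values stored as double precision

% observation + next observation + action + reward + is-done flag
valuesPerExperience = 2*numObs + numAct + 2;
bufferBytes = bufferLength * valuesPerExperience * bytesPerValue;
fprintf('Approximate buffer size: %.2f GB\n', bufferBytes / 1e9);
```

With these small vector sizes the buffer is under 0.1 GB. Under the same assumptions, an 84-by-84 grayscale image observation (7056 elements) would need on the order of 113 GB for 1e6 experiences, which is the kind of case where saving the buffer with the agent becomes a concern.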



Anh Tran on 21 Feb 2020
I will answer again to hopefully clear up the confusion.
% Train the agent
trainingStats = train(agent, env, trainOpts);
After this line, even though 'agent' is not returned as an output, its learnable parameters are updated. The learnable parameters, e.g. the weights and biases of the actor/critic neural networks, determine the logic behind the agent (and how it chooses an action given an observation).
Now if you execute sim() or train() after this line, the 'agent' will simulate or continue training with the latest parameters.
Rajesh's workflow is very close to truly resuming training (reusing the experiences gathered in the past and starting from the latest parameters). I revised the code with additional comments. Currently the noise model resets when you train again. Consider setting the noise variance option to a lower value than in your previous training session (it still needs to be > 0 because we want the agent to always explore).
% Set to true to resume training from a saved agent
resumeTraining = true;
% Set ResetExperienceBufferBeforeTraining to false to keep experience from the previous session
agentOpts.ResetExperienceBufferBeforeTraining = ~(resumeTraining);
if resumeTraining
    % Load the agent from the previous session
    % fprintf is used so the message is actually printed
    fprintf('- Resume training of: %s\n', PRE_TRAINED_MODEL_FILE);
    load(PRE_TRAINED_MODEL_FILE,'saved_agent');
    agent = saved_agent;
else
    % Create a fresh new agent
    agent = rlDDPGAgent(actor, critic, agentOpts);
end
% Train the agent
trainingStats = train(agent, env, trainOpts);
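Two details may be worth checking when resuming, sketched below under stated assumptions. First, if the previous session stopped because the average reward reached X, the same stopping value would likely end the new session almost immediately, so the threshold needs to be raised. Second, options set on agentOpts are only used when a new agent is constructed, so it may be necessary to set the flag on the loaded agent itself. The sketch assumes 'agent', 'env', and 'trainOpts' exist as in the code above, that the agent exposes its options through agent.AgentOptions, and that the numeric values are hypothetical placeholders.

```matlab
% Sketch only: requires Reinforcement Learning Toolbox and the variables
% 'agent', 'env', and 'trainOpts' from the code above.

% Raise the stopping threshold above the value reached last time
trainOpts.StopTrainingCriteria = 'AverageReward';
trainOpts.StopTrainingValue    = 500;   % HYPOTHETICAL: must exceed the previous X
trainOpts.MaxEpisodes          = 2000;  % HYPOTHETICAL episode budget for this session

% ASSUMPTION: the loaded agent carries its own options, so set the flag there
agent.AgentOptions.ResetExperienceBufferBeforeTraining = false;

% Continue training from the latest parameters and saved experiences
trainingStats = train(agent, env, trainOpts);
```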

  1 Comment

Stav Bar-Sheshet on 4 Jun 2020 at 8:01
Hi, this is an excellent thread!
What I'm curious about is whether, when you continue training, the state of the optimizer is saved and continues from the same point?
