Reinforcement learning actions using DDPG
Jason Smith
on 2 Nov 2020
Commented: Jason Smith
on 12 Nov 2020
Greetings. I'm Jason and I'm working on controlling a biped robot using reinforcement learning. I need help deciding between the two approaches below for generating exploration actions with DDPG:
1. Generate random actions with a noise variance of 10% of my action range, based on the description of the DDPG noise model.
2. Use a low variance such as 0.5, as used in the MSRA biped and humanoid training with RL.
I would really appreciate your help with this. Also, in the latter case, where the actions are the output of a tanh layer with low variance ([-1.5 1.5]), how is that output converted into the desired actions?
Please note that I'm fairly sure the action range I calculated is suitable for solving the problem, and that I have tried higher variances, but they make the learning process less stable. Any suggestions on how I should generate the random actions?
Thanks in advance for your time and consideration
Accepted Answer
Emmanouil Tzorakoleftherakis
on 11 Nov 2020
Hi Jason,
The documentation link you provided mentions that "Variance*sqrt(Ts)" should be between 1% and 10% of your action range. The biped example you are linking to has Ts = 0.025 and Variance = 0.1, which is about 1% of the action range.
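The guideline above is easy to check numerically. This is a minimal sketch (in Python rather than MATLAB) using the values from the biped example; the action range of 1 is an assumed value for illustration only.

```python
import math

# Guideline from the documentation: Variance*sqrt(Ts) should fall
# between 1% and 10% of the action range.
Ts = 0.025          # agent sample time from the biped example (s)
variance = 0.1      # noise Variance setting from the biped example
action_range = 1.0  # assumed action range, for illustration

scaled = variance * math.sqrt(Ts)
fraction = scaled / action_range
print(f"Variance*sqrt(Ts) = {scaled:.4f} ({fraction:.1%} of action range)")
```

With these values, Variance*sqrt(Ts) comes out to roughly 0.016, i.e. on the order of 1-2% of an action range of 1, which sits at the low end of the recommended 1%-10% band.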
To your second question, please have a look at step 1 here. Effectively, during training only, random noise sampled using the noise options you provide is added to the normal output of your agent. So if your last layer is a tanh layer, you will first get a value in [-1, 1], and noise will be added on top of that.
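To make that mechanism concrete, here is a hedged sketch of Ornstein-Uhlenbeck exploration noise being added to a tanh actor output. This is illustrative Python, not the Reinforcement Learning Toolbox API; the function name `ou_step` and the parameter values are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def ou_step(x, mean=0.0, mean_attraction=0.15, variance=0.1, ts=0.025):
    """One Ornstein-Uhlenbeck update (illustrative): the noise state
    drifts toward `mean` and receives a Gaussian kick whose per-step
    scale is variance*sqrt(ts), matching the guideline above."""
    return (x
            + mean_attraction * (mean - x) * ts
            + variance * np.sqrt(ts) * rng.standard_normal())

noise = 0.0
actor_output = np.tanh(0.3)   # tanh layer output, always in [-1, 1]
for _ in range(10):
    noise = ou_step(noise)
    # During training only: exploration action = actor output + noise.
    action = actor_output + noise
    # The environment (or a scaling layer) would then map this to the
    # actual actuator limits, e.g. by clipping or rescaling.
```

The key point mirrored here is that the noise is additive on top of the tanh output, so exploration actions can temporarily leave [-1, 1] before any clipping or scaling is applied.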
Hope that helps.