- Inadequate reward function: If the reward is sparse (i.e., the agent receives feedback only infrequently) or the reward scale is inappropriate, the agent will fail to learn properly. Ensure that the reward is shaped properly.
- Exploration strategy: TD3 relies on noise-based exploration, so make sure the exploration noise is scaled appropriately: large enough for the agent to explore, but not so large that it causes erratic behavior.
- Learning parameters: Try experimenting with the learning rates for the actor and critic networks, as well as with different batch sizes and replay buffer capacities. You could also try adjusting the discount factor (gamma) and the target update frequency.
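As a rough illustration of the first two points, here is a minimal, framework-agnostic sketch. It is written in plain Python rather than MATLAB, and nothing in it is Reinforcement Learning Toolbox API. The function names (`shaped_reward`, `noise_std`, `noisy_action`) and all constants are hypothetical and untuned; they only show the idea of a dense, progress-based reward and of exploration noise that decays and is clipped to the normalized action range of -1 to 1.

```python
import math
import random


def shaped_reward(position, target, prev_distance, reached_radius=0.2):
    """Dense reward for reaching a 3D target (illustrative constants only).

    Rewards progress toward the target at every step instead of paying out
    only on arrival, which is one common way to avoid a sparse reward.
    Returns the reward and the current distance (to use as prev_distance next step).
    """
    distance = math.dist(position, target)
    progress = prev_distance - distance   # positive when the vehicle moved closer
    reward = 10.0 * progress - 0.01       # progress term plus a small per-step cost
    if distance < reached_radius:
        reward += 5.0                     # bonus while inside the target radius
    return reward, distance


def noise_std(step, sigma0=0.3, sigma_min=0.05, decay=1e-4):
    """Exploration noise standard deviation, decayed geometrically with a floor."""
    return max(sigma_min, sigma0 * (1.0 - decay) ** step)


def noisy_action(action, step, rng=random):
    """Add zero-mean Gaussian noise to a normalized action and clip to [-1, 1]."""
    sigma = noise_std(step)
    return [max(-1.0, min(1.0, a + rng.gauss(0.0, sigma))) for a in action]
```

In a training loop, `shaped_reward` would be called once per step with the distance returned by the previous call, and `noisy_action` would be applied to the actor output before it is scaled up for the multirotor. If the reward still collapses with the noise floor set very low, the problem is more likely in the reward or the learning parameters than in exploration.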
No convergence during training using a TD3 RL agent
I am trying to train an agent to navigate a multirotor to a particular 3D coordinate. I am using a TD3 agent with the same configuration as the Train Biped Robot to Walk Using Reinforcement Learning Agents example ( Link: Train Biped Robot to Walk Using Reinforcement Learning Agents - MATLAB & Simulink (mathworks.com) ). In my case I have 16 observations and 4 actions. I normalize the observations before passing them to the agent, and the actions output by the agent are also normalized between -1 and 1; I scale them up before passing them to the multirotor environment.
During training, the agent's reward keeps dropping to zero. I have attached the training results as an image. In the image you can see that the rewards range between zero and 1000 and keep dropping back to zero, so the agent cannot maintain a consistently high reward.
Also, the agent is trained using parallel computing.
Answers (1)
Ayush Anand
on 22 May 2024
The reward continuously dropping to zero suggests that the agent might be struggling with the complexity of the task, the design of the reward function, or issues related to the training setup. Here are a few potential reasons:
You can refer to the following links to explore different options with the "rlTD3" agent in MATLAB: