When DDPG optimizes the PID parameters, how do I keep the first 20 s of data from the system stabilization phase out of the experienceBuffer?
Adaptive PID control using Simulink's own DDPG agent. The first 20 s are a stabilization buffer for the system: they are not part of the transitions the agent should learn from, but they must still be simulated. How can I keep the agent from seeing the action, state, reward, and other information from the first 20 s, or otherwise keep this period from affecting training? In practice, if the agent learns from even the first 10 s, the training results are very poor.

Answers (1)
MULI
on 25 Mar 2025
To keep the first 20 seconds from affecting your reinforcement learning agent in Simulink, you can try the approaches below:
- Modify the reward: use a "Clock" block to track simulation time and force the reward to zero for the first 20 seconds, so the stabilization phase contributes no reward signal (see the first sketch after this list).
- Skip the initial actions and states: use a "Switch" block controlled by a "Relational Operator" block to hold the action and state signals for the first 20 seconds, so the agent only starts interacting with the plant after the buffer period (second sketch below).
- Custom reset: start each training episode from the state the system reaches at 20 seconds. This skips the buffer entirely and lets the agent concentrate on the interactions that matter (third sketch below).

These steps help your agent focus on the necessary parts of the simulation and improve training efficiency.
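The first step can be wired with the Clock block feeding a comparison, or written as a single MATLAB Function block. The sketch below is not from the original answer; the names gatedReward and rawReward are illustrative, and t is the Clock block output routed into the function.

function r = gatedReward(rawReward, t)
% Zero the reward while the Clock output t is inside the 20 s
% stabilization buffer, so that phase contributes no reward signal.
if t < 20
    r = 0;          % no reward during stabilization
else
    r = rawReward;  % unmodified reward afterwards
end
end

Note that zeroing the reward does not remove those samples from the agent's experience buffer; they are still stored, just with zero reward. Only the custom reset in the third step actually keeps them out.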
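The second step, the Switch block controlled by a Relational Operator, has the same kind of MATLAB Function block equivalent. This is likewise a sketch: safeAction stands for whatever fixed baseline input (for example, the nominal PID output) stabilizes the plant during the buffer, which is an assumption about your model.

function u = gatedAction(agentAction, safeAction, t)
% Equivalent of a Switch whose control input is a Relational Operator
% comparing the Clock output t against 20: the RL Agent's action is
% passed through only after the stabilization buffer has elapsed.
if t >= 20
    u = agentAction; % agent drives the plant after 20 s
else
    u = safeAction;  % baseline action during stabilization
end
end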
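For the custom reset, one possible implementation (an assumption, not spelled out above) is to simulate the stabilization phase once, save the model operating point at t = 20 s, and restore it from the environment's ResetFcn. The model name pidTuningModel, the agent block path, and the obsInfo/actInfo variables are placeholders for your own setup.

% One-off: run the 20 s stabilization phase and save the operating point.
mdl = 'pidTuningModel';                      % placeholder model name
set_param(mdl, 'SaveFinalState', 'on', ...
    'SaveOperatingPoint', 'on', 'FinalStateName', 'xFinal');
warmup  = sim(mdl, 'StopTime', '20');
opAt20s = warmup.xFinal;                     % Simulink.op.ModelOperatingPoint

% Training environment: every episode restarts from the saved state,
% so the first 20 s never enter the agent's experience buffer.
env = rlSimulinkEnv(mdl, [mdl '/RL Agent'], obsInfo, actInfo);
env.ResetFcn = @(in) setInitialState(in, opAt20s);

Because the operating point stores its snapshot time, each training episode starts at t = 20 s, so extend the episode StopTime accordingly.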