When DDPG optimizes the PID parameters, how do I keep the first 20 s of data from the system stabilization phase out of the experienceBuffer?
Adaptive PID control using Simulink's own DDPG agent. The first 20 s are a stabilization buffer for the system: they are not part of the transitions the agent should learn from, but they must still be simulated. How can I keep the agent from seeing the action, state, reward, and other information from the first 20 s, or otherwise keep this period from affecting training? In practice, if the agent learns from even the first 10 s, the training results are very poor.

Answers (1)
MULI
on 25 Mar 2025
To keep the first 20 seconds from affecting your reinforcement learning agent in Simulink, you can try the approaches below:
- Modify the reward: use a "Clock" block to track simulation time and force the reward to zero for the first 20 seconds, so the stabilization phase contributes no reward signal (see the first sketch after this list).
- Skip the initial actions and states: use a "Switch" block controlled by a "Relational Operator" block to hold the action and state signals for the first 20 seconds, so the agent only starts interacting with the plant after the buffer period (second sketch below).
- Custom reset: start each training episode from the state the system reaches at 20 seconds. This skips the buffer entirely and lets the agent concentrate on the interactions that matter (third sketch below).

These steps help your agent focus on the necessary parts of the simulation and improve training efficiency.
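The first step can be wired with the Clock block feeding a comparison, or written as a single MATLAB Function block. The sketch below is not from the original answer; the names gatedReward and rawReward are illustrative, and t is the Clock block output routed into the function.

function r = gatedReward(rawReward, t)
% Zero the reward while the Clock output t is inside the 20 s
% stabilization buffer, so that phase contributes no reward signal.
if t < 20
    r = 0;          % no reward during stabilization
else
    r = rawReward;  % unmodified reward afterwards
end
end

Note that zeroing the reward does not remove those samples from the agent's experience buffer; they are still stored, just with zero reward. Only the custom reset in the third step actually keeps them out.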
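The second step, the Switch block controlled by a Relational Operator, has the same kind of MATLAB Function block equivalent. This is likewise a sketch: safeAction stands for whatever fixed baseline input (for example, the nominal PID output) stabilizes the plant during the buffer, which is an assumption about your model.

function u = gatedAction(agentAction, safeAction, t)
% Equivalent of a Switch whose control input is a Relational Operator
% comparing the Clock output t against 20: the RL Agent's action is
% passed through only after the stabilization buffer has elapsed.
if t >= 20
    u = agentAction; % agent drives the plant after 20 s
else
    u = safeAction;  % baseline action during stabilization
end
end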
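For the custom reset, one possible implementation (an assumption, not spelled out above) is to simulate the stabilization phase once, save the model operating point at t = 20 s, and restore it from the environment's ResetFcn. The model name pidTuningModel, the agent block path, and the obsInfo/actInfo variables are placeholders for your own setup.

% One-off: run the 20 s stabilization phase and save the operating point.
mdl = 'pidTuningModel';                      % placeholder model name
set_param(mdl, 'SaveFinalState', 'on', ...
    'SaveOperatingPoint', 'on', 'FinalStateName', 'xFinal');
warmup  = sim(mdl, 'StopTime', '20');
opAt20s = warmup.xFinal;                     % Simulink.op.ModelOperatingPoint

% Training environment: every episode restarts from the saved state,
% so the first 20 s never enter the agent's experience buffer.
env = rlSimulinkEnv(mdl, [mdl '/RL Agent'], obsInfo, actInfo);
env.ResetFcn = @(in) setInitialState(in, opAt20s);

Because the operating point stores its snapshot time, each training episode starts at t = 20 s, so extend the episode StopTime accordingly.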