MATLAB Answers

Custom Action Space DDPG Reinforcement Learning Agent

After running into a challenge with my reinforcement learning agent, I hope you can help me with at least a little hint.
My DDPG agent has a continuous action space, which works totally fine. Unfortunately, it cannot be transferred to a real-life system this way. While searching for optimal action values in different situations, the agent should avoid certain combinations.
The action space is defined like:
actionInfo = rlNumericSpec([4 1], ...
    'LowerLimit', [0; 0; 0; 0], ...
    'UpperLimit', [maxA1; maxA2; maxA3; maxA4]);
But due to restrictions in the real-life system, it should instead be either 0 or a value in the interval [minA1; maxA1], i.e.
A1 ∈ {0} ∪ [minA1; maxA1]
so that actions in the open interval
A1 ∈ ]0; minA1[
are avoided.
Is there any possibility to define my action space this way?
Note:
I have already tried to steer the agent away from actions in this range by penalizing them via the reward, but it doesn't seem to work. Instead of improving steadily over the episodes, the agent now tends toward a sideways movement after reaching a certain (undesirable) level.
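For reference, the penalty approach described above might look roughly like the following sketch inside the environment's step/reward function (the names baseReward, action, minA1, and penaltyWeight are assumptions, not from the original post):

```matlab
% Sketch of reward shaping to discourage the forbidden range ]0, minA1[.
% penaltyWeight is a hypothetical tuning parameter.
penaltyWeight = 10;
reward = baseReward;
if action(1) > 0 && action(1) < minA1
    % Penalize actions that fall inside the forbidden open interval.
    reward = reward - penaltyWeight;
end
```

As noted above, this shaping alone did not lead to stable improvement in this case.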
Thanks in advance!

Accepted Answer

Emmanouil Tzorakoleftherakis
To my knowledge, you cannot implement a custom action space like this with rlNumericSpec. What you could do instead (since adding penalty terms to the reward does not help) is add some additional logic that manipulates the agent's actions, i.e. the output of the RL Agent block. Your policy would then be the combination of the neural network and the new logic. Just an idea.
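One way to sketch that extra logic, assuming hypothetical limits minA1 and maxA1 for the first action, is a post-processing function that snaps any value in the forbidden open interval ]0, minA1[ to the nearer allowed boundary before the action reaches the plant:

```matlab
function a = constrainAction(a, minA1, maxA1)
% Post-process the agent's output so A1 is either 0 or in [minA1, maxA1].
% minA1 and maxA1 are placeholders for the real system's limits.
    if a(1) > 0 && a(1) < minA1
        % Snap to the nearer allowed boundary: 0 or minA1.
        if a(1) < minA1 / 2
            a(1) = 0;
        else
            a(1) = minA1;
        end
    end
    % Safeguard: clip to the overall admissible range.
    a(1) = min(max(a(1), 0), maxA1);
end
```

In Simulink, the same mapping could live in a MATLAB Function block placed between the RL Agent block and the plant; the trained network plus this mapping together form the deployed policy.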
  3 Comments
Hans-Joachim Steinort on 6 Mar 2020
Thank you for your explanation!
This actually helped me wrap my head around the issue. I will definitely try out your suggestion with the additional logic and will come back to you afterwards.
EDIT:
It worked the way you suggested, thanks a lot!


More Answers (0)
