Can observation and reward be the same signal in an RL system?
I am trying to train an RL agent. I created a Simulink model with only one action and one observation, and that observation is also the reward. When I tried to train it, I got an error about the model "containing algebraic loop". So I wonder whether the way I defined the observation and reward caused this problem.
I define the reward and the observation as the same signal because they play the same role in this system. I want the agent to get only this signal from the environment, so I define a single signal that serves as both observation and reward to avoid redundancy.
Accepted Answer
Poorna
on 31 Mar 2024
Hi Jize,
I see that you want to use the same signal as both the observation and the reward in your reinforcement learning setup. Note that the observation and the reward do not occur at the same time.
In a reinforcement learning setting, you first make an observation, i.e., the current state of the system, and then pick an action and execute it. Your system then moves to a new state. The reward you get at the end of this transition is a function of the initial state, the action, and the resulting next state. So when you say you want to use the same signal as reward and observation, it means that the reward you get at time step 't' will be the observation at time step 't+1'.
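The timing described above can be sketched in a few lines of Python. This is only an illustration, not your Simulink model: `env_step`, `policy`, and `rollout` are hypothetical names, and the toy dynamics and reward are assumptions chosen so the loop is easy to follow. The point is the last line of the loop, where the reward produced at step t becomes the observation used at step t+1.

```python
def env_step(state, action):
    """Hypothetical toy environment: returns (next_state, reward)."""
    next_state = state + action
    # The reward is only known after the transition has happened.
    reward = -abs(next_state)
    return next_state, reward


def policy(observation):
    """Hypothetical rule: keep moving while the last reward was negative."""
    return 1.0 if observation < 0.0 else 0.0


def rollout(initial_state, initial_observation, num_steps):
    """Run the loop and record (t, observation, action, reward) per step."""
    state = initial_state
    observation = initial_observation  # nothing has been rewarded yet
    trace = []
    for t in range(num_steps):
        action = policy(observation)             # uses the observation at step t
        state, reward = env_step(state, action)  # reward exists only after acting
        trace.append((t, observation, action, reward))
        observation = reward                     # reward of step t is seen at step t+1
    return trace


trace = rollout(initial_state=-3.0, initial_observation=-3.0, num_steps=4)
for row in trace:
    print(row)
```

In each printed row, the observation equals the reward from the row before it, never the reward in the same row.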
The algebraic loop error you're encountering arises from using the reward at time step 't' directly as the observation at the same time step 't'. The agent needs the observation to choose its action, but the reward depends on that action, so the system is being asked to observe a signal that has not yet been generated. In Simulink, a signal that depends on itself within the same time step in this way is reported as an algebraic loop.
So, you should try adding a "Unit Delay" block where you pass the reward as the observation to the agent. By doing this, you are essentially sending the reward of the previous transition as the observation for the current transition.
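To show why a one-step delay breaks the loop, here is a minimal Python sketch of a delay element. It is not the Simulink block itself: the `UnitDelay` class and its method names are hypothetical, and the reward values are made up. It only mirrors the idea that the output at the current step is the input stored at the previous step, starting from an initial condition.

```python
class UnitDelay:
    """Hypothetical one-step delay: output now is the input from the last step."""

    def __init__(self, initial_condition=0.0):
        # Value emitted at the very first step, before any input is stored.
        self._stored = initial_condition

    def output(self):
        # Depends only on past input, so it can be read before
        # the current step's reward has been computed.
        return self._stored

    def update(self, value):
        # Store the current input for use at the next step.
        self._stored = value


delay = UnitDelay(initial_condition=0.0)
rewards = [-2.0, -1.0, 0.5]  # made-up rewards for three consecutive steps
observations = []
for reward in rewards:
    observations.append(delay.output())  # read first: no same-step dependence
    delay.update(reward)                 # then store this step's reward

print(observations)
```

Because `output` never needs the current reward, the observation is always available when the agent has to act, which is what removes the same-step dependency. The initial condition is whatever the agent should see before the first reward exists.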
To learn more about the "Unit Delay" block, refer to the Simulink documentation for that block.
Hope this helps!