Hi, I trained a reinforcement learning agent in Simulink for a humanoid robot. Judging by the robot configuration output during training, the result of 'Agent 7' was good, so I saved that agent and ran a simulation with the code below.
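% load returns a struct; the saved agent is in its saved_agent field
agent = load('Agent7.mat');
% Limit the simulation to 50 steps
simOptions = rlSimulationOptions('MaxSteps', 50);
% Simulate the saved agent in the environment env
experience = sim(env,agent.saved_agent,simOptions);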
However, the simulation result differed from the configuration output of Agent 7 during training, so I observed the action signal with a Scope block. The action graphs during training and during simulation were different.
The first picture shows the configuration output during training (robot viewed from the left side) and the action.
The second picture shows the configuration output during simulation and the action.
During training, the action took discrete values, and the result was fairly satisfactory. However, when I saved the agent and then simulated it, the action values were constant.
Could you please tell me how to solve this problem?
(I am using MATLAB R2020b.)