DDPG agent low performance
Hello everyone,
I am trying to train a DDPG agent for my system; the goal is to generate actions (mf) so that the system follows a desired torque. The attached figure shows the episodic reward versus the number of episodes, along with plots of the system's output, error, reward, and action. I set the output range from 5 to 30, but the agent is still oscillating around these values. Although the episodic reward seems to converge during training, I am still experiencing an oscillatory response.
I would appreciate it if someone could help me with this matter.
Here is my reward block in Simulink (screenshot attached).
It is worth mentioning that I am using standard parameters for the noise model:
agentOpts = rlDDPGAgentOptions(...
    'SampleTime',Ts,...
    'TargetSmoothFactor',1e-3,...
    'DiscountFactor',0.99, ...
    'MiniBatchSize',64, ...
    'ExperienceBufferLength',1e6);
% Ornstein-Uhlenbeck exploration noise settings
agentOpts.NoiseOptions.Variance = 0.05*(25/sqrt(Ts));
agentOpts.NoiseOptions.VarianceDecayRate = 1e-5;
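For context, here is a quick back-of-the-envelope check I did on how slowly that noise decays (assuming the variance is multiplied by (1 - VarianceDecayRate) at every agent step, which is my understanding of the toolbox behavior):
Ts    = 0.1;                        % placeholder sample time; substitute your model's Ts
v0    = 0.05*(25/sqrt(Ts));         % initial noise variance, as set above
decay = 1e-5;                       % VarianceDecayRate
halfLife = log(0.5)/log(1 - decay); % steps until the variance halves (~6.9e4)
fprintf('Initial variance %.3f, halves after about %.0f steps\n', v0, halfLife);
With a decay rate of 1e-5, the exploration noise stays near its initial level for tens of thousands of steps, which may account for some of the oscillation I see during training.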
Here are my actor and critic structures:
L = 500; % number of neurons per hidden layer
% Critic: observation (state) path
statePath = [
    featureInputLayer(numObservations, 'Normalization', 'none', 'Name', 'observation')
    fullyConnectedLayer(L, 'Name', 'fc1')
    reluLayer('Name', 'relu1')
    fullyConnectedLayer(L, 'Name', 'fc2')
    additionLayer(2,'Name','add')
    reluLayer('Name','relu2')
    fullyConnectedLayer(L, 'Name', 'fc3')
    reluLayer('Name','relu3')
    fullyConnectedLayer(1, 'Name', 'fc4')];
% Critic: action path (feeds the second input of the addition layer)
actionPath = [
    featureInputLayer(numActions, 'Normalization', 'none', 'Name', 'action')
    fullyConnectedLayer(L, 'Name', 'fc5')];
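The two paths are then joined at the addition layer and wrapped into a Q-value critic. I have not reproduced my exact critic options here, so treat the learn rate below as a placeholder; the layerGraph/connectLayers pattern itself is the standard one from the toolbox examples:
criticNetwork = layerGraph(statePath);
criticNetwork = addLayers(criticNetwork, actionPath);
criticNetwork = connectLayers(criticNetwork,'fc5','add/in2'); % action path -> second input of 'add'
criticOptions = rlRepresentationOptions('LearnRate',1e-3,'GradientThreshold',1,'L2RegularizationFactor',1e-4); % placeholder values
critic = rlQValueRepresentation(criticNetwork,obsInfo,actInfo,...
    'Observation',{'observation'},'Action',{'action'},criticOptions);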
% Actor network
actorNetwork = [
    featureInputLayer(numObservations, 'Normalization', 'none', 'Name', 'observation')
    fullyConnectedLayer(L, 'Name', 'fc1')
    reluLayer('Name', 'relu1')
    fullyConnectedLayer(L, 'Name', 'fc2')
    reluLayer('Name', 'relu2')
    fullyConnectedLayer(L, 'Name', 'fc3')
    reluLayer('Name', 'relu3')
    fullyConnectedLayer(numActions, 'Name', 'fc4')
    tanhLayer('Name','tanh1')
    scalingLayer('Name','ActorScaling1','Scale',max(actInfo.UpperLimit))];
actorOptions = rlRepresentationOptions('LearnRate',1e-4,'GradientThreshold',1,'L2RegularizationFactor',1e-4);
actor = rlDeterministicActorRepresentation(actorNetwork,obsInfo,actInfo,...
    'Observation',{'observation'},'Action',{'ActorScaling1'},actorOptions);
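For completeness, the actor and critic (assembled as sketched above) are combined into the agent with the standard constructor:
agent = rlDDPGAgent(actor,critic,agentOpts);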