Help me understand the architecture of a DQN for the cart-pole problem in RL

I am new to using MATLAB for solving reinforcement learning problems, and I am trying to follow the example at 'https://uk.mathworks.com/help/reinforcement-learning/ug/train-dqn-agent-to-balance-cart-pole-system.html'. Honestly, I used Python to solve the cart-pole problem, and I fully understand the structure of a deep Q-network. However, the MATLAB version confuses me entirely, since I don't understand how the neurons of the Q-network are arranged.
In deep learning as I know it, the input layer comes first, followed by the hidden layers, and finally the output layer. But for the DQN here in MATLAB, what I see is that the states are one input, followed by some hidden layers, and then another input, the action, joins the network after certain hidden layers. I don't understand this architecture, and I would appreciate it if someone could explain it clearly. Also, if possible, a simple drawing of the DQN architecture with the state and action inputs would be of great value.

Answers (1)

Emmanouil Tzorakoleftherakis
Hi Michael,
There are various architectures you can use when setting up the Q-network. In the example you mentioned, and in most examples in Reinforcement Learning Toolbox that have a Q-critic, the state and action paths are separated. The reason is that you can architect each path as necessary to extract useful features. For instance, in this example, the state input is an image and the action is a scalar torque. The image path needs to go through convolutional layers to extract features, but this is not necessary for a scalar input. That is why the paths are separated.
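To make the two-path layout concrete, here is a minimal sketch of how such a critic could be assembled. The layer sizes and names below are illustrative assumptions, not the ones from the linked example; the state is taken to be the 4-element cart-pole observation and the action a scalar:
% State path: the observation goes through its own stack of layers
statePath = [
    featureInputLayer(4,"Name","state")
    fullyConnectedLayer(24,"Name","fc_state1")
    reluLayer("Name","relu_state")
    fullyConnectedLayer(24,"Name","fc_state2")];
% Action path: the scalar action gets its own, shorter path
actionPath = [
    featureInputLayer(1,"Name","action")
    fullyConnectedLayer(24,"Name","fc_action")];
% Common path: merge the two paths, then map to a single Q-value
commonPath = [
    additionLayer(2,"Name","add")
    reluLayer("Name","relu_common")
    fullyConnectedLayer(1,"Name","QValue")];
criticNetwork = layerGraph(statePath);
criticNetwork = addLayers(criticNetwork,actionPath);
criticNetwork = addLayers(criticNetwork,commonPath);
% Both inputs feed the addition layer, so Q(s,a) depends on state AND action
criticNetwork = connectLayers(criticNetwork,"fc_state2","add/in1");
criticNetwork = connectLayers(criticNetwork,"fc_action","add/in2");
So the action is not "coming after hidden layers" in the usual feedforward sense; it is a second input whose own path simply merges with the state path partway through the network.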
You can visualize neural networks in two ways:
figure
plot(criticNetwork)
or
deepNetworkDesigner
then load criticNetwork from the workspace to see an interactive representation of the critic.
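If you want to try the sketch above end to end, a hedged example of wrapping it into a critic and agent would look like the following (this assumes a recent release; older releases used rlQValueRepresentation instead of rlQValueFunction):
% Predefined cart-pole environment with a discrete (scalar) action
env = rlPredefinedEnv("CartPole-Discrete");
obsInfo = getObservationInfo(env);
actInfo = getActionInfo(env);
% Tell the critic which network input is the observation and which is the action
critic = rlQValueFunction(criticNetwork,obsInfo,actInfo, ...
    "ObservationInputNames","state","ActionInputNames","action");
agent = rlDQNAgent(critic);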
