Applications

Examples of how to apply reinforcement learning

Reinforcement learning can be applied to a variety of problems in different fields, such as control, robotics, scheduling, optimization, and finance. Here are some examples.

Tutorials

Train Agents to Perform Control Tasks

Control Water Level in a Tank Using a DDPG Agent
Train a controller using reinforcement learning with a plant modeled in Simulink® as the training environment.
Tune Single PI Controller Gains For Multiple Operating Points Using Reinforcement Learning
Tune the gains of a PI controller using a TD3 agent.
Train SAC Agent for Ball Balance Control
Train a SAC agent to balance a ball on a flat surface using a robot arm.
Train Default TD3 Agent to Control Quanser QUBE Pendulum
Train a TD3 agent to balance the Quanser QUBE rotational inverted pendulum.
Train Reinforcement Learning Agent Offline to Control Quanser QUBE Pendulum
Train a TD3 agent offline to control a Quanser QUBE pendulum.
Train TD3 Agent for PMSM Control
Train a TD3 agent to control the currents in a permanent magnet synchronous motor.
Field-Oriented Control of PMSM Using Reinforcement Learning (Motor Control Blockset)
This example shows you how to use the control design method of reinforcement learning to implement field-oriented control (FOC) of a permanent magnet synchronous motor (PMSM).
Train DQN Agent with LSTM Network to Control House Heating System
Train a DQN agent with a recurrent network to control the temperature of a house.
Train Reinforcement Learning Agent with Constraint Enforcement (Simulink Control Design)
Train a reinforcement learning agent with actions constrained using the Constraint Enforcement block.
Create and Train Custom LQR Agent
Create a custom agent that solves an LQR problem and train it using the built-in train function.

Train Agents to Control Robots

Train DDPG Agent to Control Two-Thruster Sliding Vehicle
Train a DDPG agent to control a robot sliding over a frictionless 2-D plane.
Train Default PPO Agent for Discrete Lander Vehicle
Train a default PPO agent to land a flying vehicle with a discrete action space.
Train Soft Actor Critic Agent with Custom Networks for Discrete Lander Vehicle
Train a SAC agent with custom networks to land a flying vehicle with a discrete action space.
Train Biped Robot to Walk Using Reinforcement Learning Agents
Compare DDPG and TD3 agents for the control of a biped walking robot modeled in Simscape™ Multibody™.
Add Safety Constraint to Simulate Two-Link Robot with SAC Agent
Add a high-order barrier function to safely simulate a two-link robot model with a SAC agent.
Train Biped Robot to Walk Using Evolution Strategy-Reinforcement Learning Agents
Train a TD3 agent using an evolution strategy.
Quadruped Robot Locomotion Using DDPG Agent
Train a DDPG agent to control a quadruped walking robot modeled in Simscape Multibody.

Generate Rewards from Control Specifications

Generate Reward Function from a Model Predictive Controller for a Servomotor
Generate a reward function from an MPC controller applied to a servomotor and use it to train a TD3 agent.
Generate Reward Function from a Model Verification Block for a Water Tank System
Generate a reward function from a model verification block applied to a water tank system and use it to train a TD3 agent.
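
As a rough illustration of the idea behind these two examples, a control specification can be turned into a reward by negating a cost or by penalizing a bound violation. The Python sketch below is an assumption-laden illustration, not the toolbox workflow: the function names, weights, bounds, and penalty value are all hypothetical, and the linked examples themselves use MATLAB and Simulink.

```python
def quadratic_reward(error, control, q_weight=1.0, r_weight=0.1):
    """Reward as the negative of an MPC-style quadratic stage cost.

    error: tracking error (reference minus output), scalar
    control: control effort, scalar
    q_weight, r_weight: hypothetical cost weights chosen for illustration
    """
    cost = q_weight * error ** 2 + r_weight * control ** 2
    return -cost


def bounds_reward(signal, lower=0.0, upper=10.0, penalty=-1.0):
    """Reward from a bound check, similar in spirit to a verification block.

    Returns zero while the signal stays inside [lower, upper] and a
    fixed (hypothetical) penalty otherwise.
    """
    return 0.0 if lower <= signal <= upper else penalty


if __name__ == "__main__":
    # Larger errors or control effort give a more negative reward.
    print(quadratic_reward(error=2.0, control=1.0))
    # A water level outside the allowed range is penalized.
    print(bounds_reward(11.0))
```

Negating the cost means that an agent maximizing reward is pushed toward the same behavior that a controller minimizing that cost would prefer.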

Imitation Learning

Imitate MPC Controller for Lane Keeping Assist
Train a deep neural network to imitate the behavior of a model predictive controller within a lane keeping assist system.
Imitate Nonlinear MPC Controller for Sliding Robot
Train a deep neural network to imitate the behavior of a nonlinear model predictive controller for a robot sliding on a 2-D frictionless plane.
Train DDPG Agent with Pretrained Actor Network
Train a DDPG agent using an actor network that has been previously trained using supervised learning.

Train Agents for Automotive Applications

Train DQN Agent for Lane Keeping Assist
Train a DQN agent for a lane keeping assist application.
Train PPO Agent with Curriculum Learning for a Lane Keeping Application
Train a PPO agent for a lane keeping assist task by gradually increasing task complexity.
Train DDPG Agent for Adaptive Cruise Control
Train a DDPG agent for an adaptive cruise control application.
Train DDPG Agent for Path-Following Control
Train a DDPG agent for lane following control.
Train Multiple Agents for Path Following Control
Train a DQN and a DDPG agent to collaboratively perform adaptive cruise control and lane keeping assist to follow a path.
Train Hybrid SAC Agent for Path-Following Control
Train a hybrid SAC agent for lane following control.
Train Hybrid-Action PPO Agent for Path-Following Control
Train a hybrid PPO agent for lane following control.
Train PPO Agent for Automatic Parking Valet
Train a discrete action space PPO agent to park a car in an open parking space.

Contextual Bandit Problems

Train Reinforcement Learning Agent for Simple Contextual Bandit Problem
Train Q and DQN agents to solve a contextual bandit problem.
Why Solving Regression Using Reinforcement Learning is Not Recommended
Using a reinforcement learning agent to solve a regression problem is possible but not recommended.
Why Solving Classification Using Reinforcement Learning Is Not Recommended
Using a reinforcement learning agent to solve a classification problem is possible but not recommended.
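
For readers unfamiliar with the term, a contextual bandit is a one-step problem: the agent observes a context, picks an action, receives a reward, and the episode ends. The Python sketch below shows a tabular Q-value agent with epsilon-greedy exploration on a tiny, made-up reward table. It is a minimal illustration under stated assumptions, not the toolbox implementation; the function names, hyperparameters, and reward table are hypothetical.

```python
import random


def train_contextual_bandit(reward_table, episodes=2000, alpha=0.1,
                            epsilon=0.1, seed=0):
    """Learn Q-values for a contextual bandit with epsilon-greedy exploration.

    reward_table[context][action] is the (deterministic) reward, an
    assumption made here to keep the sketch small.
    """
    rng = random.Random(seed)
    n_contexts = len(reward_table)
    n_actions = len(reward_table[0])
    q = [[0.0] * n_actions for _ in range(n_contexts)]

    for _ in range(episodes):
        context = rng.randrange(n_contexts)
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)  # explore
        else:
            row = q[context]
            action = max(range(n_actions), key=lambda a: row[a])  # exploit
        reward = reward_table[context][action]
        # One-step episode: there is no next state, so the update
        # target is the immediate reward.
        q[context][action] += alpha * (reward - q[context][action])
    return q


def greedy_policy(q):
    """Return the highest-valued action for each context."""
    return [max(range(len(row)), key=lambda a: row[a]) for row in q]


if __name__ == "__main__":
    table = [[1.0, 0.0], [0.0, 1.0]]  # best action equals the context index
    print(greedy_policy(train_contextual_bandit(table)))
```

Because the best action here is a fixed function of the context, the same mapping could be fit directly with supervised learning, which is the point the regression and classification entries above make.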

Other Applications

Train Agent to Play Turn-Based Game
Train a DQN agent to play a turn-based game.
Deep Reinforcement Learning for Optimal Trade Execution
This example shows how to use the Reinforcement Learning Toolbox™ and Deep Learning Toolbox™ to design agents for optimal trade execution.
Multiperiod Goal-Based Wealth Management Using Reinforcement Learning
This example shows a reinforcement learning (RL) approach to maximize the probability of obtaining an investor's wealth goal at the end of the investment horizon.
Train DQN Agent for Beam Selection (5G Toolbox)
Train a deep Q-network (DQN) reinforcement learning agent for beam selection in a 5G new radio communications system. (Since R2022b)
Water Distribution System Scheduling Using Reinforcement Learning
Train a DQN agent to optimally activate pumps in a water distribution system.

Featured Examples

Identify Vulnerabilities in DC Microgrids

Train a TD3 agent to attack a cyber-physical system to identify vulnerabilities.

Optimizing Queue Selection Strategies Using Reinforcement Learning

Train a DQN agent to optimally route customers through a multi-queue checkout system.

Automatic Parking Valet with Unreal Engine Simulation

Use a TD3 agent with an MPC controller to perform a parking maneuver.
