rlFunctionEnv

Specify custom reinforcement learning environment dynamics using functions

Since R2019a

Description

Use rlFunctionEnv to define a custom reinforcement learning environment. You provide MATLAB® functions that define the step and reset behavior for the environment. This object is useful when you want to customize your environment beyond the predefined environments available with rlPredefinedEnv.

Creation

Description

env = rlFunctionEnv(observationInfo,actionInfo,stepfcn,resetfcn) creates a reinforcement learning environment using the provided observation and action specifications, observationInfo and actionInfo, respectively. You also set the StepFcn and ResetFcn properties using MATLAB functions.

Input Arguments

Observation specifications, specified as an rlFiniteSetSpec or rlNumericSpec object or an array containing a mix of such objects. Each element in the array defines the properties of an environment observation channel, such as its dimensions, data type, and name.

You can extract observationInfo from an existing environment or agent using getObservationInfo. You can also construct the specifications manually.
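As a minimal sketch of constructing the specifications manually (the channel names, dimensions, and discrete values here are illustrative assumptions, not part of this page):

```matlab
% Build a two-channel observation specification by hand.
posSpec = rlNumericSpec([4 1]);        % continuous channel: 4-element column vector
posSpec.Name = "states";
modeSpec = rlFiniteSetSpec([1 2 3]);   % discrete channel with three valid values
modeSpec.Name = "mode";
obsInfo = [posSpec; modeSpec];         % array containing a mix of both spec types
```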

Action specifications, specified either as an rlFiniteSetSpec (for discrete action spaces) or rlNumericSpec (for continuous action spaces) object. This object defines the properties of the environment action channel, such as its dimensions, data type, and name.

Note

Only one action channel is allowed.

You can extract actionInfo from an existing environment or agent using getActionInfo. You can also construct the specifications manually.

Properties

Step behavior for the environment, specified as a function name, function handle, or anonymous function handle.

The step function that you provide describes how the environment advances to the next state in response to a given action. When you specify StepFcn as a function name or function handle, the function must have two inputs and four outputs, as illustrated by the following signature.

[Observation,Reward,IsDone,LoggedSignals] = myStepFunction(Action,LoggedSignals)

To use additional input arguments beyond the required set, specify StepFcn using an anonymous function handle.

The step function computes the values of the observation and reward for the given action in the environment. The required input and output arguments are as follows.

  • Action — Current action, which must match the dimensions and data type specified in actionInfo.

  • Observation — Returned observation, which must match the dimensions and data types specified in observationInfo.

  • Reward — Reward for the current step, returned as a scalar value.

  • IsDone — Logical value indicating whether to end the simulation episode. The step function that you define can include logic to decide whether to end the simulation based on the observation, reward, or any other values.

  • LoggedSignals — Any data that you want to pass from one step to the next, specified as a structure.

For an example showing multiple ways to define a step function, see Create MATLAB Environment Using Custom Functions.
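As a minimal sketch, the following step function implements a toy one-dimensional integrator. The dynamics, reward, and termination rule are illustrative assumptions, and the code assumes a scalar observation specification such as rlNumericSpec([1 1]).

```matlab
function [Observation,Reward,IsDone,LoggedSignals] = myStepFunction(Action,LoggedSignals)
% Toy one-dimensional dynamics for illustration only.
state = LoggedSignals.State;     % state carried over from the previous step
state = state + 0.1*Action;      % advance the state using the applied action
Observation = state;             % must match the dimensions in observationInfo
Reward = -abs(state);            % reward the agent for staying near zero
IsDone = abs(state) > 10;        % terminate the episode if the state diverges
LoggedSignals.State = state;     % carry the updated state into the next step
end
```

If the step computation needs parameters beyond the two required inputs, capture them in an anonymous function handle, for example StepFcn specified as @(action,ls) myStepFunction(action,ls,Ts), where Ts is a workspace variable captured by the handle.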

Reset behavior for the environment, specified as a function name, function handle, or anonymous function handle.

The reset function that you provide must have no inputs and two outputs, as illustrated by the following signature.

[InitialObservation,LoggedSignals] = myResetFunction

To use input arguments with your reset function, specify ResetFcn using an anonymous function handle.

The reset function sets the environment to an initial state and computes the initial values of the observation signals. For example, you can create a reset function that randomizes certain state values, such that each training episode begins from different initial conditions.

The sim function calls the reset function to reset the environment at the start of each simulation, and the train function calls it at the start of each training episode.

The InitialObservation output must match the dimensions and data type of observationInfo.

To pass information from the reset condition into the first step, specify that information in the reset function as the output structure LoggedSignals.

For an example showing multiple ways to define a reset function, see Create MATLAB Environment Using Custom Functions.
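As a minimal sketch (the initialization logic is an illustrative assumption, and a scalar observation specification is assumed), a reset function that randomizes the initial state might look like this:

```matlab
function [InitialObservation,LoggedSignals] = myResetFunction
% Illustrative reset: start each episode from a small random state.
state = 0.1*(rand - 0.5);        % randomized initial condition
InitialObservation = state;      % must match observationInfo
LoggedSignals.State = state;     % seed the data passed into the first step
end
```

Because each episode starts from a different random state, training exposes the agent to varied initial conditions.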

Information to pass to the next step, specified as a structure. When you create the environment, whatever you define as the LoggedSignals output of ResetFcn initializes this property. When a step occurs, the software populates this property with data to pass to the next step, as defined in StepFcn.

Object Functions

getActionInfo - Obtain action data specifications from reinforcement learning environment, agent, or experience buffer
getObservationInfo - Obtain observation data specifications from reinforcement learning environment, agent, or experience buffer
train - Train reinforcement learning agents within a specified environment
sim - Simulate trained reinforcement learning agents within specified environment
validateEnvironment - Validate custom reinforcement learning environment
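Before training, you can check that a custom environment's step and reset functions return outputs consistent with its specifications by calling validateEnvironment. Here, env stands for an environment created with rlFunctionEnv.

```matlab
% Runs a short simulation to check that the reset and step outputs
% are consistent with the observation and action specifications.
validateEnvironment(env)
```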

Examples

Create a reinforcement learning environment by supplying custom dynamic functions in MATLAB®. Using rlFunctionEnv, you can create a MATLAB reinforcement learning environment from an observation specification, action specification, and step and reset functions that you define.

For this example, create an environment that represents a system for balancing a cart on a pole. The observations from the environment are the cart position, cart velocity, pendulum angle, and pendulum angle derivative. (For additional details about this environment, see Create MATLAB Environment Using Custom Functions.) Create an observation specification for those signals.

oinfo = rlNumericSpec([4 1]);
oinfo.Name = "CartPole States";
oinfo.Description = "x, dx, theta, dtheta";

The environment has a discrete action space where the agent can apply one of two possible force values to the cart, –10 N or 10 N. Create the action specification for those actions.

ActionInfo = rlFiniteSetSpec([-10 10]);
ActionInfo.Name = "CartPole Action";

Next, specify the custom step and reset functions. For this example, use the supplied functions myResetFunction.m and myStepFunction.m. For details about these functions and how they are constructed, see Create MATLAB Environment Using Custom Functions.

Construct the custom environment using the defined observation specification, action specification, and function names.

env = rlFunctionEnv(oinfo,ActionInfo,"myStepFunction","myResetFunction");

You can create agents for env and train them within the environment as you would for any other reinforcement learning environment.

As an alternative to using function names, you can specify the functions as function handles. For more details and an example, see Create MATLAB Environment Using Custom Functions.
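For instance, the function-handle form of the construction above is:

```matlab
env = rlFunctionEnv(oinfo,ActionInfo,@myStepFunction,@myResetFunction);
```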

Version History

Introduced in R2019a