rlBehaviorCloningRegularizerOptions
Description
Use an rlBehaviorCloningRegularizerOptions object to specify behavioral cloning regularizer options for training a DDPG, TD3, or SAC agent. The only option you can specify is the regularizer weight, which balances the actor loss against the behavioral cloning penalty. The regularizer is mostly useful when training agents offline, specifically to deal with possible differences between the probability distribution of the dataset and the one generated by the environment. To enable the behavioral cloning regularizer when training an agent, set the BatchDataRegularizerOptions property of the agent options object to an rlBehaviorCloningRegularizerOptions object that has your preferred regularizer weight, as shown in the Examples section.
Creation
Syntax
bcOpts = rlBehaviorCloningRegularizerOptions
bcOpts = rlBehaviorCloningRegularizerOptions(Name=Value)
Description
bcOpts = rlBehaviorCloningRegularizerOptions returns a default behavioral cloning regularizer option set.
bcOpts = rlBehaviorCloningRegularizerOptions(Name=Value) creates the behavioral cloning regularizer option set bcOpts and sets its properties using one or more name-value arguments.
Properties
Object Functions
Examples
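The following sketch outlines a typical workflow: create a regularizer option set and attach it to the agent options through the BatchDataRegularizerOptions property named in the description. It assumes a TD3 agent and the RegularizerWeight property name used above.
% Create a behavioral cloning regularizer option set.
% RegularizerWeight is an assumed property name.
bcOpts = rlBehaviorCloningRegularizerOptions(RegularizerWeight=0.25);

% Enable the regularizer by assigning the option set to the
% BatchDataRegularizerOptions property of the agent options.
agentOpts = rlTD3AgentOptions;
agentOpts.BatchDataRegularizerOptions = bcOpts;
An agent created with these options then applies the behavioral cloning penalty, weighted as specified, when trained from batch data.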
References
[1] Fujimoto, Scott, and Shixiang Shane Gu. "A minimalist approach to offline reinforcement learning." Advances in Neural Information Processing Systems 34 (2021): 20132-20145.
Version History
Introduced in R2023a