lstm

Long short-term memory

Syntax

``Y = lstm(X,H0,C0,weights,recurrentWeights,bias)``
``[Y,hiddenState,cellState] = lstm(X,H0,C0,weights,recurrentWeights,bias)``
``[___] = lstm(___,'DataFormat',FMT)``

Description

The long short-term memory (LSTM) operation allows a network to learn long-term dependencies between time steps in time series and sequence data.

Note

This function applies the deep learning LSTM operation to `dlarray` data. If you want to apply an LSTM operation within a `layerGraph` object or `Layer` array, use the `lstmLayer` layer instead.

`Y = lstm(X,H0,C0,weights,recurrentWeights,bias)` applies a long short-term memory (LSTM) calculation to input `X` using the initial hidden state `H0`, initial cell state `C0`, and parameters `weights`, `recurrentWeights`, and `bias`. The input `X` must be a formatted `dlarray`. The output `Y` is a formatted `dlarray` with the same dimension format as `X`, except for any `'S'` dimensions.

The `lstm` function updates the cell and hidden states using the hyperbolic tangent function (tanh) as the state activation function. The `lstm` function uses the sigmoid function given by $\sigma \left(x\right)={\left(1+{e}^{-x}\right)}^{-1}$ as the gate activation function.

`[Y,hiddenState,cellState] = lstm(X,H0,C0,weights,recurrentWeights,bias)` also returns the hidden state and cell state after the LSTM operation.

`[___] = lstm(___,'DataFormat',FMT)` also specifies the dimension format `FMT` when `X` is not a formatted `dlarray`. The output `Y` is an unformatted `dlarray` with the same dimension order as `X`, except for any `'S'` dimensions.
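
To make the roles of the state and gate activation functions concrete, here is a minimal sketch of a single LSTM time step for one observation, written with plain numeric arrays rather than `dlarray`. It assumes the conventional LSTM update equations, and it assumes that the four gate blocks are stacked along the first dimension of the parameters in the order input gate, forget gate, cell candidate, output gate, as described on the `lstmLayer` reference page. The names `sigmoidFcn`, `idx`, and the `*Gate` variables are illustrative and not part of the toolbox.

```matlab
% Sketch of one LSTM time step for a single observation (plain arrays).
% Assumption: gate blocks are stacked as [input; forget; candidate; output].
numFeatures = 10;
numHiddenUnits = 3;

x = randn(numFeatures,1);        % input at the current time step
h = zeros(numHiddenUnits,1);     % previous hidden state
c = zeros(numHiddenUnits,1);     % previous cell state

W = randn(4*numHiddenUnits,numFeatures);     % input weights
R = randn(4*numHiddenUnits,numHiddenUnits);  % recurrent weights
b = randn(4*numHiddenUnits,1);               % bias

sigmoidFcn = @(z) 1./(1 + exp(-z));  % gate activation (illustrative name)

z = W*x + R*h + b;  % pre-activations for all four blocks

% One column of row indices per block: 1:3, 4:6, 7:9, 10:12.
idx = reshape(1:4*numHiddenUnits,numHiddenUnits,4);

inputGate     = sigmoidFcn(z(idx(:,1)));
forgetGate    = sigmoidFcn(z(idx(:,2)));
cellCandidate = tanh(z(idx(:,3)));   % state activation
outputGate    = sigmoidFcn(z(idx(:,4)));

cNew = forgetGate.*c + inputGate.*cellCandidate;  % updated cell state
hNew = outputGate.*tanh(cNew);                    % updated hidden state
```

Repeating this step over every time step, and carrying `hNew` and `cNew` forward, gives the sequence of hidden states that the operation returns as its output under these assumptions.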

Examples

Perform an LSTM operation using three hidden units.

Create the input sequence data as 32 observations with 10 channels and a sequence length of 64.

```
numFeatures = 10;
numObservations = 32;
sequenceLength = 64;

X = randn(numFeatures,numObservations,sequenceLength);
dlX = dlarray(X,'CBT');
```

Create the initial hidden and cell states with three hidden units. Use the same initial hidden state and cell state for all observations.

```
numHiddenUnits = 3;
H0 = zeros(numHiddenUnits,1);
C0 = zeros(numHiddenUnits,1);
```

Create the learnable parameters for the LSTM operation.

```
weights = dlarray(randn(4*numHiddenUnits,numFeatures),'CU');
recurrentWeights = dlarray(randn(4*numHiddenUnits,numHiddenUnits),'CU');
bias = dlarray(randn(4*numHiddenUnits,1),'C');
```

Perform the LSTM calculation.

`[dlY,hiddenState,cellState] = lstm(dlX,H0,C0,weights,recurrentWeights,bias);`

View the size and dimensions of `dlY`.

`size(dlY)`

```
ans = 1×3
     3    32    64
```

`dlY.dims`

```
ans = 'CBT'
```

View the size of `hiddenState` and `cellState`.

`size(hiddenState)`

```
ans = 1×2
     3    32
```

`size(cellState)`

```
ans = 1×2
     3    32
```

Check that the output `hiddenState` is the same as the last time step of output `dlY`.

```
if extractdata(dlY(:,:,end)) == hiddenState
    disp("The hidden state and the last time step are equal.");
else
    disp("The hidden state and the last time step are not equal.")
end
```

```
The hidden state and the last time step are equal.
```

You can use the hidden state and cell state to keep track of the state of the LSTM operation and input further sequential data.
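
As an illustration of carrying state across calls, the following sketch processes a sequence in two halves and feeds the states returned by the first call into the second. It is self-contained and assumes Deep Learning Toolbox is available. Unlike the example above, it creates the initial states as formatted `'CB'` `dlarray` objects with one column per observation, so that the returned states have the same format as the initial ones. The names `dlYFull`, `half`, and `maxDiff` are illustrative.

```matlab
% Assumes Deep Learning Toolbox. Sizes follow the example above.
numFeatures = 10;
numObservations = 32;
sequenceLength = 64;
numHiddenUnits = 3;

dlX = dlarray(randn(numFeatures,numObservations,sequenceLength),'CBT');

% Formatted initial states with one column per observation.
H0 = dlarray(zeros(numHiddenUnits,numObservations),'CB');
C0 = dlarray(zeros(numHiddenUnits,numObservations),'CB');

weights = dlarray(randn(4*numHiddenUnits,numFeatures),'CU');
recurrentWeights = dlarray(randn(4*numHiddenUnits,numHiddenUnits),'CU');
bias = dlarray(randn(4*numHiddenUnits,1),'C');

% Reference: process the whole sequence in one call.
dlYFull = lstm(dlX,H0,C0,weights,recurrentWeights,bias);

% Process the first half, then pass the returned states to the second half.
half = sequenceLength/2;
[dlY1,h1,c1] = lstm(dlX(:,:,1:half),H0,C0,weights,recurrentWeights,bias);
dlY2 = lstm(dlX(:,:,half+1:end),h1,c1,weights,recurrentWeights,bias);

% Compare with a tolerance, because floating-point results can differ slightly.
yChunks = cat(3,extractdata(dlY1),extractdata(dlY2));
maxDiff = max(abs(yChunks - extractdata(dlYFull)),[],'all');
```

If the states are carried correctly, `maxDiff` should be at the level of floating-point round-off. The same pattern applies when data arrives in chunks, for example when streaming.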

Input Arguments

`X`: Input data, specified as a formatted `dlarray`, an unformatted `dlarray`, or a numeric array. When `X` is not a formatted `dlarray`, you must specify the dimension label format using `'DataFormat',FMT`. If `X` is a numeric array, at least one of `H0`, `C0`, `weights`, `recurrentWeights`, or `bias` must be a `dlarray`.

`X` must contain a sequence dimension labeled `'T'`. If `X` has any spatial dimensions labeled `'S'`, they are flattened into the `'C'` channel dimension. If `X` does not have a channel dimension, then one is added. If `X` has any unspecified dimensions labeled `'U'`, they must be singleton.

Data Types: `single` | `double`

`H0`: Initial hidden state vector, specified as a formatted `dlarray`, an unformatted `dlarray`, or a numeric array.

If `H0` is a formatted `dlarray`, it must contain a channel dimension labeled `'C'` and optionally a batch dimension labeled `'B'` with the same size as the `'B'` dimension of `X`. If `H0` does not have a `'B'` dimension, the function uses the same hidden state vector for each observation in `X`.

The size of the `'C'` dimension determines the number of hidden units. The size of the `'C'` dimension of `H0` must be equal to the size of the `'C'` dimension of `C0`.

If `H0` is not a formatted `dlarray`, the size of the first dimension determines the number of hidden units and must be the same size as the first dimension or the `'C'` dimension of `C0`.

Data Types: `single` | `double`

`C0`: Initial cell state vector, specified as a formatted `dlarray`, an unformatted `dlarray`, or a numeric array.

If `C0` is a formatted `dlarray`, it must contain a channel dimension labeled `'C'` and optionally a batch dimension labeled `'B'` with the same size as the `'B'` dimension of `X`. If `C0` does not have a `'B'` dimension, the function uses the same cell state vector for each observation in `X`.

The size of the `'C'` dimension determines the number of hidden units. The size of the `'C'` dimension of `C0` must be equal to the size of the `'C'` dimension of `H0`.

If `C0` is not a formatted `dlarray`, the size of the first dimension determines the number of hidden units and must be the same size as the first dimension or the `'C'` dimension of `H0`.

Data Types: `single` | `double`

`weights`: Weights, specified as a formatted `dlarray`, an unformatted `dlarray`, or a numeric array.

Specify `weights` as a matrix of size `4*NumHiddenUnits`-by-`InputSize`, where `NumHiddenUnits` is the size of the `'C'` dimension of both `C0` and `H0`, and `InputSize` is the size of the `'C'` dimension of `X` multiplied by the size of each `'S'` dimension of `X`, where present.

If `weights` is a formatted `dlarray`, it must contain a `'C'` dimension of size `4*NumHiddenUnits` and a `'U'` dimension of size `InputSize`.

Data Types: `single` | `double`
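
The sketch below illustrates how `InputSize` is derived when `X` has a spatial dimension. It assumes Deep Learning Toolbox and uses hypothetical sizes (`spatialSize`, `numChannels`) chosen only for illustration. The comments on the last two lines state what the descriptions on this page imply, not output captured from a run.

```matlab
% Assumes Deep Learning Toolbox. Sizes are illustrative.
spatialSize = 4;
numChannels = 10;
numObservations = 32;
sequenceLength = 64;
numHiddenUnits = 3;

% 'SCBT' data: one spatial dimension, then channel, batch, and time.
dlX = dlarray(randn(spatialSize,numChannels,numObservations,sequenceLength),'SCBT');

% InputSize is the 'C' size multiplied by the size of each 'S' dimension.
inputSize = numChannels*spatialSize;

H0 = zeros(numHiddenUnits,1);
C0 = zeros(numHiddenUnits,1);
weights = dlarray(randn(4*numHiddenUnits,inputSize),'CU');
recurrentWeights = dlarray(randn(4*numHiddenUnits,numHiddenUnits),'CU');
bias = dlarray(randn(4*numHiddenUnits,1),'C');

dlY = lstm(dlX,H0,C0,weights,recurrentWeights,bias);

size(dlY)   % per the description: numHiddenUnits-by-numObservations-by-sequenceLength
dims(dlY)   % per the description: the format of X without the 'S' label
```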

`recurrentWeights`: Recurrent weights, specified as a formatted `dlarray`, an unformatted `dlarray`, or a numeric array.

Specify `recurrentWeights` as a matrix of size `4*NumHiddenUnits`-by-`NumHiddenUnits`, where `NumHiddenUnits` is the size of the `'C'` dimension of both `C0` and `H0`.

If `recurrentWeights` is a formatted `dlarray`, it must contain a `'C'` dimension of size `4*NumHiddenUnits` and a `'U'` dimension of size `NumHiddenUnits`.

Data Types: `single` | `double`

`bias`: Bias, specified as a formatted `dlarray`, an unformatted `dlarray`, or a numeric array.

Specify `bias` as a vector of length `4*NumHiddenUnits`, where `NumHiddenUnits` is the size of the `'C'` dimension of both `C0` and `H0`.

If `bias` is a formatted `dlarray`, the nonsingleton dimension must be labeled with `'C'`.

Data Types: `single` | `double`

Dimension order of unformatted input data, specified as the comma-separated pair consisting of `'DataFormat'` and a character array or string `FMT` that provides a label for each dimension of the data. Each character in `FMT` must be one of the following:

• `'S'` — Spatial

• `'C'` — Channel

• `'B'` — Batch (for example, samples and observations)

• `'T'` — Time (for example, sequences)

• `'U'` — Unspecified

You can specify multiple dimensions labeled `'S'` or `'U'`. You can use the labels `'C'`, `'B'`, and `'T'` at most once.

You must specify `'DataFormat',FMT` when the input data is not a formatted `dlarray`.

Example: `'DataFormat','SSCB'`

Data Types: `char` | `string`
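
As a brief sketch of the `'DataFormat'` option, the following passes an unformatted `dlarray` together with plain numeric states and parameters. It assumes Deep Learning Toolbox, and the sizes are illustrative. Because `X` is a `dlarray`, the remaining inputs can be numeric arrays.

```matlab
% Assumes Deep Learning Toolbox. Sizes are illustrative.
numFeatures = 10;
numObservations = 32;
sequenceLength = 64;
numHiddenUnits = 3;

% Unformatted dlarray: the dimension labels are supplied through 'DataFormat'.
X = dlarray(randn(numFeatures,numObservations,sequenceLength));

H0 = zeros(numHiddenUnits,1);
C0 = zeros(numHiddenUnits,1);
weights = randn(4*numHiddenUnits,numFeatures);
recurrentWeights = randn(4*numHiddenUnits,numHiddenUnits);
bias = randn(4*numHiddenUnits,1);

Y = lstm(X,H0,C0,weights,recurrentWeights,bias,'DataFormat','CBT');

% Y is an unformatted dlarray, so dims(Y) returns an empty format.
isUnformatted = isempty(dims(Y));
```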

Output Arguments

`Y`: LSTM output, returned as a `dlarray`. The output `Y` has the same underlying data type as the input `X`.

If the input data `X` is a formatted `dlarray`, `Y` has the same dimension format as `X`, except for any `'S'` dimensions. If the input data is not a formatted `dlarray`, `Y` is an unformatted `dlarray` with the same dimension order as the input data.

The size of the `'C'` dimension of `Y` is the same as the number of hidden units, specified by the size of the `'C'` dimension of `H0` or `C0`.

`hiddenState`: Hidden state vector for each observation, returned as a `dlarray` or a numeric array with the same data type as `H0`.

If the input `H0` is a formatted `dlarray`, then the output `hiddenState` is a formatted `dlarray` with the format `'CB'`.

`cellState`: Cell state vector for each observation, returned as a `dlarray` or a numeric array with the same data type as `C0`.

If the input `C0` is a formatted `dlarray`, the output `cellState` is returned as a formatted `dlarray` with the format `'CB'`.

Limitations

• `functionToLayerGraph` does not support the `lstm` function. If you use `functionToLayerGraph` with a function that contains the `lstm` operation, the resulting `LayerGraph` contains placeholder layers.

For more information about the LSTM operation, see the definition of Long Short-Term Memory Layer on the `lstmLayer` reference page.