Apply Speech Command Recognition Network in Simulink

The previous example showed how to compress the speech command recognition model using pruning and quantization.

All the prior examples have performed speech command recognition in MATLAB®. This example shows how to apply audio preprocessing and speech command recognition using a deep learning network in Simulink®.

The next example shows how to integrate this Simulink speech command recognition model into a smart speaker application.

Speech Command Recognition in Simulink

This example uses a Simulink® model that detects the presence of speech commands in audio. The model uses a pretrained convolutional neural network to recognize a given set of commands.

Speech Command Recognition Model

The model recognizes these speech commands:

  • "yes"

  • "no"

  • "up"

  • "down"

  • "left"

  • "right"

  • "on"

  • "off"

  • "stop"

  • "go"

The model uses a pretrained, pruned convolutional deep learning network. Refer to the example Train Deep Learning Network for Speech Command Recognition for details on the network architecture and how to train it. Refer to the example Prune and Quantize Speech Command Recognition Network for details on compressing this network.

Open the model.

model = "cmdrecog";
open_system(model)

The model breaks the audio stream into one-second overlapping segments. A Bark spectrogram is computed from each segment, and the spectrograms are fed to the pretrained network.
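For reference, the sketch below shows one way to reproduce the same segmentation in MATLAB code using a dsp.AsyncBuffer object. The segment length, overlap, and input file name are illustrative assumptions and are not taken from the model.

fs = 16e3;                                 % assumed sample rate
segmentLength = fs;                        % one-second segments
overlapLength = fs/2;                      % assumed 50% overlap

[audioIn,fs] = audioread("speech.wav");    % hypothetical input file
buff = dsp.AsyncBuffer(numel(audioIn));
write(buff,audioIn);

while buff.NumUnreadSamples >= segmentLength - overlapLength
    % Re-read the overlapping samples at the start of each new segment.
    segment = read(buff,segmentLength,overlapLength);
    % Compute the Bark spectrogram of this segment and pass it to the
    % network (see the feature extraction sketch in the next section).
end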

Use the manual switch to select either a live audio stream from your microphone or commands read from prerecorded audio files. For commands on file, use the rotary switch to select one of three commands (Go, Yes, or Stop).

Auditory Spectrogram Extraction

The deep learning network was trained on auditory spectrograms computed using an audioFeatureExtractor. The Auditory Spectrogram block in the model is configured to extract the same features that the network was trained on.
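As a point of reference, a comparable extractor can be set up in MATLAB as sketched below. The window length, overlap, and log offset are illustrative assumptions; the exact values used to train the network are given in the training example.

fs = 16e3;                                 % assumed sample rate
afe = audioFeatureExtractor( ...
    SampleRate=fs, ...
    Window=hann(512,"periodic"), ...       % assumed window
    OverlapLength=352, ...                 % assumed overlap
    barkSpectrum=true);

segment = randn(fs,1);                     % stand-in for one second of audio
features = extract(afe,segment);           % numFrames-by-numBands matrix
features = log10(features + 1e-6);         % assumed log compression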

Run the Model

Simulate the model for 20 seconds.

set_param(model,StopTime="20");
open_system(model + "/Time Scope")
sim(model);

The recognized command is shown in the Display block. The network activations, which indicate the level of confidence in each of the supported commands, are displayed in a time scope.
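The sketch below illustrates one way such activations could be turned into a decision, assuming one score per command and a simple confidence threshold. The threshold value and the handling of low-confidence segments are illustrative assumptions, not the logic implemented in the model.

commands = ["yes" "no" "up" "down" "left" "right" "on" "off" "stop" "go"];
scores = rand(1,numel(commands));          % stand-in for network activations
scores = scores/sum(scores);
threshold = 0.7;                           % illustrative decision threshold

[bestScore,idx] = max(scores);
if bestScore > threshold
    recognizedCommand = commands(idx)
else
    recognizedCommand = "command not recognized"
end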

Close the model.

close_system(model,0)

Previous Step

Prune and Quantize Speech Command Recognition Network

Next Step

Apply Speech Command Recognition Network in Smart Speaker Simulink Model
