
Audio Processing Using Deep Learning

Extend deep learning workflows with audio and speech processing applications

Apply deep learning to audio and speech processing applications by using Deep Learning Toolbox™ together with Audio Toolbox™.


Audio Labeler: Define and visualize ground-truth labels


audioDatastore: Datastore for collection of audio files
audioDataAugmenter: Augment audio data
audioFeatureExtractor: Streamline audio feature extraction
vggishFeatures: Extract VGGish features
vggish: VGGish neural network
yamnet: YAMNet neural network
yamnetGraph: Graph of YAMNet AudioSet ontology
classifySound: Classify sounds in audio signal
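
As a rough illustration of how the first few functions above fit together, the following sketch builds a tiny labeled dataset on disk, reads it through `audioDatastore`, and extracts features with `audioFeatureExtractor`. The folder names (`tone`, `noise`), the synthetic signals, and the choice of features are assumptions made only to keep the example self-contained; they do not come from the toolbox documentation.

```matlab
% Create a small temporary dataset so the sketch needs no external files.
% The class folders "tone" and "noise" are hypothetical labels.
fs = 16000;                          % sample rate in Hz (assumed)
t = (0:fs-1)'/fs;                    % one second of samples
root = tempname;
mkdir(fullfile(root,"tone"));
mkdir(fullfile(root,"noise"));
audiowrite(fullfile(root,"tone","a.wav"),0.5*sin(2*pi*440*t),fs);
audiowrite(fullfile(root,"noise","b.wav"),0.1*randn(fs,1),fs);

% Point a datastore at the folder and take labels from subfolder names.
ads = audioDatastore(root, ...
    "IncludeSubfolders",true, ...
    "LabelSource","foldernames");
labelCounts = countEachLabel(ads);   % table of labels and file counts

% Configure a feature extractor; the features chosen here are arbitrary.
afe = audioFeatureExtractor( ...
    "SampleRate",fs, ...
    "mfcc",true, ...
    "spectralCentroid",true);

% Read each file and extract a hops-by-features matrix from it.
features = cell(numel(ads.Files),1);
k = 0;
while hasdata(ads)
    [x,info] = read(ads);            % info carries SampleRate and Label
    k = k + 1;
    features{k} = extract(afe,x);
end

% Remove the temporary dataset.
rmdir(root,"s");
```

With the pretrained YAMNet model installed, calling `classifySound(x,fs)` on one of the signals returns the sound classes detected in it, without any training step.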


Introduction to Deep Learning for Audio Applications (Audio Toolbox)

Learn common tools and workflows to apply deep learning to audio applications.

Classify Sound Using Deep Learning (Audio Toolbox)

Train, validate, and test a simple long short-term memory (LSTM) network to classify sounds.
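
A minimal sketch of what such a network might look like is shown below. It is not the linked example: the data are random stand-in feature sequences, and the layer sizes, class names, and training options are assumptions chosen only so the code runs quickly. It also uses the `trainNetwork` workflow, which newer releases may replace with other training functions.

```matlab
% Synthetic stand-in data: each observation is a numFeatures-by-T sequence.
% In a real workflow these would be features extracted from audio.
rng(0);
numFeatures = 13;
numObs = 40;
X = cell(numObs,1);
labels = strings(numObs,1);
for k = 1:numObs
    T = randi([20 30]);              % sequences may differ in length
    if mod(k,2) == 0
        X{k} = randn(numFeatures,T) + 1;
        labels(k) = "classA";        % hypothetical class name
    else
        X{k} = randn(numFeatures,T) - 1;
        labels(k) = "classB";        % hypothetical class name
    end
end
Y = categorical(labels);

% A small LSTM classifier: the LSTM layer emits only its last time step,
% which the fully connected layer maps to the two classes.
layers = [
    sequenceInputLayer(numFeatures)
    lstmLayer(50,"OutputMode","last")
    fullyConnectedLayer(2)
    softmaxLayer
    classificationLayer];

options = trainingOptions("adam", ...
    "MaxEpochs",5, ...
    "MiniBatchSize",8, ...
    "Shuffle","every-epoch", ...
    "Verbose",false);

net = trainNetwork(X,Y,layers,options);

% Predict on the training sequences, only to show the call.
predicted = classify(net,X);
```

A real experiment would hold out separate validation and test sets rather than predicting on the training data as this sketch does.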

Transfer Learning with Pretrained Audio Networks (Audio Toolbox)

Use transfer learning to retrain YAMNet, a pretrained convolutional neural network (CNN), to classify a new set of audio signals.
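
The core of this kind of transfer learning is swapping the network's final layers for ones sized to the new classes. The sketch below shows one way that step might be written. It assumes the YAMNet model files are installed and that `yamnet` returns a `SeriesNetwork`, which may not hold in every release. The class names and the new layer names (`new_fc`, `new_output`) are hypothetical.

```matlab
% Load the pretrained network (requires the YAMNet model download).
net = yamnet;

% Hypothetical classes for the new task.
classes = ["alarm" "speech" "silence"];
numClasses = numel(classes);

% Convert the layer array to a layer graph so layers can be replaced.
lgraph = layerGraph(net.Layers);

% Find the last fully connected layer and the output layer by type and
% position instead of assuming their names.
layers = net.Layers;
isFC = arrayfun(@(l) isa(l,"nnet.cnn.layer.FullyConnectedLayer"),layers);
fcName = layers(find(isFC,1,"last")).Name;
outName = layers(end).Name;

% Replace both layers with versions sized for the new classes.
lgraph = replaceLayer(lgraph,fcName, ...
    fullyConnectedLayer(numClasses,"Name","new_fc"));
lgraph = replaceLayer(lgraph,outName, ...
    classificationLayer("Name","new_output","Classes",classes));
```

The modified graph would then be trained on mel spectrograms in the format YAMNet expects, which Audio Toolbox can produce with `yamnetPreprocess`.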
