Sample Data Sets for Shallow Neural Networks
The Deep Learning Toolbox™ contains a number of sample data sets that you can use to experiment with shallow neural networks. To view the data sets that are available, use the following command:
help nndatasets
  Neural Network Datasets
  -----------------------

  Function Fitting, Function approximation and Curve fitting.

  Function fitting is the process of training a neural network on a
  set of inputs in order to produce an associated set of target outputs.
  Once the neural network has fit the data, it forms a generalization of
  the input-output relationship and can be used to generate outputs for
  inputs it was not trained on.

   simplefit_dataset     - Simple fitting dataset.
   abalone_dataset       - Abalone shell rings dataset.
   bodyfat_dataset       - Body fat percentage dataset.
   building_dataset      - Building energy dataset.
   chemical_dataset      - Chemical sensor dataset.
   cho_dataset           - Cholesterol dataset.
   engine_dataset        - Engine behavior dataset.
   vinyl_dataset         - Vinyl bromide dataset.

  ----------

  Pattern Recognition and Classification

  Pattern recognition is the process of training a neural network to assign
  the correct target classes to a set of input patterns. Once trained the
  network can be used to classify patterns it has not seen before.

   simpleclass_dataset     - Simple pattern recognition dataset.
   cancer_dataset          - Breast cancer dataset.
   crab_dataset            - Crab gender dataset.
   glass_dataset           - Glass chemical dataset.
   iris_dataset            - Iris flower dataset.
   ovarian_dataset         - Ovarian cancer dataset.
   thyroid_dataset         - Thyroid function dataset.
   wine_dataset            - Italian wines dataset.
   digitTrain4DArrayData   - Synthetic handwritten digit dataset for
                             training in form of 4-D array.
   digitTrainCellArrayData - Synthetic handwritten digit dataset for
                             training in form of cell array.
   digitTest4DArrayData    - Synthetic handwritten digit dataset for
                             testing in form of 4-D array.
   digitTestCellArrayData  - Synthetic handwritten digit dataset for
                             testing in form of cell array.
   digitSmallCellArrayData - Subset of the synthetic handwritten digit
                             dataset for training in form of cell array.

  ----------

  Clustering, Feature extraction and Data dimension reduction

  Clustering is the process of training a neural network on patterns
  so that the network comes up with its own classifications according
  to pattern similarity and relative topology. This is useful for gaining
  insight into data, or simplifying it before further processing.

   simplecluster_dataset - Simple clustering dataset.

  The inputs of fitting or pattern recognition datasets may also clustered.

  ----------

  Input-Output Time-Series Prediction, Forecasting, Dynamic modeling
  Nonlinear autoregression, System identification and Filtering

  Input-output time series problems consist of predicting the next value
  of one time series given another time series. Past values of both series
  (for best accuracy), or only one of the series (for a simpler system)
  may be used to predict the target series.

   simpleseries_dataset  - Simple time series prediction dataset.
   simplenarx_dataset    - Simple time series prediction dataset.
   exchanger_dataset     - Heat exchanger dataset.
   maglev_dataset        - Magnetic levitation dataset.
   ph_dataset            - Solution PH dataset.
   pollution_dataset     - Pollution mortality dataset.
   refmodel_dataset      - Reference model dataset
   robotarm_dataset      - Robot arm dataset
   valve_dataset         - Valve fluid flow dataset.

  ----------

  Single Time-Series Prediction, Forecasting, Dynamic modeling,
  Nonlinear autoregression, System identification, and Filtering

  Single time series prediction involves predicting the next value of
  a time series given its past values.

   simplenar_dataset     - Simple single series prediction dataset.
   chickenpox_dataset    - Monthly chickenpox instances dataset.
   ice_dataset           - Global ice volume dataset.
   laser_dataset         - Chaotic far-infrared laser dataset.
   oil_dataset           - Monthly oil price dataset.
   river_dataset         - River flow dataset.
   solar_dataset         - Sunspot activity dataset
Notice that all of the data sets have file names of the form name_dataset. Inside these files will be the arrays nameInputs and nameTargets. You can load a data set into the workspace with a command such as
load simplefit_dataset
This will load simplefitInputs and simplefitTargets into the workspace. If you want to load the input and target arrays into different names, you can use a command such as
[x,t] = simplefit_dataset;
This will load the inputs and targets into the arrays x and t. You can get a description of a data set with a command such as
help maglev_dataset
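The loading conventions described above can be combined into a short experiment. The following is a minimal sketch, not a recommended workflow: the hidden layer size of 10, the variable names S, net, y, and mseValue, and the choice to suppress the training window are all assumptions made for illustration.

% Load into a structure to inspect the variable names stored in the
% file without adding them to the workspace.
S = load('simplefit_dataset');
disp(fieldnames(S))   % should follow the nameInputs/nameTargets convention

% Function form: return the inputs and targets under names of your choice.
[x,t] = simplefit_dataset;
fprintf('inputs: %d-by-%d, targets: %d-by-%d\n', size(x), size(t));

% Assumption: a shallow fitting network with 10 hidden neurons.
net = fitnet(10);
net.trainParam.showWindow = false;   % assumption: train without the GUI window
net = train(net,x,t);

% Evaluate the trained network on the same inputs and measure its
% performance (mean squared error by default for fitnet).
y = net(x);
mseValue = perform(net,t,y);
fprintf('performance on the full data set: %g\n', mseValue);

Because train divides the data randomly and initializes the weights randomly, the reported performance will generally differ from run to run.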
See Also
Neural Net Fitting | Neural Net Clustering | Neural Net Pattern Recognition | Neural Net Time Series