What is the best CNN for a small dataset?

I have a dataset of around 370 images of people, and I want to classify their expressions. Should I build my CNN from scratch? How many hidden layers should I aim for?

Accepted Answer

Alpha Bravo
Alpha Bravo on 2 Jul 2018
Edited: Alpha Bravo on 2 Jul 2018
Building a CNN from scratch isn't too hard. How many expressions are you trying to classify? With only 370 images, even with image augmentation, you won't be able to build a very big dataset, and it will be difficult to classify many different expression types. You will want to start with a small CNN, only a couple of layers, otherwise you will get overfitting like there is no tomorrow (that is, your network will memorize all the images you gave it rather than learning to tell the difference between expressions).
Here is how I would go about doing this. Let's say you want to tell the difference between happy and sad expressions. Organize your photos into two folders, one for each type of expression. You will need to put them there manually, but with only 370 images this won't be too hard. (For my project I've been having to classify over 14,000 images and I've got a lot more to do still; automating this process for larger datasets is a topic for another time.) Now, here is some basic code to get you started. Note that you will need the Neural Network Toolbox, and the Parallel Computing Toolbox is optional but strongly recommended.
You will either need to resize all of your images to the same resolution or use the augmentedImageDatastore to do it for you, but note that the augmentedImageDatastore is a bit slower because it has to resize images on the fly, rather than just reading them from the disk. The below example will assume you want to use the augmentedImageDatastore, but it is easy to take out if you want to.
I forgot what doesn't work on R2017b vs R2018a so let me know if the below code doesn't work.
% first, the hyperparameters, you will need to play with these
train_percent = 0.90; % amount from each label to use in training, if you want to do cross validation let me know
% test perc = 1 - train
mini_batch_size = 256; % more is faster, limited by GPU memory if you have a GPU
max_epochs = 30;
initial_learn_rate = 0.001;
learn_rate_drop_period = 30; % reduce this if you decide you want to drop the learn rate
learn_rate_drop_factor = 0.1;
momentum = 0.9;
l2reg = 0.00001;
validation_freq = 256; % in iterations
validation_patience = 3; % early stopping
verbose = false;
augmentedResolution = [128 128]; % or whatever image resolution you want to use
inputResolution = augmentedResolution;
inputResolution(3) = 3; % color dimension, set to 1 for black/white images
layers = [imageInputLayer(inputResolution);
convolution2dLayer(3,32,'Stride',1,'Padding',1); % 3 is the filter size and 32 is the number of filters (set this to whatever positive integer you want; more filters means more computation, memory, and parameters, so a greater chance of overfitting); stride and padding of 1 keep the output resolution the same as the input resolution
batchNormalizationLayer(); % available from R2017b onward
reluLayer();
maxPooling2dLayer(2,'Stride',2);
dropoutLayer(0.5); % reduces overfitting
fullyConnectedLayer(2); % set "2" to number of classes
softmaxLayer();
classificationLayer()];
augmenter = imageDataAugmenter('RandRotation', [-10 10]); % optional, used to augment data, see documentation for full options
% now getting everything in place and ready to run
datastore = imageDatastore(fullfile('.'), 'IncludeSubfolders', true, 'LabelSource', 'foldernames'); % you need to run MATLAB from the directory where your folders are located for this to work
[trainStore, validStore] = splitEachLabel(datastore, train_percent);
trainStoreAug = augmentedImageDatastore(augmentedResolution, trainStore, 'DataAugmentation', augmenter);
validStoreAug = augmentedImageDatastore(augmentedResolution, validStore); % validation images must be resized to match the input layer too
options = trainingOptions('sgdm', 'MiniBatchSize', mini_batch_size, ...
'LearnRateSchedule', 'piecewise', 'MaxEpochs', max_epochs, 'InitialLearnRate', initial_learn_rate, ...
'LearnRateDropPeriod', learn_rate_drop_period, ...
'LearnRateDropFactor', learn_rate_drop_factor, ...
'L2Regularization', l2reg, 'Momentum', momentum, ...
'Verbose', verbose, 'VerboseFrequency', validation_freq, ...
'ValidationFrequency', validation_freq, 'ValidationData', validStoreAug, ...
'ValidationPatience', validation_patience, 'Plots', 'training-progress');
convnet = trainNetwork(trainStoreAug, layers, options);
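Once training finishes, it's worth checking how the network does on the held-out images. This evaluation snippet is my addition (not part of the code above) and assumes the variables from the snippet above are still in the workspace:

```matlab
% resize the validation images the same way as the training images
validResized = augmentedImageDatastore(augmentedResolution, validStore);
predictedLabels = classify(convnet, validResized);
accuracy = mean(predictedLabels == validStore.Labels);
fprintf('Validation accuracy: %.1f%%\n', 100*accuracy);
% the confusion matrix shows per-class mistakes, which matters more than
% raw accuracy when the classes are imbalanced
confusionmat(validStore.Labels, predictedLabels)
```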
  2 Comments
Suparna Kumar
Suparna Kumar on 3 Jul 2018
Hey, thanks for answering in great detail. I'm looking to classify the images into 4 expressions, and the number of images per expression isn't the same. For example, I've got 125 happy images and 35 surprised images. Thanks for suggesting what code to use. I had actually used 3 convolution layers and ended up with a lot of overfitting.
Alpha Bravo
Alpha Bravo on 6 Jul 2018
With an imbalanced dataset, things get a bit trickier. In your example, if you had 160 images (125 happy, 35 surprised), a classifier that labels everything as happy would get 78% accuracy while learning nothing. I do bootstrap aggregation (bagging) to handle this problem, which has the fortunate side effect of helping to reduce overfitting, too.
I also combine this with cross validation so that I'm not constantly overfitting to the same validation set (plus I also get a testing set that I keep in reserve for the very end when I think I'm done with parameter tuning). You may not want all those bells and whistles, but you will definitely want some way of handling the imbalance problem.
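As a rough sketch of the split I'm describing, something like the below works; the 80/10/10 percentages and the variable names are just an example, not something you have to use:

```matlab
% split each label 80% train, then divide the remainder evenly into a
% validation set (used during training) and a test set (kept in reserve
% until parameter tuning is finished)
[trainStore, rest] = splitEachLabel(datastore, 0.80);
[validStore, testStore] = splitEachLabel(rest, 0.50); % 10% / 10% overall
```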
Below is some extra code you can use to perform a single bootstrap sampling on a datastore which will balance out the classes. It works by repeatedly sampling with replacement, weighted according to the class imbalance.
trainStore = shuffle(trainStore); % i forgot to add the shuffle in the answer before
bootstrap_factor = 1; % how big do you want the new, balanced datastore to be, as a multiple of the size of the trainStore
alphabetical_labels = {'happy', 'sad'}; % labels in alphabetical order, to map label names to their indices, if using the foldernames as labels
labels = trainStore.Labels;
labelCounts = countEachLabel(trainStore);
labelCounts = labelCounts.Count;
weights = labelCounts/sum(labelCounts);
weights = weights.^(-1); % invert so rarer classes get larger sampling weights
weightVec = zeros(1, length(labels)); % sampling weight for each file
for lab = 1:length(labels)
for labidx = 1:length(alphabetical_labels)
if labels(lab) == alphabetical_labels{labidx} % categorical-to-char comparison
weightVec(lab) = weights(labidx);
end
end
end
trainFiles = trainStore.Files;
bootstrapSize = round(length(trainFiles) * bootstrap_factor);
Bootstrap = datasample(trainFiles, bootstrapSize, 'Weights', weightVec); % datasample needs the Statistics and Machine Learning Toolbox
bootStrapTrainStore = imageDatastore(Bootstrap, 'LabelSource', 'foldernames', 'IncludeSubfolders', true);
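To actually aggregate, you train one network per bootstrap sample and have them vote. Here is a hypothetical sketch of that step (numModels and the variable names are mine; it reuses layers, options, augmenter, and the other variables from the snippets above):

```matlab
numModels = 5; % number of bagged networks, pick what your hardware allows
nets = cell(1, numModels);
for m = 1:numModels
    % draw a fresh balanced bootstrap sample for each network
    bootFiles = datasample(trainFiles, bootstrapSize, 'Weights', weightVec);
    bootStore = imageDatastore(bootFiles, 'LabelSource', 'foldernames', 'IncludeSubfolders', true);
    bootStoreAug = augmentedImageDatastore(augmentedResolution, bootStore, 'DataAugmentation', augmenter);
    nets{m} = trainNetwork(bootStoreAug, layers, options);
end
% majority vote over the ensemble's predictions on the validation images
validResized = augmentedImageDatastore(augmentedResolution, validStore);
votes = [];
for m = 1:numModels
    votes = [votes, classify(nets{m}, validResized)]; %#ok<AGROW>
end
baggedPrediction = mode(votes, 2); % most common label per image wins
```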


More Answers (0)
