Transfer network from alexnet doesn't learn parameters
I trained a convolutional neural network using a transfer learning approach, with code nearly identical to the transfer learning tutorial:
% Load the pretrained network and the image data
net = alexnet;
Dataset = imageDatastore('[Path redacted]','IncludeSubfolders',true,'LabelSource','foldernames');

% Keep every layer except the final fully connected, softmax, and output layers
layersTransfer = net.Layers(1:end-3);
numClasses = numel(categories(Dataset.Labels));
layers = [ ...
    layersTransfer
    fullyConnectedLayer(numClasses,'WeightLearnRateFactor',20,'BiasLearnRateFactor',20)
    softmaxLayer
    classificationLayer];

options = trainingOptions('sgdm', ...
    'LearnRateSchedule','piecewise', ...
    'LearnRateDropFactor',0.1, ...
    'LearnRateDropPeriod',5, ...
    'MiniBatchSize',200, ...
    'MaxEpochs',20, ...
    'InitialLearnRate',0.01, ...
    'ExecutionEnvironment','parallel');

% 80/20 stratified split, then train
[Train, Test] = splitEachLabel(Dataset,0.8);
tic
netTransfer = trainNetwork(Train,layers,options);
toc
The network trained successfully, plateauing at 55% accuracy. The problem I'm solving is difficult and there is minimal class separation, so I was anticipating some difficulty. While troubleshooting, I noticed that class 1 makes up 55% of my data, matching the plateau. When I reapplied the network to a small test set, it output class 1 only.

I then tried to figure out what features it had learned for each class: running the deepDream example (deepDreamImage) on the trained network returned an array of NaNs for both classes, and the activations in the first convolutional layer came out blank as well. Applying identical code to the original alexnet (substituting net for netTransfer) produced normal deepDream images and activations, which indicates to me that the issue occurred while training the network.
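As a sanity check along those lines, here is a minimal sketch that scans the trained layers directly for bad weights (netTransfer as trained above; only the layers that actually carry a Weights property are inspected):

% Sketch: look for NaN/Inf weights in the trained network, which would
% explain blank activations and NaN deepDream output.
for k = 1:numel(netTransfer.Layers)
    L = netTransfer.Layers(k);
    if isprop(L,'Weights') && ~isempty(L.Weights)
        fprintf('%-12s anyNaN=%d maxAbs=%g\n', L.Name, ...
            any(isnan(L.Weights(:))), max(abs(L.Weights(:))));
    end
end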
I'm curious if anyone knows WHY this is happening. I would imagine that even with minimal class separation the network would still fit the data, and I was anticipating deepDream to produce similar images for each class, not NaNs. I'm fairly new to training these networks, so I'm unsure how to troubleshoot further.
Answers (1)
Greg Heath
on 1 Aug 2017
If it is a simple matter of unbalanced class sizes:
1. Repeat enough of the smaller class's data points to make the classes equally sized, then add a SMALL amount of random noise to the simulated points.
2. An alternate approach is to place simulated data points randomly between two original data points of the same class (ideas 1 and 2 are sketched after this list).
3. I find this a lot easier than using prior probabilities and classification cost matrices as explained in classification textbooks and in at least one of my NEWSREADER tutorial posts.
4. However, if the smaller design classes naturally occur less often, prior probabilities and classification costs can be applied to the estimated output probabilities (see the second sketch below).
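A minimal sketch of ideas 1 and 2 for a numeric feature matrix; X, y, and the class labels here are hypothetical, and for an imageDatastore the same idea applies to the file list rather than feature rows:

% Sketch of ideas 1 and 2, assuming feature rows X (N-by-D) and labels y
% where class 2 is the smaller class (hypothetical names).
minority = X(y==2,:);
need = sum(y==1) - sum(y==2);            % extra points needed to balance

% Idea 1: duplicate minority points, then add a SMALL amount of noise
idx   = randi(size(minority,1),need,1);
noise = 0.01*std(minority).*randn(need,size(minority,2));
Xdup  = minority(idx,:) + noise;

% Idea 2: interpolate between two random points of the same class
i = randi(size(minority,1),need,1);
j = randi(size(minority,1),need,1);
t = rand(need,1);
Xint = minority(i,:) + t.*(minority(j,:) - minority(i,:));

% Balanced design set (using idea 1 here)
Xbal = [X; Xdup];
ybal = [y; repmat(2,need,1)];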
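And a sketch of idea 4: re-weight the network's estimated posteriors by the ratio of the natural priors to the design-set priors. Variable names follow the question's code; the prior values are hypothetical:

% Sketch of idea 4: adjust estimated posteriors for the natural priors.
scores   = predict(netTransfer,Test);    % posteriors from the design net
priorTrn = [0.5 0.5];                    % priors in the balanced design set
priorTru = [0.55 0.45];                  % natural (deployment) priors
adj = scores .* (priorTru./priorTrn);    % re-weight each class column
adj = adj ./ sum(adj,2);                 % renormalize rows to sum to 1
[~,pred] = max(adj,[],2);                % adjusted class predictions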
Hope this helps
Thank you for formally accepting my answer
Greg