Faster R-CNN detector does not draw boxes with my own dataset

Hello.
I trained a Faster R-CNN model based on ResNet-50 on spectrograms of the UrbanSound8K audio files (10 classes). When I ran the detector on a test set, the detectionResults table came back almost empty, with only a few non-empty rows, so detection effectively does not work.
I have attached the detectionResults table and the [ap,recall,precision] values obtained from the test set.
The non-empty rows contain only detections of the 'jackhammer' class (the 8th class). Accordingly, the ap array in the attached .mat file is [0;0;0;0;0;0;0;0.091;0;0].
Before that, I trained a VGG16-based Faster R-CNN, and its results table was completely empty, unlike the ResNet one.
Here's my code. I took the network code from the MATLAB demo on creating a Faster R-CNN and adjusted the variables and parameters for my purposes:
clear
clc
load labeled_greyscale_datastore.mat
% Load a pretrained ResNet-50.
net = resnet50;
lgraph = layerGraph(net);
% Remove the last 3 layers.
layersToRemove = {
'fc1000'
'fc1000_softmax'
'ClassificationLayer_fc1000'
};
lgraph = removeLayers(lgraph, layersToRemove);
% Specify the number of classes the network should classify.
numClasses = 10;
numClassesPlusBackground = numClasses + 1;
% Define new classification layers.
newLayers = [
fullyConnectedLayer(numClassesPlusBackground, 'Name', 'rcnnFC')
softmaxLayer('Name', 'rcnnSoftmax')
classificationLayer('Name', 'rcnnClassification')
];
% Add new object classification layers.
lgraph = addLayers(lgraph, newLayers);
% Connect the new layers to the network.
lgraph = connectLayers(lgraph, 'avg_pool', 'rcnnFC');
% Define the number of outputs of the fully connected layer.
numOutputs = 4 * numClasses;
% Create the box regression layers.
boxRegressionLayers = [
fullyConnectedLayer(numOutputs,'Name','rcnnBoxFC')
rcnnBoxRegressionLayer('Name','rcnnBoxDeltas')
];
% Add the layers to the network.
lgraph = addLayers(lgraph, boxRegressionLayers);
% Connect the regression layers to the layer named 'avg_pool'.
lgraph = connectLayers(lgraph,'avg_pool','rcnnBoxFC');
% Select a feature extraction layer.
featureExtractionLayer = 'activation_40_relu';
% Disconnect the layers attached to the selected feature extraction layer.
lgraph = disconnectLayers(lgraph, featureExtractionLayer,'res5a_branch2a');
lgraph = disconnectLayers(lgraph, featureExtractionLayer,'res5a_branch1');
% Add ROI max pooling layer.
outputSize = [14 14];
roiPool = roiMaxPooling2dLayer(outputSize,'Name','roiPool');
lgraph = addLayers(lgraph, roiPool);
% Connect feature extraction layer to ROI max pooling layer.
lgraph = connectLayers(lgraph, featureExtractionLayer,'roiPool/in');
% Connect the output of ROI max pool to the disconnected layers from above.
lgraph = connectLayers(lgraph, 'roiPool','res5a_branch2a');
lgraph = connectLayers(lgraph, 'roiPool','res5a_branch1');
% Define anchor boxes.
anchorBoxes = [
55 55
55 112
111 112
111 55
112 112
112 111
112 55
111 111
55 111
];
% Create the region proposal layer.
proposalLayer = regionProposalLayer(anchorBoxes,'Name','regionProposal');
lgraph = addLayers(lgraph, proposalLayer);
% Number of anchor boxes.
numAnchors = size(anchorBoxes,1);
% Number of feature maps coming out of the feature extraction layer.
numFilters = 1024;
rpnLayers = [
convolution2dLayer(3, numFilters,'padding',[1 1],'Name','rpnConv3x3')
reluLayer('Name','rpnRelu')
];
lgraph = addLayers(lgraph, rpnLayers);
% Connect the RPN to the feature extraction layer.
lgraph = connectLayers(lgraph, featureExtractionLayer, 'rpnConv3x3');
% Add RPN classification layers.
rpnClsLayers = [
convolution2dLayer(1, numAnchors*2,'Name', 'rpnConv1x1ClsScores')
rpnSoftmaxLayer('Name', 'rpnSoftmax')
rpnClassificationLayer('Name','rpnClassification')
];
lgraph = addLayers(lgraph, rpnClsLayers);
% Connect the classification layers to the RPN network.
lgraph = connectLayers(lgraph, 'rpnRelu', 'rpnConv1x1ClsScores');
% Add RPN regression layers.
rpnRegLayers = [
convolution2dLayer(1, numAnchors*4, 'Name', 'rpnConv1x1BoxDeltas')
rcnnBoxRegressionLayer('Name', 'rpnBoxDeltas')
];
lgraph = addLayers(lgraph, rpnRegLayers);
% Connect the regression layers to the RPN network.
lgraph = connectLayers(lgraph, 'rpnRelu', 'rpnConv1x1BoxDeltas');
% Connect region proposal network.
lgraph = connectLayers(lgraph, 'rpnConv1x1ClsScores', 'regionProposal/scores');
lgraph = connectLayers(lgraph, 'rpnConv1x1BoxDeltas', 'regionProposal/boxDeltas');
% Connect region proposal layer to roi pooling.
lgraph = connectLayers(lgraph, 'regionProposal', 'roiPool/roi');
numClasses = 10;
analyzeNetwork(lgraph)
%%
data = read(trainCds);
I = data{1};
bbox = data{2};
annotatedImage = insertShape(I,'Rectangle',bbox);
annotatedImage = imresize(annotatedImage,2);
figure
imshow(annotatedImage)
%%
inputSize = [224 224 3];
%%
preprocessedTrainingData = transform(trainCds, @(data)preprocessData(data,inputSize));
augmentedTrainingData = transform(trainCds,@augmentData);
augmentedData = cell(4,1);
for k = 1:4
data = read(augmentedTrainingData);
augmentedData{k} = insertShape(data{1},'Rectangle',data{2});
reset(augmentedTrainingData);
end
figure
montage(augmentedData,'BorderSize',10)
%% transform my datasets
trainingData = transform(augmentedTrainingData,@(data)preprocessData(data,inputSize));
validationData = transform(validationCds,@(data)preprocessData(data,inputSize));
%% show an example
data = read(trainingData);
I = data{1};
bbox = data{2};
label = data{3};
annotatedImage = insertObjectAnnotation(I,'Rectangle',bbox,label);
annotatedImage = imresize(annotatedImage,2);
figure
imshow(annotatedImage)
%% training options
options = trainingOptions('sgdm',...
'InitialLearnRate',1e-3,...
'CheckpointPath',tempdir,...
'MaxEpochs', 7,...
'MiniBatchSize',1);
%%
[detector, info] = trainFasterRCNNObjectDetector(trainingData,lgraph,options, ...
'PositiveOverlapRange', [0.6 1], ...
'NegativeOverlapRange', [0 0.3]);
save('trainedDetector_resnet50','detector','info','lgraph');
%% TESTING WITH AN IMAGE
d = read(testCds);
img = d{1};
[bbox, score, label] = detect(detector,img,'MiniBatchSize',1);
detectedImg = insertObjectAnnotation(img,'Rectangle',bbox, label);
detectedImg = imresize(detectedImg, 2);
figure
imshow(detectedImg)
%% TESTING WITH TEST SET
testData = transform(testCds,@(data)preprocessData(data,inputSize));
detectionResults = detect(detector,testData,'MiniBatchSize',5);
% ATTACHED TABLE
[ap, recall, precision] = evaluateDetectionPrecision(detectionResults,testData);
% I ATTACHED THESE THREE VARIABLES TOO IN THE FILE
%% THIS GRAPH GIVES AN ERROR: Error using plot Not enough input arguments. Error in faster_rcnn_creation (line 195)
plot(recall,precision)
figure
plot(recall,precision)
xlabel('Recall')
ylabel('Precision')
grid on
title(sprintf('Average Precision = %.2f', ap))
Please help me with this!
Thank you
  2 Comments
Brian Hemmat on 3 Feb 2021
Hi Claudio,
What is the 'object' you're trying to detect in the spectrogram? It seems the example you're basing this code off of is about object detection in an image.
Where are you getting the ground truth bounding boxes?
Are you trying to train a system to classify sound as one of the ten labels? If so, then object detection is probably not the way to go. You might take a look at this example: Acoustic Scene Recognition using Late Fusion, or take a look at some of the example code and links here.
If you are trying to detect regions of the spectrograms (specific sounds within the spectrograms), and you have ground truth bounding boxes for training, please attach all the required code and a subset of the files so that we can walk through your code.
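If you do have ground truth boxes, it may be worth sanity-checking them before training. The following is only a sketch, assuming the boxes are stored as [x y width height] in 1-based pixel coordinates; the image size and box values are made up for illustration and are not taken from your dataset:

```matlab
% Hypothetical example data: one 224-by-224 image and three candidate boxes.
imageSize = [224 224];          % [height width]
bboxes = [ 10  20  50  60;      % lies fully inside the image
          200 200  40  40;      % extends past the right and bottom borders
           30  30   0  15];     % zero width

% A box is treated as valid if it has positive width and height
% and lies entirely inside the image.
isValid = bboxes(:,3) > 0 & bboxes(:,4) > 0 & ...
          bboxes(:,1) >= 1 & bboxes(:,2) >= 1 & ...
          bboxes(:,1) + bboxes(:,3) - 1 <= imageSize(2) & ...
          bboxes(:,2) + bboxes(:,4) - 1 <= imageSize(1);

% Report the rows that would need fixing or removing before training.
fprintf('Invalid boxes at rows: %s\n', mat2str(find(~isValid)'));
```

Running the same check on every image in your training datastore would show whether any boxes are empty or out of bounds.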
Claudio Eutizi on 3 Feb 2021
Edited: Claudio Eutizi on 3 Feb 2021
Hello and thank you for the answer.
I wanted to detect regions of the spectrograms. The ground truth consisted of ROIs around the maximum-intensity area of each spectrogram, and I tried to label these areas with the label of the audio source of each spectrogram.
It was an experiment, and it was a failure because the detector was not able to find any box for any label.
So I managed to solve the problem by simply labeling and 'boxing' the whole spectrogram dataset manually.
It took a lot of time, but now (and I stress the word 'now' because it was 10 minutes ago) I finally got some results.
I first tried a Faster R-CNN based on VGG16 and it didn't work, so I tried another one based on ResNet50, and it worked (and works!!).

Answers (1)

Madhav Thakker on 8 Feb 2021
Hi Claudio,
For an object detection network to work, you need a labelled dataset. In your case, you need an annotated spectrogram dataset in which each bounding box captures some distinct property of its class.
There are quite a few examples in MATLAB for doing the same. Once you have part of the dataset annotated, you can even try automated labelling, which requires less human effort.
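Separately, regarding the plot error at the end of your script: for a multiclass detector, evaluateDetectionPrecision returns ap as a vector and recall and precision as cell arrays with one cell per class, which is likely why plot(recall,precision) fails. Here is a minimal sketch of plotting one curve per class. It uses made-up stand-in values (classNames and all numbers below are hypothetical, not your results):

```matlab
% Stand-ins for the outputs of evaluateDetectionPrecision on a
% 3-class detector (hypothetical values, for illustration only).
classNames = {'classA','classB','classC'};
recall     = {[0; 0.5; 1],   [0; 0.25], [0; 1]};
precision  = {[1; 0.8; 0.6], [1; 0.5],  [1; 0.9]};
ap         = [0.70; 0.19; 0.95];

figure
hold on
for k = 1:numel(recall)
    % Index into the cell arrays: plot needs numeric vectors, not cells.
    plot(recall{k}, precision{k}, 'DisplayName', ...
        sprintf('%s (AP = %.2f)', classNames{k}, ap(k)));
end
hold off
xlabel('Recall')
ylabel('Precision')
grid on
legend('show')
% ap is a vector here, so summarize it with the mean in the title.
title(sprintf('Mean AP = %.2f', mean(ap)))
```

With your own results, you would drop the stand-in values and loop over the recall, precision, and ap variables you already have.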
Hope this helps.

Release

R2020b
