Help on Making an Accurate Confusion Matrix.

I am submitting the code below for a confusion matrix that I have made. (Okay, with some help from another.) When I run the code with my deep learning program, I seem to get more misses than hits. That is, I have a lot of entries for "U," which stands for "unknown," in other words a "miss." I have 154 data files from which the computer is to learn to distinguish between Object A and Object B. What am I doing wrong? I do not think that the data is incorrect or needs to be massaged more. I think the error is in the confusion matrix that I have made. Help! I am confused about the confusion matrix. Can you have a look at the code and possibly help? Many thanks!
%% Section 12: Create Confusion Matrix Chart
% From the MathWorks Help Center web-site under "confusionchart."
% Description
% confusionchart(trueLabels,predictedLabels) creates a confusion matrix chart from true labels trueLabels and
% predicted labels predictedLabels and returns a ConfusionMatrixChart object.
% The rows of the confusion matrix correspond to the true class and the columns correspond to the predicted class.
% Diagonal and off-diagonal cells correspond to correctly and incorrectly classified observations, respectively.
% Use cm to modify the confusion matrix chart after it is created.
% For a list of properties, see ConfusionMatrixChart Properties.
% Create Confusion Matrix Chart.
% Load a sample of predicted and true labels for a classification problem.
% trueLabels is the true labels for an image classification problem and
% predictedLabels is the predictions of a convolutional neural network.
trueLabels = dsTest.UnderlyingDatastores{1,2}.LabelData(:,2);
newTrueLabels = {};
for idx = 1:numel(trueLabels)
    if trueLabels{idx} == 'Van'
        newTrueLabels{idx} = 'V';
    elseif trueLabels{idx} == 'Man'
        newTrueLabels{idx} = 'M';
    else
        newTrueLabels{idx} = 'U'; % U means 'Unknown'. That is, it is a miss and neither 'Van' nor 'Man'.
    end
end
predictedLabels = results.Labels;
newPredictedLabels = {};
for idx = 1:numel(predictedLabels)
    if predictedLabels{idx} == 'Van'
        newPredictedLabels{idx} = 'V';
    elseif predictedLabels{idx} == 'Man'
        newPredictedLabels{idx} = 'M';
    else
        newPredictedLabels{idx} = 'U'; % U means 'Unknown'. That is, it is a miss and neither 'Van' nor 'Man'.
    end
end
% Create a confusion matrix chart.
figure(1)
cm = confusionchart(newTrueLabels,newPredictedLabels);
% Modify the appearance and behavior of the confusion matrix chart by changing property values.
% Add column and row summaries and a title.
% A column-normalized column summary displays the number of correctly and incorrectly classified observations
% for each predicted class as percentages of the number of observations of the corresponding predicted class.
% A row-normalized row summary displays the number of correctly and incorrectly classified observations
% for each true class as percentages of the number of observations of the corresponding true class.
cm.ColumnSummary = 'column-normalized';
cm.RowSummary = 'row-normalized';
cm.Title = 'Confusion Matrix for Man and Van Data';
disp("End of Section 12");
disp(" ");

Accepted Answer

ProblemSolver on 28 Jun 2023
Based on the provided code, the issue appears to lie in how the 'Unknown' class ('U') gets assigned. The code compares labels with the equality operator (`==`). On character vectors, `==` compares element by element: it returns a logical array rather than a single true/false value, and it throws an error when the two vectors differ in length. For string comparison in MATLAB, use the `strcmp` function instead. `strcmp` compares two strings and returns a single logical value indicating whether they are equal.
To fix the issue, modify the code where you compare the labels as follows:
if strcmp(trueLabels{idx}, 'Van')
    newTrueLabels{idx} = 'V';
elseif strcmp(trueLabels{idx}, 'Man')
    newTrueLabels{idx} = 'M';
else
    newTrueLabels{idx} = 'U'; % U means 'Unknown'. That is, it is a miss and neither 'Van' nor 'Man'.
end
And similarly for the `predictedLabels` comparison:
if strcmp(predictedLabels{idx}, 'Van')
    newPredictedLabels{idx} = 'V';
elseif strcmp(predictedLabels{idx}, 'Man')
    newPredictedLabels{idx} = 'M';
else
    newPredictedLabels{idx} = 'U'; % U means 'Unknown'. That is, it is a miss and neither 'Van' nor 'Man'.
end
By using `strcmp` instead of the equality operator, you ensure that the comparison between strings is done correctly, which should resolve the issue of having more 'Unknown' ('U') entries in your confusion matrix.
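As a rough sketch of the difference between `==` and `strcmp`, and of how the same mapping could be written without a loop: the sample labels below are made up for illustration and assume the labels are character vectors stored in a cell array. This is not your actual data.

```matlab
% Hypothetical sample labels standing in for trueLabels or predictedLabels.
labels = {'Van', 'Man', 'Truck', 'Van'};

% == on two char vectors of equal length is element-wise and
% returns a logical vector, not a single true/false value.
elementwise = ('Van' == 'Man');

% strcmp returns one logical per comparison, whatever the lengths.
isVan = strcmp(labels, 'Van');
isMan = strcmp(labels, 'Man');

% Build the short codes without a loop: default everything to 'U',
% then overwrite the entries that matched 'Van' or 'Man'.
shortLabels = repmat({'U'}, size(labels));
shortLabels(isVan) = {'V'};
shortLabels(isMan) = {'M'};

disp(shortLabels)
```

With these made-up labels, only 'Truck' ends up as 'U'. If your real labels are not plain character vectors, the comparison may need to be adapted.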
  7 Comments
Bradley Evans on 29 Jun 2023
Hello, ProblemSolver,
I will send you thirty of my 154 pieces of data. They are of just two moving images in a video sequence. The first moving image is given by four data points representing a bounding box in MATLAB. If there are eight data points, the two sets of four are separated by a semicolon, and the second set represents the bounding box of a second moving image. Both images move through a series of video clips that I have, which make up a coarse "movie." What am I doing wrong?
If there is only one image, should I put the data points ";0, 0, 0, 0" behind it? But there is no second image in that case, so why have any data points behind the first at all? These points are processed with a YOLO v2 deep learning subroutine in MATLAB, and the outputs are then fed into the "Confusion Chart." (I called it "Confusion Matrix" previously, but I do think that "Confusion Chart" is more accurate.) Can you help with this?
Behind this data are thirty bitmap (.BMP) images, for which the bounding box data were made. Below are thirty of the 154 bounding box data points that I have made. If you need more information, please do let me know. Maybe we are both getting somewhere on this. Thanks for your help thus far.
T011 = table({[596 212 44 65]}, {Classes(1)});
T012 = table({[459.5000 213 181 77]}, {Classes(1)});
T013 = table({[213 211 181 75]}, {Classes(1)});
T014 = table({[1 207 178 75]}, {Classes(1)});
T015 = table({[2 205 37 71]}, {Classes(1)});
T016 = table({[2 202 85 67]}, {Classes(1)});
T017 = table({[83 203 157 68]}, {Classes(1)});
T018 = table({[274 207 156 68]}, {Classes(1)});
T019 = table({[431 209 154 70]}, {Classes(1)});
T020 = table({[511 210 103 74]}, {Classes(1)});
T021 = table({[489 210 101 74]}, {Classes(1)});
T022 = table({[468 211 121 74]}, {Classes(1)});
T023 = table({[479 212 109 72;457 219 28 62]}, {Classes});
T024 = table({[479 210 109 75;438 219 33 62]}, {Classes});
T025 = table({[479 212 110 73;435 216 28 67]}, {Classes});
T026 = table({[483 210 104 74;453 218 35 67]}, {Classes});
T027 = table({[480 210 108 74;507 218 24 66]}, {Classes});
T028 = table({[478 210 109 73;554 220 33 65]}, {Classes});
T029 = table({[478 212 109 72;599 220 32 66]}, {Classes});
T030 = table({[478 211 110 74;609 220 28 64]}, {Classes});
T031 = table({[478 211 110 73;588 218 25 64]}, {Classes});
T032 = table({[478 210 111 75;603 220 28 62]}, {Classes});
T033 = table({[480 212 109 72;600 221 26 65]}, {Classes});
T034 = table({[478 211 107 74;566 221 28 65]}, {Classes});
T035 = table({[479 211 110 73;516 221 29 62]}, {Classes});
T036 = table({[485 212 103 74;461 218 32 68]}, {Classes});
T037 = table({[480 212 108 73;428 219 29 64]}, {Classes});
T038 = table({[478 211 109 73;435 218 28 64]}, {Classes});
T039 = table({[480 211 107 74;474 221 15 63]}, {Classes});
T040 = table({[471 211 118 73]}, {Classes(1)});
Thanks for reviewing/looking at this!
:)
ProblemSolver on 5 Jul 2023
Correct me if my understanding is off:
  1. You are providing the data points stored in a series of tables named T011, ...
  2. Each table contains the bounding box data points, which are represented through {[]}.
  3. The second column appears to reference the Classes variable. Does this variable define a class, or the label associated with each bounding box? I need some clarification on the definition of the Classes variable.
  4. If you have only one image, there is no need to add "0, 0, 0, 0" behind it unless there is a specific reason you want to do so.
  5. If there is no second image or bounding box, you can simply omit the data points for the second image; you don't need to create a placeholder.
  6. Since I haven't really worked with the YOLO v2 deep learning subroutine, I cannot comment on it unless I have details such as how you want your Confusion Matrix (or Confusion Chart) to be structured. I also need to know how these images are stored or processed, as that is unclear.
If possible, show a sample illustration of exactly what you are looking for, so I can better help you.
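To illustrate points 4 and 5, here is a minimal sketch of a one-box row and a two-box row built the same way as the sample tables. The definition of `Classes` as a two-element categorical is an assumption, since it is not shown in this thread. The column names Var1 and Var2 are the defaults that `table` assigns when no variable names are given.

```matlab
% Assumed definition of Classes; the real one is not shown in the thread.
Classes = categorical({'Van'; 'Man'});

% One object in the frame: one bounding box, one label, no placeholder.
T_single = table({[596 212 44 65]}, {Classes(1)});

% Two objects in the frame: two bounding boxes (one per row), two labels.
T_double = table({[479 212 109 72; 457 219 28 62]}, {Classes});

% Stack the rows and check that each row has as many labels as boxes.
allRows = [T_single; T_double];
boxesPerRow  = cellfun(@(b) size(b, 1), allRows.Var1);
labelsPerRow = cellfun(@numel, allRows.Var2);

assert(isequal(boxesPerRow, labelsPerRow), ...
    'Each row needs one label per bounding box.');
```

The check at the end is only a sanity test on the ground-truth tables. It says nothing about how the detector or the confusion chart behaves.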


More Answers (2)

Bradley Evans on 6 Jul 2023
Thank you for your reply, yet again. Let me answer your questions as outlined above.
1) There are 154 tables named T001 ... to ... T154. I just gave you a small sample.
2) Yes. There are 154 images to go along with these data points. These images of a "Van" and "Man" are accessed in the program, also.
3) It's a class, I do believe, as this is a multi-class detection program. There are only two classes: 1) Van, and 2) Man, to keep things simple. "Classes" only represents either a "Van" or a "Man."
4) Okay, thanks. Yes, it is redundant information to add the "0, 0, 0, 0." Got it.
5) Right. Thanks. Got it. No placeholder. Silly me.
6) The program that I am running is very close to this:
Let's just say that the program that I am running was "adapted" from this MathWorks model.
I can send you the whole program via e-mail to look at, if you would like. I have spent a lot of time on this, but as the Confusion Chart above shows, there are 31 "misses" for our "test set." Yuck! We do not seem to be having much luck with all of this. Is it our program or our data? We tried to run the program with 93 data points, but we obtained similar results. We don't know what to do, other than to say "Something is wrong." Now, to find out exactly what is wrong. Is it the data, the program, the Confusion Chart? We are "up a creek without a paddle," as the old saying goes.
We want our output to look like how a Confusion Chart should. As it is, it appears that none of our data coincides. This is what a Confusion Chart is supposed to show: the correct predictions made by the computer. Argh! Help!
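One way to narrow down whether the "U" entries come from the data, the detector, or the label-mapping loop in the question is to count how many labels each test image actually received. The sketch below uses a made-up stand-in for `results.Labels`. It assumes each cell holds a categorical array that may be empty (no detection) or contain several labels; whether the real output looks like this is not confirmed anywhere in this thread. An `if` condition in MATLAB is true only when the result is non-empty and all of its elements are true, so empty or mixed entries would fall through to the 'U' branch.

```matlab
% Hypothetical stand-in for results.Labels: one cell per test image.
predictedLabels = {categorical({'Van'}); ...
                   categorical({'Van'; 'Man'}); ...
                   categorical({}); ...
                   categorical({'Man'})};

% Number of labels returned for each image.
labelsPerImage = cellfun(@numel, predictedLabels);

numEmpty    = sum(labelsPerImage == 0);  % images with no label at all
numMultiple = sum(labelsPerImage > 1);   % images with several labels
numSingle   = sum(labelsPerImage == 1);  % images with exactly one label

fprintf('%d empty, %d with several labels, %d with exactly one\n', ...
        numEmpty, numMultiple, numSingle);
```

If the counts of empty or multi-label entries are large on the real test set, the "misses" would say more about the detections than about the chart itself.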
If you would like to take this conversation off-line, and it might be preferable this way, perhaps I could send you my program. If you would like to do this, then please e-mail me at <bradleyevans3@gmail.com>. I guess the appropriate word about all of this is "trust." I can't think of a better way, except through this forum, which is public, long, and tedious. A personal e-mail connection, I think, would help. I also live in Albuquerque, New Mexico, in case you would like to meet sometime.
Help! Thanks! I appreciate all of your efforts, Mr ProblemSolver.
BE
:)

Bradley Evans on 7 Jul 2023
Edited: Bradley Evans on 7 Jul 2023
Hello Mr ProblemSolver,
I am sending you one image of the 154 images that we are working with. It is a screenshot.
Deep learning is a confusing topic. I wish I could provide you with some more information, but that's about as much as I can give you for now, save to send you the whole program with the necessary files. I have given you an e-mail address above to contact me, if you would like to go in that direction. Sorry if this is a bit forward, but what can we do? It's too much talk over a public venue, so I would rather continue off-line with this venture. Maybe we can post something if we can figure something out.
It's hard to say if anyone knows anything about deep learning. If it works, fine; if it doesn't work, as in this case, then we are all left in a quandary. I will send you an image file, so you have a better understanding of the files that I am looking at. Besides the MathWorks web-site I have given you as a lead, that's about it. Otherwise, we have fine-tuned our program a little bit differently than the MathWorks link above, but we are not getting good results. Hmmm ... What to do next, I do not know, save to ask you. That's about the long and short of it. Thanks.
Please reply if you still do have the time and interest. The above e-mail works. Don't worry; this material is not classified or anything like that, I have been told.
Thank you, Mr ProblemSolver.
BE
:)


Release

R2023a
