How to plot a ROC curve?

I have a dataset which I classified using 10 different thresholds. Then I evaluated the true and false positive rates (TPR, FPR) to generate a ROC curve. However, the curve looks strange. Did I evaluate the curve correctly? Below is the code I used to generate the ROC curve.
TPR=[0.214091009346534 0.231387608987612 0.265932891531049 ...
0.324782536928746 0.407704239947213 0.497932979272465 ...
0.566189022386499 0.587833185570207 0.546182718263242 ...
0.434923996561788];
FPR=[0.006017495627892 0.008669605012233 0.013377312018797 ...
0.022621821298088 0.039994426565193 0.069264094928662 ...
0.108694153334795 0.148784394110204 0.178634096117665 ...
0.194756822274831];
plot(FPR,TPR);


I think that the last two values are wrong. You cannot have any point on the right side of the diagonal from (0,0) to (1,1).
Cretu, please explain why you believe the last two points are to the right of the 0-to-1 diagonal.


 Accepted Answer

Thorsten on 25 Nov 2015


I agree that the curves look strange. If you decrease the threshold, you cannot get a smaller true positive rate. The rate can only stay the same or increase. So your two points at the end of the curve are wrong. Also, you should vary your threshold through the full range, from max to 0, such that your curve starts from (0,0) and ends at (1,1).
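The monotonicity argument above can be checked directly on the posted numbers. This is a minimal sketch, assuming the TPR vector is ordered by threshold as in the question; the names isMonotone and firstDrop are illustrative and do not come from the thread.

```matlab
% TPR values as posted in the question, in the order given there
TPR = [0.214091009346534 0.231387608987612 0.265932891531049 ...
       0.324782536928746 0.407704239947213 0.497932979272465 ...
       0.566189022386499 0.587833185570207 0.546182718263242 ...
       0.434923996561788];

% A valid threshold sweep is monotone in one direction,
% depending on whether thresholds go up or down
isMonotone = all(diff(TPR) >= 0) || all(diff(TPR) <= 0);

% Index of the first value that is smaller than its predecessor
firstDrop = find(diff(TPR) < 0, 1) + 1;

fprintf('monotone: %d, first decreasing value at index %d\n', isMonotone, firstDrop);
```

If the check fails, as it does for these values, the way the counts were obtained is worth re-examining before plotting.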


Why do you think that my last two points are wrong? I have other statistics evaluated for this data (Type I and Type II error), and their values look logical (see below). TypeI starts almost at zero and ends at 0.93, so I thought that would be enough to get a nice ROC curve. How should I select my thresholds so that the ROC curve begins at zero and ends at one?
TypeI=[0.001690304992780 0.005603857605008 0.016671577454581 ...
0.045346619920683 0.110468974349514 0.236186829620592 ...
0.427461429256604 0.645969927783621 0.826361530690066 ...
0.932744106170055];
TypeII=[0.929205323099390 0.837527907084626 0.688232020353751 ...
0.503233198019267 0.327662747863992 0.195276741144506 ...
0.111228585773647 0.062940711525258 0.036581507396483 ...
0.022156258145306];
Thorsten on 25 Nov 2015
Edited: Thorsten on 25 Nov 2015
The last two points in TPR are smaller than the third-to-last point. This means that you get fewer TPs for lower thresholds. That's wrong. If N points are a hit at threshold t, they are also a hit at thresholds t - dt and t - 2*dt. So the true positive rate should never decrease as the threshold decreases. But that's not the case in your data. To be more specific, please expand on how you determined TPR and FPR.
You can normalize the response of your operator to the range [0, 1] and then vary the thresholds over [0, 1]. Or, if you don't want to normalize, vary the thresholds over the range [Xmin, Xmax], where Xmin and Xmax are the minimum and maximum of your operator response.
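The normalization and full-range sweep described above might look like the following. This is only a sketch on synthetic stand-in data (R and GT here are made up, not the thread's images), and the -Inf sentinel is an added assumption so that the strict comparison still reaches (1,1).

```matlab
% Synthetic stand-in data (assumption, not the thread's images)
R  = [0.8 2.5 1.1 1.9 0.3 1.4];     % operator response
GT = logical([0 1 0 1 0 1]);        % ground truth, true = positive

% Rescale the response to [0, 1]
Rn = (R - min(R(:))) / (max(R(:)) - min(R(:)));

% Sweep from high to low; -Inf is appended so that the last point counts
% every pixel as positive even with the strict comparison Rn > t
thresholds = [linspace(1, 0, 11), -Inf];

TPR = zeros(size(thresholds));
FPR = zeros(size(thresholds));
for i = 1:numel(thresholds)
    Rt = Rn > thresholds(i);                 % thresholded response
    TPR(i) = nnz(Rt & GT)  / nnz(GT);
    FPR(i) = nnz(Rt & ~GT) / nnz(~GT);
end
```

With this sweep the first point is (0,0) and the last is (1,1); plot(FPR, TPR) then draws the curve. Without normalizing, linspace(max(R(:)), min(R(:)), 11) would play the same role on the raw response.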
Karolina on 25 Nov 2015
Edited: Karolina on 25 Nov 2015
I evaluated TPR and FPR based on these formulas:
TPR=TP/TP+FN
FPR=FP/FP+TN
where TP,FN, FP, and TN are taken from https://en.wikipedia.org/wiki/Confusion_matrix
% T1 = dataset with the lowest threshold
A1=nnz(T1==3); %TP
B1=nnz(T1==4); %FP
C1=nnz(T1==6); %FN
D1=nnz(T1==8); %TN
% import A2-A10, ..., D2-D10
TPRA1=A1/(A1+C1); %True Positive Rate
FPRA1=B1/(B1+D1); %False Positive Rate
% evaluate TPRA2-TPRA10; FPRA2-FPRA10
TPR=[TPRA1 TPRA2 TPRA3 TPRA4 TPRA5 TPRA6 TPRA7 TPRA8 TPRA9 TPRA10];
FPR=[FPRA1 FPRA2 FPRA3 FPRA4 FPRA5 FPRA6 FPRA7 FPRA8 FPRA9 FPRA10];
plot(FPR,TPR);
Thorsten on 25 Nov 2015
Edited: Thorsten on 25 Nov 2015
You missed the parentheses: TPR = TP/(TP + FN); FPR = FP/(FP + TN);
But I see that this error is not in your code.
How are the T1, ..., T10 data generated? Do you have the operator response and the ground truth data?
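To see why the parentheses matter, here is a small worked example. The counts are hypothetical and chosen only to show MATLAB's operator precedence, where division binds tighter than addition.

```matlab
% Hypothetical counts, used only to illustrate operator precedence
TP = 30;
FN = 70;

wrong = TP/TP + FN;      % parsed as (TP/TP) + FN, i.e. 1 + 70
right = TP/(TP + FN);    % the intended true positive rate

fprintf('without parentheses: %g, with parentheses: %g\n', wrong, right);
```

Without parentheses the result is not a rate at all, since it can exceed 1.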
Karolina on 25 Nov 2015
Edited: Karolina on 25 Nov 2015
Yes, I forgot the parentheses, but only in the post. In my code it is fine. My ground truth is data that has been manually classified and verified (reference.tif in the attached zip); the data I am thresholding are in data.tif (this is only a small part of my dataset), and the thresholds are in thresholds.txt.
T1-T10 are computed as follows (I did this in software other than MATLAB):
X=data.tif
Z1= if X is > 0.9 give 3 else 4
Z2= if X is > 1.0 give 3 else 4
Z3= if X is > 1.1 give 3 else 4
Z4= if X is > 1.2 give 3 else 4
Z5= if X is > 1.3 give 3 else 4
Z6= if X is > 1.4 give 3 else 4
Z7= if X is > 1.5 give 3 else 4
Z8= if X is > 1.6 give 3 else 4
Z9= if X is > 1.7 give 3 else 4
Z10= if X is > 1.8 give 3 else 4
Y=reference.tif % reference data have values 1 or 2
T1=Z1*Y
T2=Z2*Y
T3=Z3*Y
T4=Z4*Y
T5=Z5*Y
T6=Z6*Y
T7=Z7*Y
T8=Z8*Y
T9=Z9*Y
T10=Z10*Y
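It may help to spell out what each product code means. The sketch below assumes that Z = 3 marks a pixel above the threshold (predicted positive) and that reference value 1 is the positive class; neither is stated explicitly in the thread, so treat the labels as an assumption to verify.

```matlab
% Assumed coding: Z = 3 above threshold (predicted positive), Z = 4 otherwise;
% Y = 1 positive class, Y = 2 negative class
Z = [3 3 4 4];      % one example of each (Z, Y) combination
Y = [1 2 1 2];
T = Z .* Y;         % product codes

% Meaning of each (Z, Y) pair under the assumption above
labels = {'TP', 'FP', 'FN', 'TN'};
for k = 1:numel(T)
    fprintf('Z=%d, Y=%d -> T=%d (%s)\n', Z(k), Y(k), T(k), labels{k});
end
```

Under this assumption, code 4 is a false negative and code 6 a false positive, which is the reverse of the %FP and %FN comments in the earlier snippet. If the reference instead uses 2 for the positive class, the mapping changes again, so the coding is worth double-checking against the data.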
Here's a way to compute the ROC curve for your data:
% ground truth
GT = imread('../../Downloads/matlab/reference.tif');
GT = GT == 1; % convert to binary image
P = nnz(GT);  % number of positive responses in ground truth
N = nnz(~GT); % number of negative responses in ground truth
% responses
R = imread('../../Downloads/matlab/data.tif');
% your thresholds
thresholds = [0.9 1.0 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8];
% alternatively, use 100 thresholds between min(R) and max(R)
% thresholds = linspace(min(R(:)), max(R(:)));
% pre-allocate for speed
tp = nan(1, length(thresholds));
fp = nan(1, length(thresholds));
for i = 1:numel(thresholds)
    t = thresholds(end-i+1); % thresholds from high to low as i increases
    Rt = R > t;              % thresholded response
    tp(i) = nnz(Rt & GT);    % true positives at this threshold
    fp(i) = nnz(Rt & ~GT);   % false positives at this threshold
end
% convert to rates
TPR = tp/P;
FPR = fp/N;
plot(FPR, TPR) % ROC
Thank you! I have one more question: after applying the script to my whole dataset I get an error, which I think is related to the pixels that have no-data values (-3.4028235e+38). What should I do to exclude these values from the evaluation? The message is:
Error using &
Matrix dimensions must agree.
Thorsten on 25 Nov 2015
Edited: Thorsten on 25 Nov 2015
You have to restrict all computations to the valid indices:
valid_ind = R > -3.40e+38;
P = nnz(GT(valid_ind)); % number of positive responses in ground truth
N = nnz(1-GT(valid_ind));
and in the loop
tp(i) = nnz(Rt(valid_ind) & GT(valid_ind));
fp(i) = nnz(Rt(valid_ind) & ~GT(valid_ind));
The best way to do it is to write a function
function [TPR, FPR] = roc(GT, R, thresholds)
and call this function with
GT(valid_ind), R(valid_ind)
in case you have to exclude some pixels from the analysis.
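Only the signature of that function is given above. One possible body, adapted from the loop posted earlier in the thread, is sketched here. Sorting the thresholds in descending order inside the function is an added assumption, and the inputs are expected to have the same number of elements.

```matlab
function [TPR, FPR] = roc(GT, R, thresholds)
% ROC  Sketch of the helper suggested above (the body is an assumption).
%   GT          logical array, true where the ground truth is positive
%   R           numeric responses, same number of elements as GT
%   thresholds  vector of thresholds, swept from high to low
%
%   Example call that excludes no-data pixels:
%     valid_ind = R > -3.40e+38;
%     [TPR, FPR] = roc(GT(valid_ind), R(valid_ind), thresholds);
    GT = logical(GT(:));
    R  = double(R(:));
    P  = nnz(GT);     % number of ground-truth positives
    N  = nnz(~GT);    % number of ground-truth negatives
    thresholds = sort(thresholds(:).', 'descend');
    TPR = nan(1, numel(thresholds));
    FPR = nan(1, numel(thresholds));
    for i = 1:numel(thresholds)
        Rt = R > thresholds(i);         % predicted positive at this threshold
        TPR(i) = nnz(Rt & GT)  / P;
        FPR(i) = nnz(Rt & ~GT) / N;
    end
end
```

Because logical indexing with valid_ind returns column vectors, the function flattens its inputs and does not depend on the image shape.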
Natsu dragon on 3 Feb 2018
Edited: Natsu dragon on 3 Feb 2018
Hello, I used the same code with your attached data and got results, but when I used it with my own results I got nothing; it didn't plot anything. Can you help me understand why?


Asked: 25 Nov 2015
Edited: 3 Feb 2018
