Clear Filters
Clear Filters

wrong values in histogram plotting

15 views (last 30 days)
Hello,
I'm trying to plot a histogram of an array. I have a csv file with a list of double values, and I want to see how many elements have a value that is less or equal to 10% of the maximal value, 20%, 30% and etc.. I tried using the following code, but I get wrong statistics, when I check how many elements have a lesser or equal value to 10% of the maximal element, I see that there are 11173940 such elements. I did so by using the following code:
maxElement = max(array);
elementCount = sum(array < maxElement * 0.1);
when I print the histogram it shows like there are less than 180 elements that constitute this condition. this is the code I used (I have a lot of csv files that I want to read and analyze in the same manner, that's why the filename loop):
clear; clc;
dataDir = 'hist_res_rel';
fileList = dir(strcat(dataDir, '/*.csv'));
plotDir = 'plot_dir_rel';
for i = 1:numel(fileList)
fileName = fileList(i).name;
epoch = fileName(length(fileName)-5:length(fileName)-4);
if contains(fileName,'a_rel')
plot_title = strcat('A Realtive Value Change Between Epochs: ', epoch, '-', num2str(str2double(epoch)+10));
end
if contains(fileName,'b_rel')
plot_title = strcat('B Realtive Value Change Between Epochs: ', epoch, '-', num2str(str2double(epoch)+10));
end
rel_val = readmatrix(strcat(dataDir, fileName));
rel_val = abs(rel_val);
Max = max(rel_val);
p = 0.1;
x = zeros(10, 1);
y = zeros(10, 1);
for index = 1:10
percentage = Max * p;
x(index) = percentage;
if index == 1
y(index) = sum(rel_val <= x(index));
else
y(index) = sum(rel_val <= x(index) & rel_val > x(index-1));
end
p = p + 0.1;
end
f = histogram(rel_val, x);
xticks(x);
title(plot_title);
xlabel('Percantage of Relative Change');
ylabel('Amount of Parameters');
xticklabels({'0', '10','20','30', '40', '50', '60', '70', '80', '90', '100'});
saveas(f, strcat(plotDir, '/plot_', fileName(1:length(fileName)-3), '.jpg'));
end
this is the histogram that I get:
and this is the csv file that I'm trying to analyze just to make sure everything works (sorry, it's so large I had to use an external site for the upload):
Thank you so much for your time and attention, I appreciate your help.

Accepted Answer

Ganesh
Ganesh on 27 Dec 2023
I understand that your histogram is inconsistent with the data you have. The issue you are facing can be easily resolved by adding 0 at the start of the variable "x".
When using a histogram, the histogram calculates the number of data points between edges. As your variable "x" begins with Max*0.1, the histogram plots interval between Max*0.1 and Max*0.2 and so on. By adding 0 at the start you can make the first edge to be 0, Max*0.1, which will give you the right result.
x = [0;x] % Add this line before plotting the histogram
Kindly refer to the following document for more information and examples on using the "histogram()" function:
Hope this helps!
  1 Comment
Elinor Ginzburg
Elinor Ginzburg on 27 Dec 2023
Thank you so much! yeah that was the problem, I don't use Matlab so often, so I'm a bit rusty. Thanks again!

Sign in to comment.

More Answers (0)

Products


Release

R2022a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!