How to plot a smooth curve for Empirical probability density ?

20 views (last 30 days)
Hi everyone,
I require to plot an empirical probability density curve of a data set (184 rows, 59 columns). Initially, I pick one column to plot the results but, the EPD curve looks so wired. May someone suggest to me how can I get desired result (Attached). Data is also attached for reference.
clear all
clc
X = load('R_0.01T.csv'); % input data
SzX = size(X) % matrix size
r=[X(1,:)]; % first row of the data set
ra = r(~isnan(r)); % remove all nan enteries
[f,x,flo,fhi] = ecdf(ra);
WinLen = 5
dfdxs = smoothdata(gradient(f)./gradient(x), 'movmedian',WinLen); % ... Then, Smooth Them Again ...
aaa = smooth(dfdxs);
plot(x, aaa)
This is what i get
Here is the expected results.
  2 Comments
VBBV
VBBV on 29 Mar 2022
Change the line
WinLen = 35%
Modify this value and you can observe the difference
Andi
Andi on 29 Mar 2022
Still not expected, I have tried with different values but no significant change.

Sign in to comment.

Accepted Answer

Mathieu NOE
Mathieu NOE on 29 Mar 2022
hello
tried a few options , smoothdata and spline fit. In fact, I was thinking you wanted something super smooth so I first opted for the spline fit. But I saw that the expected plot was supposed to have some waviness , so I changed to smoothdata
NB as I don't have the toolbox for ecdf , I found an alternative on FEX for it https://fr.mathworks.com/matlabcentral/fileexchange/32831-homemade-ecdf
but I had to correct a bug (the corrected function is attaced fyi)
hope the code below can help you
clear all
clc
X = load('R_0.01T.csv'); % input data
SzX = size(X); % matrix size
r=X(1,:); % first row of the data set
ra = r(~isnan(r)); % remove all nan enteries
% [f,x,flo,fhi] = ecdf(ra);
[f,x] = homemade_ecdf(ra); % FEX : https://fr.mathworks.com/matlabcentral/fileexchange/32831-homemade-ecdf
% remove duplicates
[x,ia,ic] = unique(x);
f = f(ia);
% resample the data (interpolation) on linearly spaced points
xx = linspace(min(x),max(x),100);
ff = interp1(x,f,xx);
% smoothing
ffs = smoothdata(ff, 'loess',40);
% or spline fit ?
% Breaks interpolated from data (log x scale to get more points in the
% lower x range and less in the upper x range)
breaks = logspace(log10(min(x)),log10(max(x)),5); %
p = splinefit(xx,ff,breaks); %
ff2 = ppval(p,xx);
figure(1),
plot(x,f,'*',xx,ffs,'-',xx,ff2,'-')
legend('raw','smoothed','spline fit');
% compute gradient from spline fitted data (my choice)
dx = mean(diff(xx));
dfdxs = gradient(ffs)./dx;
figure(2),plot(xx, dfdxs)
  14 Comments
Andi
Andi on 31 Mar 2022
Edited: Andi on 31 Mar 2022
Thank a lot. I think now i very near to the expected results. I have modified your script a bit to actual condition and plot the results. But woundering, why i unable to set color bar as jet, secondly the tick bar are not required only few ont power termed be fine.
clear all
clc
X = load('PDS_case.csv'); % input data
C_sort = sortrows(X,5);
Tri = C_sort(1:86,:);
data1 = sortrows(Tri,7);
PDS=(data1(:,7))';
data2=data1(:,8:66);
data4=data2';
Y=data4(:,1:86);
X=Y';
UU=[2e-8, 4e-8, 6e-8, 8e-8, 1e-7, 3e-7]; % these values linked with each color bar, should be make a color bar
r{1}=[X(1:60,:)];
r{2}=[X(61:68,:)];
r{3}=[X(69:76,:)];
r{4}=[X(77:79,:)];
r{5}=[X(80:81,:)];
r{6}=[X(82:85,:)];
figure
cm = colormap(jet(numel(r)));
cc = colorbar('vert');
set(cc,'Ticks',((1:numel(UU))/numel(UU))');
set(cc,'TickLabels',num2str(UU'));
ylabel(cc,'PDS')
hold on
for k = 1:numel(r) % ... Then, Remove The 'NaN' Elements ...
ra = r{k};
ra = ra(~isnan(ra));
[f ,x ] = ecdf(ra);
[x ,ia ,ic ] = unique(x);
f = f(ia );
xx = linspace(min(x ),max(x ),100);
ff = interp1(x ,f ,xx );
ffs = smoothdata(ff , 'loess',10);
dx = mean(diff(xx ));
dfdxs = gradient(ffs )./dx ;
plot(xx , dfdxs , '-', 'linewidth', 1, 'Color',cm(k,:))
ylim([0, 8])
end
hold off
grid
Here is what i get
and the expected one should be like
Mathieu NOE
Mathieu NOE on 1 Apr 2022
hello again
1/ seems the expected curves are smoother compare to what we have now. You can probably increase the smoothing (driven by the value in smoothdata parameter)
ffs = smoothdata(ff , 'loess',10); % increase 10 =>20
2/ scale of the colorbar (ticks) : I don't know what are the values you specify in UU coming from ?

Sign in to comment.

More Answers (0)

Categories

Find more on Preprocessing Data in Help Center and File Exchange

Tags

Products

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!