# How to plot a smooth curve for Empirical probability density ?

5 views (last 30 days)
Adnan Barkat on 29 Mar 2022
Commented: Mathieu NOE on 1 Apr 2022
Hi everyone,
I require to plot an empirical probability density curve of a data set (184 rows, 59 columns). Initially, I pick one column to plot the results but, the EPD curve looks so wired. May someone suggest to me how can I get desired result (Attached). Data is also attached for reference.
clear all
clc
X = load('R_0.01T.csv'); % input data
SzX = size(X) % matrix size
r=[X(1,:)]; % first row of the data set
ra = r(~isnan(r)); % remove all nan enteries
[f,x,flo,fhi] = ecdf(ra);
WinLen = 5
dfdxs = smoothdata(gradient(f)./gradient(x), 'movmedian',WinLen); % ... Then, Smooth Them Again ...
aaa = smooth(dfdxs);
plot(x, aaa)
This is what i get
Here is the expected results.
##### 2 CommentsShowHide 1 older comment
Adnan Barkat on 29 Mar 2022
Still not expected, I have tried with different values but no significant change.

Mathieu NOE on 29 Mar 2022
hello
tried a few options , smoothdata and spline fit. In fact, I was thinking you wanted something super smooth so I first opted for the spline fit. But I saw that the expected plot was supposed to have some waviness , so I changed to smoothdata
NB as I don't have the toolbox for ecdf , I found an alternative on FEX for it https://fr.mathworks.com/matlabcentral/fileexchange/32831-homemade-ecdf
but I had to correct a bug (the corrected function is attaced fyi)
hope the code below can help you
clear all
clc
X = load('R_0.01T.csv'); % input data
SzX = size(X); % matrix size
r=X(1,:); % first row of the data set
ra = r(~isnan(r)); % remove all nan enteries
% [f,x,flo,fhi] = ecdf(ra);
[f,x] = homemade_ecdf(ra); % FEX : https://fr.mathworks.com/matlabcentral/fileexchange/32831-homemade-ecdf
% remove duplicates
[x,ia,ic] = unique(x);
f = f(ia);
% resample the data (interpolation) on linearly spaced points
xx = linspace(min(x),max(x),100);
ff = interp1(x,f,xx);
% smoothing
ffs = smoothdata(ff, 'loess',40);
% or spline fit ?
% Breaks interpolated from data (log x scale to get more points in the
% lower x range and less in the upper x range)
breaks = logspace(log10(min(x)),log10(max(x)),5); %
p = splinefit(xx,ff,breaks); %
ff2 = ppval(p,xx);
figure(1),
plot(x,f,'*',xx,ffs,'-',xx,ff2,'-')
legend('raw','smoothed','spline fit');
% compute gradient from spline fitted data (my choice)
dx = mean(diff(xx));
figure(2),plot(xx, dfdxs)
Mathieu NOE on 1 Apr 2022
hello again
1/ seems the expected curves are smoother compare to what we have now. You can probably increase the smoothing (driven by the value in smoothdata parameter)
ffs = smoothdata(ff , 'loess',10); % increase 10 =>20
2/ scale of the colorbar (ticks) : I don't know what are the values you specify in UU coming from ?

### Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!