How to fit a lognormal distribution to my data

Hi all,
I have 2 vectors. One is the radius of the pores within the soil and the other is the number of those pores (frequency). Can anyone help me fit a lognormal distribution to these data?

Accepted Answer

Jeff Miller on 17 Nov 2022
If the scores (here, the pore radii) are in vector1 and the counts are in vector2, you could fit the distribution by the method of moments like this (should be OK with a large dataset):
probs = vector2 / sum(vector2); % Convert counts to probabilities
ExpOfY = sum(vector1.*probs); % mean of the scores tabulated in vectors 1 & 2
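Ysqr = vector1.^2; % squared scores, needed for the second moment
ExpOfYsqr = sum(Ysqr.*probs); % mean of the squared scores
VarOfY = ExpOfYsqr - ExpOfY^2; % variance of the scores tabulated in vectors 1 & 2
normu = log(ExpOfY/sqrt(1+VarOfY/ExpOfY^2)); % mean of log(Y), from the lognormal moment equations
norvar = log(1+VarOfY/ExpOfY^2); % variance of log(Y)
norsigma = sqrt(norvar); % standard deviation of log(Y)
dist = makedist('Lognormal','mu',normu,'sigma',norsigma); % lognormal distribution object with the fitted parameters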
x = linspace(0.1,20); % Use whatever range is appropriate for your data
pdfOfx = pdf(dist,x);
figure; plot(x,pdfOfx);
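As a cross-check on the moment-based fit above, the parameters can also be estimated from the weighted mean and standard deviation of log(radius). This is the maximum-likelihood estimate when each count is treated as the number of observations at that radius. It is a different technique from the method of moments in the answer, and the data values below are made up for illustration; substitute your own vector1 and vector2.

```matlab
% Hypothetical example data (not from the thread)
vector1 = [0.5 1 2 4 8];          % pore radii, must all be > 0
vector2 = [10 40 30 15 5];        % number of pores at each radius

probs = vector2 / sum(vector2);   % convert counts to weights that sum to 1
logR  = log(vector1);             % a lognormal fit works on the log of the data

muHat    = sum(probs .* logR);                     % weighted mean of log(radius)
sigmaHat = sqrt(sum(probs .* (logR - muHat).^2));  % weighted std of log(radius)

fprintf('mu = %.4f, sigma = %.4f\n', muHat, sigmaHat);
```

If these values differ a lot from normu and norsigma, that is a hint that the lognormal shape does not describe the data well, since the two estimators should roughly agree when it does.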
  2 Comments
Behrooz Daneshian on 17 Nov 2022
Thank you Jeff for your reply. I did exactly what you recommended. However, the results seem weird to me. Please look at the attached picture. The red plot shows radius (vector1) vs. frequency (vector2) and the blue plot is the fitted lognormal pdf. Can you see the huge discrepancy between the two curves?
Jeff Miller on 17 Nov 2022
Edited: Jeff Miller on 18 Nov 2022
Part of the problem is that the red and blue curves are scaled differently. To make them comparable, you need to give the two curves the same total area. The area under the fitted lognormal is 1, but the area under the red curve is clearly much smaller. I think you need something like:
vector2nor = vector2 / trapz(vector1,vector2);
plot(vector1,vector2nor);
Also, it does not look like the lognormal is actually a very good fit, because the two curves have pretty different shapes (blue descends much more than red across the last 6 tick marks before 10^-4).
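Putting the rescaling together with the overlay might look like the sketch below. The data, the mu and sigma values, and the choice of a logarithmic x axis are all assumptions for illustration, and the lognormal pdf is written out explicitly so the snippet does not depend on a distribution object.

```matlab
% Hypothetical example data (not from the thread)
vector1 = [0.5 1 2 4 8];      % pore radii, sorted ascending
vector2 = [10 40 30 15 5];    % pore counts

% Rescale the counts so the area under the empirical curve is 1
vector2nor = vector2 / trapz(vector1, vector2);

% Placeholder parameters standing in for the fitted values
mu = 0.45;
sigma = 0.80;

% Lognormal pdf evaluated over the same range as the data
x = logspace(log10(min(vector1)), log10(max(vector1)), 200);
pdfOfx = exp(-(log(x) - mu).^2 ./ (2*sigma^2)) ./ (x .* sigma .* sqrt(2*pi));

figure;
semilogx(vector1, vector2nor, 'r-o', x, pdfOfx, 'b-');
legend('normalized counts', 'lognormal pdf');
xlabel('radius'); ylabel('density');
```

Evaluating the pdf only over the range of the data keeps the comparison fair, because trapz normalizes the red curve over that same range.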


More Answers (1)

David Hill on 15 Nov 2022
Look at lognfit
[pHat,pCI] = lognfit(repelem(vector1,vector2));
  2 Comments
Behrooz Daneshian on 16 Nov 2022
I did that, but I got the error "Requested array exceeds the maximum possible variable size."
My frequencies are on the order of 10^14. Hence, I guess I cannot combine the frequency vector into vector1.
David Hill on 16 Nov 2022
If all the frequencies are of the same order of magnitude, you could scale them all down.
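A minimal sketch of the scaling idea, assuming made-up data and an arbitrary target of 1000 for the largest count. The second option relies on the freq input that lognfit documents as its fourth argument; it has not been tried here with counts this large, so treat it as something to verify on your own data.

```matlab
% Hypothetical example data (not from the thread): counts of order 10^14
vector1 = [0.5 1 2 4 8];             % pore radii
vector2 = [1 4 3 1.5 0.5] * 1e14;    % pore counts

% Option 1: rescale so the largest count is 1000, then round to integers
scaled = round(vector2 / max(vector2) * 1000);
sample = repelem(vector1, scaled);   % only sum(scaled) elements are created
[pHat1, pCI1] = lognfit(sample);

% Option 2: skip the expansion and pass the counts as the freq argument
% ([] keeps the default alpha and marks no observations as censored)
[pHat2, pCI2] = lognfit(vector1, [], [], vector2);
```

Scaling the counts leaves the point estimates essentially unchanged, but the confidence intervals in pCI1 reflect the reduced sample size rather than the true number of pores, so they will be wider than the real data justify.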
