Is there code that helps to find out the type of a distribution?

I have a set of data for a certain parameter, and I need to know which statistical distribution it belongs to. Is there any method or code that can help?

Accepted Answer

Rik on 25 Aug 2022
No. What you ask is fundamentally impossible.
What you can do is try several distributions and see how well they fit.
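A minimal sketch of "try several and compare", assuming the Statistics and Machine Learning Toolbox is available. The sample data and the candidate list are made up for illustration, and AIC is just one possible ranking criterion, not the only sensible one.

```matlab
% Hypothetical sample; replace x with your own data (positive column vector).
rng(0);                                   % reproducible illustration
x = lognrnd(1, 0.5, 500, 1);

% Candidate families to try (an arbitrary choice for this sketch).
candidates = {'Normal', 'Lognormal', 'Gamma', 'Weibull'};
aic = zeros(size(candidates));

for k = 1:numel(candidates)
    pd = fitdist(x, candidates{k});       % maximum-likelihood fit
    nll = negloglik(pd);                  % negative log-likelihood at the fit
    aic(k) = 2*nll + 2*pd.NumParameters;  % Akaike information criterion
end

% A lower AIC means a better trade-off between fit and parameter count.
[~, best] = min(aic);
fprintf('Lowest AIC: %s\n', candidates{best});
```

A low AIC only ranks the candidates you tried against each other. It does not show that the winner is the true distribution.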
  4 Comments
Mohamed Zied on 9 Feb 2023
Thank you for your answer.
I plotted many PDFs (using the distributionFitter tool), as you can see in the attached screenshot.
Is the best fit judged graphically?
In my case, for example, is the lognormal distribution the best fit to my data?
Thank you in advance.
Rik on 9 Feb 2023
What exactly the best fit is depends on your domain knowledge. Your data looks like a skewed normal distribution to me (or the superposition of two). What makes sense in your situation is not something I can tell you; it is something you need to tell me. And that would answer your own question.


More Answers (1)

John D'Errico on 25 Aug 2022
Edited: John D'Errico on 25 Aug 2022
As Rik said, it is difficult to know which distribution a random variable comes from. Don't believe me?
x = 0.96489;
Do you know if that number was generated from a uniform distribution? On what range? Was it from a normal distribution? A beta? Lognormal? Gamma? Weibull? Exponential? Poisson? Rayleigh? Lots more.
Even had I given you more data, you still cannot know. You can perform tests to see if one distribution would be more likely than another. (More data would be REALLY helpful then.)
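As a rough illustration of such a test, here is a sketch using a one-sample Kolmogorov-Smirnov test against two fitted candidates. The data are hypothetical, and the choice of kstest is mine (chi2gof or similar would also be reasonable). Note that the p-values are optimistic when the parameters were estimated from the same sample being tested.

```matlab
% Hypothetical sample; replace x with your own data.
rng(1);
x = gamrnd(2, 3, 200, 1);

% Fit two candidate distributions to the same data.
pdGamma  = fitdist(x, 'Gamma');
pdNormal = fitdist(x, 'Normal');

% h = 1 means the test rejects the candidate at the 5% level.
[hG, pG] = kstest(x, 'CDF', pdGamma);
[hN, pN] = kstest(x, 'CDF', pdNormal);

fprintf('Gamma:  reject = %d, p = %.3g\n', hG, pG);
fprintf('Normal: reject = %d, p = %.3g\n', hN, pN);
```

Failing to reject a candidate does not confirm it. It only means the sample is not obviously inconsistent with that distribution.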
You can use tools to fit a family of distributions. Common ones are the Johnson or Pearson family of distributions. Even then you won't KNOW which distribution a sample came from; you are only making a better guess. There are tools like fitdist and distributionFitter. I thought I remembered the stats toolbox having a tool for the Pearson family of distributions too. Ah, yes, it does, though I had to look. pearsrnd takes the mean, standard deviation, skewness, and kurtosis, picks the matching member of the Pearson system, and returns random numbers from it along with the distribution type.
The nice thing about the Pearson family (and the Johnson family, as I recall) is that they encompass a pretty wide variety of distribution shapes. But I have a funny feeling one can abuse them without understanding these things. And those tools are completely dependent on estimates of the first 4 moments of your distribution, but the higher order moments are difficult to estimate well.
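A small sketch of the moment-matching idea with pearsrnd, again with made-up data. The four sample moments are plugged in directly, so everything said above about poorly estimated higher moments applies to the result.

```matlab
% Hypothetical sample; replace x with your own data.
rng(2);
x = wblrnd(2, 1.5, 1000, 1);

% Sample estimates of the first four moments.
mu    = mean(x);
sigma = std(x);
skew  = skewness(x);
kurt  = kurtosis(x);

% Draw from the Pearson-system member with those moments;
% 'type' is an integer from 0 to 7 identifying that member.
[r, type] = pearsrnd(mu, sigma, skew, kurt, 1000, 1);
fprintf('Pearson system type: %d\n', type);
```

Comparing histograms of x and r gives a quick visual check of whether the selected Pearson member resembles the data.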
Anyway, the fact is, almost NO data you will generate in the real world comes from a truly KNOWN distribution. All data will be corrupted in some way, so that even if it should be essentially normal, you will always have crap in there that makes it not quite normal, etc.
