Why does chi2gof reject the normal hypothesis when given normally distributed data?
Hi,
I was trying out the chi2gof function and created normally distributed data for it. I assumed that chi2gof should return a small p-value (to reject the null hypothesis), but no matter how much I increase the sample size, the function does not reject. Furthermore, when plotted, the data are obviously normally distributed in my opinion (assuming that makedist works correctly).
Can anyone explain this behaviour? Am I making a mistake? Or am I misjudging the purpose of this test?
% create normally distributed data
pd = makedist('Normal');
rng default; % for reproducibility
x = random(pd,5000,1);
% display data with histfit
histfit(x)
% run chi2gof
[h,p,stat] = chi2gof(x)
h = 0
p = 0.1967
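Edit: I mixed up H0 and H1, so everything is as expected. The null hypothesis of chi2gof is that the data come from a normal distribution, so h = 0 means that normality is not rejected.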
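To make the reading of h and p explicit, here is a minimal sketch of the same test with the significance level spelled out. The variable alpha and the printed messages are illustrative additions, not part of the original post; 0.05 is the default level of chi2gof.

```matlab
% Sketch: reading the output of chi2gof
% H0: the data come from a normal distribution (parameters estimated from x)
rng default;                              % for reproducibility
x = random(makedist('Normal'), 5000, 1);

alpha = 0.05;                             % significance level (the chi2gof default)
[h, p] = chi2gof(x, 'Alpha', alpha);

% h = 0: H0 is not rejected at level alpha; h = 1: H0 is rejected
if h == 0
    fprintf('p = %.4f: normality is not rejected at alpha = %.2f\n', p, alpha);
else
    fprintf('p = %.4f: normality is rejected at alpha = %.2f\n', p, alpha);
end
```

A large p-value is therefore the expected outcome for data that really are normal; it is not evidence that the test failed.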
%% chi-squared test for goodness of fit
% test if the data follows a defined distribution
% create normally distributed data
pd = makedist('Normal');
rng default; % for reproducibility
x = random(pd,5000,1);
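% add strong linear effect -> data won't follow normal distribution
% (as posted, "x + +1:5000" parses as (x + 1):5000 because ":" binds more
% loosely than "+"; the parentheses and transpose give the intended
% elementwise sum, so the exact p-value may differ from the one shown)
x = x + (1:5000)';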
histfit(x)
% run chi2gof
[h,p,stat] = chi2gof(x)
% result
h = 1
p = 2.1686e-68
So h == 1 means that the test rejects the null hypothesis, i.e. the data do not follow the expected distribution.
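The contrast between the two outcomes can be seen side by side in a short sketch. The variable names and the extra uniform sample u are assumptions added for illustration; the ramp is built as an explicit column vector so that it adds elementwise to x.

```matlab
% Sketch: chi2gof on normal and on clearly non-normal samples
rng default;                              % for reproducibility
n = 5000;
x = random(makedist('Normal'), n, 1);     % normal sample
y = x + (1:n)';                           % normal noise on a linear ramp
u = rand(n, 1);                           % uniform sample (illustrative)

[hx, px] = chi2gof(x);                    % H0: normal with estimated mean/std
[hy, py] = chi2gof(y);
[hu, pu] = chi2gof(u);

fprintf('normal sample:   h = %d, p = %.4g\n', hx, px);
fprintf('normal + ramp:   h = %d, p = %.4g\n', hy, py);
fprintf('uniform sample:  h = %d, p = %.4g\n', hu, pu);
```

With this many points, the ramp and the uniform sample are far enough from a bell shape that the test should reject for both.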
Answers (1)
You could try an even bigger sample size and see whether the p-value increases.
pd = makedist('Normal');
rng default; % for reproducibility
x = random(pd,500000,1);
% display data with histfit
histfit(x)
% run chi2gof
[h,p,stat] = chi2gof(x)
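Whether the sample size matters here can be checked empirically. When the data really are normal, the p-value is expected to be roughly uniform on [0, 1] at any sample size, so about 5% of runs should reject at the default level. The sketch below estimates that rejection rate; the sample sizes and the repetition count nRep are arbitrary choices for illustration.

```matlab
% Sketch: rejection rate of chi2gof under a true null hypothesis
rng default;                              % for reproducibility
sizes = [500 5000 50000];                 % sample sizes to compare (illustrative)
nRep  = 200;                              % repetitions per sample size (illustrative)
rejectRate = zeros(size(sizes));

for k = 1:numel(sizes)
    h = zeros(nRep, 1);
    for r = 1:nRep
        h(r) = chi2gof(randn(sizes(k), 1));   % 1 if normality is rejected
    end
    rejectRate(k) = mean(h);              % fraction of rejections
end

disp(table(sizes(:), rejectRate(:), 'VariableNames', {'n', 'rejectRate'}));
```

If the rates stay near 0.05 for all three sizes, a larger sample does not systematically push the p-value up or down for normal data.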
1 Comment
Chris Loyt on 11 Jun 2021
Edited: Chris Loyt on 11 Jun 2021