pearsrnd and pearspdf are not coherent for the Pearson IV distribution?

7 views (last 30 days)
We are having some trouble using the Pearson 4 implementation of the functions pearsrnd, pearspdf and pearscdf.
We think that pearsrnd does not generate samples from the distribution defined in pearspdf. In particular, we've found that pearsrnd over generates observations in the tails of the distribution.
In order to see this, we've implemented two simple tests:
1. We've compared the PDF of the Pearson 4 distribution (using pearspdf) to the PDF obtained from a simulation using pearsrnd, but we've found that the two are very different (even with a very long sample). This problem doesn't happen for other distributions. In particular, we've implemented the same test with the Pearson 6, and the results are satisfactory.
2. In addition, to test if the generatated sample is compatible with its CDF, we've applied the function pearscdf to the generated sample. We should have got a uniform distribution, because if X is a random variable and F its CDF, then F(X) is uniform in [0,1]. This is indeed what we've got with the Pearson 6, as expected, but not what we've got using the Pearson 4.
We've conducted a similar experiment using Mathematica, and we get the correct result.
We've attached the code we've used.
Any idea why we're getting this result?
Thank you very much.
%% Seed
rng(123)
%% Parameters
%Type IV
mu4 = 5;
sigma4 = 1;
skew4 = 1;
kurt4 = 10;
% Check the distribution type
[ign, type4] = pearsrnd(mu4, sigma4, skew4, kurt4, 1);
disp("This Pearson is of type " + type4)
This Pearson is of type 4
% Type VI
mu6 = 2;
sigma6 = 1;
skew6 = 2;
kurt6 = 10;
% Check the distribution type
[ign, type6] = pearsrnd(mu6, sigma6, skew6, kurt6, 1);
disp("This Pearson is of type " + type6)
This Pearson is of type 6
%% Simulate n observations according to both Pearson distributions defined above
n = 100000;
samples4 = pearsrnd(mu4, sigma4, skew4, kurt4, 1, n);
samples6 = pearsrnd(mu6, sigma6, skew6, kurt6, 1, n);
%% First test: we compare the simulated pdf (i.e. histogram of the sample) to the closed form PDF
pdf4 = @(x) pearspdf(x, mu4, sigma4, skew4, kurt4);
pdf6 = @(x) pearspdf(x, mu6, sigma6, skew6, kurt6);
x_ = linspace(-10, 10, 10000);
y4 = pdf4(x_);
y6 = pdf6(x_);
figure('Position', [100, 100, 1000, 400]); % [left, bottom, width, height]
subplot(1, 2, 1);
plot(x_, y4, Color='red', LineWidth=2)
title('Simulated and closed form PDF for Pearson 4')
hold on
histogram(samples4, 1000, 'Normalization', 'pdf', 'FaceColor', 'blue')
hold off
subplot(1, 2, 2);
plot(x_, y6, Color='red', LineWidth=2)
title('Simulated and closed form PDF for Pearson 6')
hold on
histogram(samples6, 1000, 'Normalization', 'pdf', 'FaceColor', 'blue')
hold off
sgtitle('Test 1: Comparison between Pearson 4 and 6');
%% Second test: Let X be a random variable and F its CDF, then F(X) is uniform in [0,1].
% We apply the Pearson CDF to the simulated observations and plot its
% histogram. We should obtain something resembling a uniform distribution.
figure('Position', [100, 100, 1000, 400]); % [left, bottom, width, height]
subplot(1, 2, 1);
u4 = pearscdf(samples4, mu4, sigma4, skew4, kurt4);
histogram(u4)
title('F(X) for Pearson 4')
subplot(1, 2, 2);
u6 = pearscdf(samples6, mu6, sigma6, skew6, kurt6);
histogram(u6)
title('F(X) for Pearson 6')
sgtitle('Test 2: Comparison between Pearson 4 and 6');
%% Conclusion
%We get the expected results using the Pearson 6, but we cannot say the
%same for the Pearson 4. Why?
  1 Comment
Drew
Drew on 22 Mar 2024
Edited: Drew on 2 Jun 2024
MathWorks has posted a bug report https://www.mathworks.com/support/bugreports/3219165
This has been fixed in R2024a and in R2023b Update 8. The bug report also includes a workaround for affected versions of 23b.

Sign in to comment.

Accepted Answer

David Goodmanson
David Goodmanson on 15 Feb 2024
Edited: David Goodmanson on 17 Feb 2024
Hello Marco,
It appears that you are implicating the pearson rand function here, but I believe the real culprit is the pearson pdf.
I have the rand function but not the pdf since the latter does not appear until Matlab 2023b. I wrote my own pearson4 pdf function, code is below, and it agrees pretty well with the Matlab pearson rand function (that plot is not shown).
After obtaining a new_cdf by numerical integration, a histogram of new_cdf(pearson rand) for comparison to the ones you did is shown below. And it makes sense.
mu = 5;
sdev = 1;
skew = 1;
kurt = 10;
n = 100000;
samples4 = pearsrnd(mu,sdev,skew,kurt, 1,n);
figure(1); grid on
histogram(samples4, 1000, 'Normalization', 'pdf', 'FaceColor', 'blue')
hold on
x = -20:.01:20;
f = pears4pdf(x,mu,sdev,skew,kurt);
plot(x,f)
hold off
% --------
function f = pears4pdf(x,mu,sdev,skew,kurt)
% pearson type 4 pdf
m = (5*kurt-6*skew^2-9)/(2*kurt-3*skew^2-6);
b = 2*(m-1)/sqrt((4*(2*m-3)/((m-2)^2*skew^2)) -1)*(-sign(skew));
J0 = J(m,0,b);
a = sdev*sqrt(J0/J(m,2,b));
z = (x-mu)/a;
z0 = -b/(2*(m-1));
arg = z+z0;
% f is normalized
f = (1/(J0*a))*(1+arg.^2).^(-m).*exp(-b*atan(arg));
end
% --------
function u = J(m,n,b)
% u = Int{-inf,inf} (z-z0)^n (1+z^2)^(-m) exp(-b*atan(z)) dz
% m > (n+1)/2
z0 = -b/(2*(m-1));
if m <= (n+1)/2
u = inf;
elseif n == -1
u = 0;
elseif n == 0
u = real((pi*gamma(2*m-1)) ...
/(2^(2*m-2)*exp(gammalni(m+b*i/2))*exp(gammalni(m-b*i/2))));
else
C = 2*(m-1)-(n-1);
u = ((n-1)/C)*(2*z0*J(m,n-1,b) + (1+z0^2)*J(m,n-2,b)) ;
end
end
% ---------
function w = gammalni(z)
% natural log of gamma function for input vector z with
% constant real part, for example z = 2 + i*(0:.01:6),
% by Stirling's approximation.
% see Whittaker and Watson, 4th edition, 12.33. note that
% error term for gamma(x+iy) is always less than that for gamma(x)
% David Goodmanson
%
% w = gammalni(z)
if any(diff(real(z))); error('Re(z) must be constant'); end
% these affect the precision of the calculation.
% higher values use more terms but in theory should be better.
zreal = 6; nterm = 5;
if real(z(1)) <(1/4) % use reflection property to keep x >0
end
reflect = 'true';
z = 1-z;
else
reflect = 'false';
end
m=0; % make Re(z) >= zreal so that |z| is large enough
while real(z(1)) < zreal
z = z+1;
m = m+1;
end
% c(i) = (-)^(i-1) B(i)/(2i)(2i-1), B are Bernoulli numbers
c(1) = 1/12; c(2) = -1/360; c(3) = 1/1260;
c(4) = -1/1680; c(5) = 1/1188; c(6) = -691/360360;
c(7) = 1/156; c(8) = -3617/122400; c(9) = 43867/244188;
c(10) = -174611/125400; c(11) = 77683/5796;
w = log(z).*(z-1/2) -z + log(sqrt(2*pi)); % log gamma for large z
zn = z;
for j=1:nterm;
zn = zn./(z.^2); % correction terms to log gamma are
w = w + c(j)*zn; % c(1) 1/z + c(2) 1/z^3 + ...
end
for n =1:m; % bring Re(z) back to its actual value.
z = z-1; % gamma(z-1) = gamma(z)/(z-1)
w = w -log(z);
end
% gamma(z)gamma(1-z) = pi/sin(pi z), but don't let argument
% of sine get too large and generate infinity
if strcmp(reflect, 'true')
absy = abs(imag(z));
w = log(pi) -w + -pi*absy ...
-log( ( exp(pi*(i*z-absy)) -exp(-pi*(i*z+absy)) )/(2*i) );
end
Histogram of new_cdf(pearsrnd). new_cdf is obtained by numerical integration of pears4pdf.
  10 Comments
Marco
Marco on 19 Feb 2024
Hi David and Paul,
I'd like to thank both of you for taking the time to investigate the problem and come up with a solution.
I've been contacted by Mathworks and hopefully the bug will be fixed soon!
Drew
Drew on 23 Feb 2024
Edited: Drew on 7 Jun 2024
MathWorks has posted a bug report which includes a workaround for affected versions of 23b. See https://www.mathworks.com/support/bugreports/3219165
This has been fixed in R2024a and in R2023b Update 8. The bug report also includes a workaround for affected versions of 23b.

Sign in to comment.

More Answers (1)

Drew
Drew on 23 Feb 2024
Edited: Drew on 2 Jun 2024
MathWorks has posted a bug report https://www.mathworks.com/support/bugreports/3219165
This has been fixed in R2024a and in R2023b Update 8. The bug report also includes a workaround for affected versions of 23b.

Products


Release

R2023b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!