Request for help computing convolution of random variables via fft
22 views (last 30 days)
Show older comments
Jeff Miller
on 26 Aug 2021
Commented: Jeff Miller
on 30 Aug 2021
Inspired by the answers of @David Goodmanson to this earlier question, I have been trying to understand exactly how in general to compute the pdf of a convolution of random variables (RVs) using ffts. The attached script shows two of the examples that I have attempted. In one example the fft produces the convolution pdf perfectly, but in the other it doesn't, and I'm hoping someone can identify the problem with my method.
The first example looks at the convolution of several identical exponential RVs. Here, the convolution pdf computed via fft agrees perfectly with the corresponding known gamma pdf for the convolution.
The second example shows the convolution of two different normal RVs, but the convolution pdf computed via fft does not agree with the corresponding known normal convolution. Actually, the fft convolution does have exactly the right shape, but it is offset relative to the true convolution distribution (i.e., it has a different mean).
I've also tried the fft approach with a number of other example distributions, and the same sort of shift problem that I'm having with the normals is quite common. In any particular example I can easily get rid of the shift in an ad hoc manner, but I haven't yet hit on a general method that works for any arbitrary set of RVs (identical or not).
As I have not worked with fft's before, I am probably missing something quite fundamental. It would be great if one of the experts in this group could get me straightened out.
Thanks very much for any hints.
0 Comments
Accepted Answer
David Goodmanson
on 27 Aug 2021
Edited: David Goodmanson
on 27 Aug 2021
Hi Jeff,
It's nice to see a contribution of mine from almost four years ago come up, as just one of many people who have made contributions to this site and get referenced as well. Besides providing answers to many queries, the site appears to have become a searchable source of information.
The reason that the first conv example works and the second doesn't is due to the different x arrays used to create the pdfs. Suppose the fft transforms y = y(x) to z = z(ilam) (ilam being inverse wavelength, which seems to have no common symbol for it). The fft puts z( ilam=0 ), the DC value, as the first element in the resulting z array. Similarly the fft assumes that y( x=0 ) is the first element in the y array. If x=0 is somewhere else, then z picks up a phase shift in the ilam domain that can lead to incorrect results.
Your first example uses the correct x array and the method works. The second example starts out at a negative value of x to accommodate the gaussian, leading to problems as you can see.
The most straightforward approach to allow negative values of x is to start with an x array with x=0 at the midpoint (element n/2+1 for even n). Define y(x) and then use ifftshift to put the DC y value at y(1). I tend to think of the first array as a plotting array and the shifted array as the real array. After some ffts and then an ifft to get back in the x domain, fftshift up to put y( x=0 ) back in the center. For example,
N = 10000;
x0 = 5;
delx = 2*x0/N;
delilam = 1/(N*delx);
x = (-N/2:N/2-1)*delx;
mu = .1;
sig = .2;
m = 6; % number of convolutions
mp1 = m+1;
y = (1/(sqrt(pi)*sig))*exp(-(x-mu).^2/sig^2);
% exact result
yconv = (1/(sqrt(pi*mp1)*sig))*exp(-(x-mp1*mu).^2/(mp1*sig^2));
% by fft
z = fft(ifftshift(y)).^mp1*delx^mp1;
yconvfft = fftshift(ifft(z))*N*delilam;
% in the following plot the original gaussian is scaled down
figure(1); grid on
plot(x,y/2,x,yconvfft,x(1:60:end),yconv(1:60:end),'o')
I1 = trapz(x,y)
I2 = trapz(x,yconv)
I3 = trapz(x,yconvfft)
For normalization, since you are using the fft (which only does sums) to approximate an integral, there is a factor of the array spacing del_x for each transform. Coming back there is one ifft so there is a single factor of del_ilam. Except the ifft automatically divides by the number of points N, so you need to multiply by N to make up for that.
I do have one question. In the second example you have 2^13 = 8192 points instead of, say, 10000. Could you let me know the rationale behind using 2^n points?
12 Comments
Paul
on 29 Aug 2021
Edited: Paul
on 29 Aug 2021
I guess I didn't understand the code as well as I thought!
But it does appear that each pdf is evaluated over the same X vector (M(i,:)), which might be a lot longer than it needs to be to evalaute any one particular pdf. Probalby only a very small effect on efficiency.
At the end of the day, it looks like the result will be an approximation to the true answer that is piecewise linear with a finite interval of support. Any concerns about missing the tails? Or put another way, does every density under consideration have a well defined xmin/xmax outside of which you're comfortable assuming the density is zero?
Wil lthe code have to adaptively pick NxPoints to ensure that delx = X(2) - X(1) is small enough to adequately capture the features of all of the densities under consideration?
I guess the answers depend on what is actually going to be done with the result. Make a plot? Compute other figures-of-merit, like the third moment, etc.
Interesting problem.
More Answers (1)
Paul
on 28 Aug 2021
Here is one option, illustrated with two widely separated normal pdfs:
mu1 = 10; sigma1 = 2;
mu2 = -5; sigma2 = 1;
Zextreme = 4;
delx = 0.01;
% Determine range of x values
min1 = mu1 - Zextreme*sigma1;
min2 = mu2 - Zextreme*sigma2;
max1 = mu1 + Zextreme*sigma1;
max2 = mu2 + Zextreme*sigma2;
x1 = min1:delx:max1;
x2 = min2:delx:max2;
x = mu1 + mu2 + (-Zextreme*(sigma1 + sigma2):delx:Zextreme*(sigma1 + sigma2));
% Get the pdfs of the individual distributions and of the sum, which is the
% convolution of the individuals
M1 = normpdf(x1,mu1,sigma1);
M2 = normpdf(x2,mu2,sigma2);
truepdf = normpdf(x,mu1+mu2,sqrt(sigma1^2+sigma2^2));
% Check the computations so far by examining the true distributions
% of the individual RVs and their convolution
plot(x1,M1);
hold on
plot(x2,M2);
plot(x,truepdf);grid
legend('Normal 1', 'Normal 2', 'Convolution');
% approximate the continuous convolution with a convolution sum in the time domain
convpdf = conv(M1,M2)*delx;
shift1 = round(x1(1)/delx);
shift2 = round(x2(1)/delx);
% approximate the continuous convolution with a convolution sum computed via the frequency domain
N = numel(convpdf);
fftM1 = fft(M1,N);
fftM2 = fft(M2,N);
fftpdf = fftM2.*fftM1;
fftpdf = real(ifft(fftpdf));
% compare
subplot(311); plot(x,truepdf);grid
subplot(312); plot( ((0:N-1) + shift1 + shift2)*delx,convpdf),grid
subplot(313); plot( ((0:N-1) + shift1 + shift2)*delx,real(fftpdf)*delx),grid
set(get(gcf,'Children'),'XLim',[x(1) x(end)]);
3 Comments
Paul
on 28 Aug 2021
I think that one important factor is if you are taking the sum of RVs that are i.i.d. as in your Example 1 and in @David Goodmanson's example or if the RVs are not i.i.d. as in your Example 2 and in the example I posted.
See Also
Categories
Find more on Transforms in Help Center and File Exchange
Products
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!