How to reduce number of loops

Hello,
I have code, which assembles coefficients into large sparse matrix. My coding seems to be inefficient as I use 3 loops, so I'm trying to improve the code to get better pefromance. I think it would be nice to reduce the number of loops to single loop and run it parallel (parfor), but I'm struggling to do that..
The main matrix sig is a matrix of size LX,LY,LZ and vectors dx, dy, dz holds the distances between nods along the diretion X, Y an Z.
for i = 2:LX-1
for j = 2:LY-1
for k = 2:LZ-1
Ctop(i,j,k) = (-1/dz(k-1))*((sig(i-1,j ,k-1)*((dx(i-1)*dy(j ))/4))+...
(sig(i ,j ,k-1)*((dx(i )*dy(j ))/4))+...
(sig(i-1,j-1,k-1)*((dx(i-1)*dy(j-1))/4))+...
(sig(i ,j-1,k-1)*((dx(i )*dy(j-1))/4)));
end
end
end
C1v=reshape(Ctop,[],1);
thanks for help!

4 Comments

What would be typical values for LX, LY, LZ? Is it normal that upper boundaries are {LX,LY,LZ}-1 when you use no {i,j,k}+1 in you computation? It means that Ctop(LX,:,:), Ctop(:,LY,:), and Ctop(:,:,LZ) are not defined. Do you prealloc Ctop before the triple loop?
Hi Cedric.
Typical values assigned to LX, LY or LZ ranging from 50-200, so the number of runs is normally about 200 000. There are much more variables assembling during this triple loop but they are similar to this one, so I reduced the code just to make an example. Of course all the variables inside the loop are preallocated.
And actually this is correct that Ctop has the size of sig and that Ctop(LX,:,:), Ctop(:,LY,:), Ctop(:,:,LZ), Ctop(1,:,:), Ctop(:,1,:), and Ctop(:,:,1) are 0's ?
Values assigned to those matrix positions are evaluated separately by another code as the boundary conditions are applied there.

Sign in to comment.

 Accepted Answer

Ok, well, you could adapt the following to your case then. I am not checking in-depth that it is working though, and there is room for improvement as I didn't optimize indexing for managing properly boundaries..
n = 100 ;
sig = 20 * rand(n,n,n) ;
LX = size(sig,1) ;
LY = size(sig,2) ;
LZ = size(sig,3) ;
dx = 0.1 + rand(LX, 1) ; % Made irregular for testing..
dy = 0.8 + zeros(LY, 1) ;
dz = 0.3 + zeros(LZ, 1) ;
Ctop = zeros(size(sig)) ;
% - Your code.
tic ;
for i = 2:LX-1
for j = 2:LY-1
for k = 2:LZ-1
Ctop(i,j,k) = (-1/dz(k-1))*((sig(i-1,j ,k-1)*((dx(i-1)*dy(j ))/4))+...
(sig(i ,j ,k-1)*((dx(i )*dy(j ))/4))+...
(sig(i-1,j-1,k-1)*((dx(i-1)*dy(j-1))/4))+...
(sig(i ,j-1,k-1)*((dx(i )*dy(j-1))/4)));
end
end
end
toc
% - Meshgrid based code.
Ctop_mesh = zeros(size(Ctop)) ;
tic ;
[dy_m0, dx_m0] = meshgrid(dy(2:end-1), dx(2:end-1)) ;
[dy_m1, dx_m1] = meshgrid(dy(1:end-2), dx(1:end-2)) ;
sig_m00 = sig(2:end-1,2:end-1,:) ;
sig_m10 = sig(1:end-2,2:end-1,:) ;
sig_m01 = sig(2:end-1,1:end-2,:) ;
sig_m11 = sig(1:end-2,1:end-2,:) ;
for k = 2 : LZ-1
Ctop_mesh(2:end-1,2:end-1,k) = (-1/dz(k-1)) / 4 * ...
(sig_m10(:,:,k-1) .* dx_m1 .* dy_m0 + ...
sig_m00(:,:,k-1) .* dx_m0 .* dy_m0 +...
sig_m11(:,:,k-1) .* dx_m1 .* dy_m1 +...
sig_m01(:,:,k-1) .* dx_m0 .* dy_m1) ;
end
toc
% - Check.
tmp = abs(Ctop_mesh(2:end-1,2:end-1,2:end-1)-Ctop(2:end-1,2:end-1,2:end-1)) ;
all(tmp(:) < 1e-12)
This code outputs:
Elapsed time is 6.281855 seconds. % 3 nested loops
Elapsed time is 0.408983 seconds. % meshgrid
ans = % check
1
Note that it is possible to get rid of the last FOR loop and work with a 3D array for dz built this way:
dz_m1 = permute(repmat(dz(1:end-1),[1,LX-1,LY-1]), [2,3,1]) ;
but I am not sure that it is more efficient (especially if you want to PARFOR the remaining loop).

More Answers (0)

Categories

Find more on Fuzzy Logic Toolbox in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!