Preallocation of Composites using spmd
I'm seeing dramatically non-linear execution times with the test code below, where I assign a different GPU to each of up to 4 spmd workers. (Yes, I do have the hardware.) I then generate some work on each worker and time it over 10 trials.
Note the clear line inside the trial loop but outside the spmd block.
If that clear is included, the trial_time values make sense. If it is not included, they do not.
As an example, with N_gpus = 2 and the clear in place, trial_time falls in a narrow range of 0.0938 to 0.1111. Without the clear I get 0.3884 0.0915 6.4601 15.2599 15.2746 15.2792 15.2892 15.2900
If this were not spmd code I would find a way to preallocate the data, but I'm not sure how to do that with Composites in this case.
Ideas and explanations are welcome.
for N_gpus = 1:4
    poolobj = gcp('nocreate'); % If no pool, do not create new one.
    if isempty(poolobj)
        poolobj = parpool( N_gpus );
    end
    poolsize = poolobj.NumWorkers;
    trial_time = zeros(1, 10); % one timing per trial
    for trial = 1:10
        % Select the GPU on each worker before starting the timer.
        spmd( N_gpus )
            g = gpuDevice();
        end
        tic
        spmd( N_gpus )
            for m = 1:50
                A = rand(5000,5000,'gpuArray');
                B = rand(5000,5000,'gpuArray');
                C = A * B;
                max_C = max(C);
            end
        end
        clear A B C; %%THIS IS THE INTERESTING LINE
        trial_time(trial) = toc;
    end
    tt = mean(trial_time(1:10));
    fprintf( 'N=%d time=%6.3f \n', N_gpus, tt );
    % Shut the pool down so the next N_gpus value starts a fresh one.
    poolobj = gcp( 'nocreate' );
    delete( poolobj );
end
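For illustration, here is a minimal sketch of what preallocating the data might look like. It relies on the documented behavior that values assigned inside an spmd block stay on the workers between blocks for as long as the pool is open and the Composite exists on the client. The names n and n_trials and the fixed N_gpus = 2 are placeholders for this sketch, and it assumes that any pool already open has at least N_gpus workers. It also calls wait(gpuDevice) before toc so that the timer only stops after each worker's GPU has finished. Because A and B are created once, the rand calls are no longer part of the timed region, so the numbers are not directly comparable with the ones above.

```matlab
% Sketch only: N_gpus, n, and n_trials are illustrative values.
N_gpus   = 2;
n        = 5000;
n_trials = 10;

% Assumes any pool that is already open has at least N_gpus workers.
if isempty(gcp('nocreate'))
    parpool(N_gpus);
end

% Create the inputs once per worker. On the client, A and B are now
% Composites whose values stay on the workers between spmd blocks.
spmd (N_gpus)
    A = rand(n, n, 'gpuArray');
    B = rand(n, n, 'gpuArray');
end

trial_time = zeros(1, n_trials);
for trial = 1:n_trials
    tic
    spmd (N_gpus)
        for m = 1:50
            C = A * B;        % reuses the worker-side A and B
            max_C = max(C);
        end
        wait(gpuDevice);      % block until this worker's GPU has finished
    end
    trial_time(trial) = toc;
end
fprintf('N=%d time=%6.3f \n', N_gpus, mean(trial_time));
```

Whether this removes the slowdown seen without the clear is something I have not established; it only shows one way to keep the inputs resident on the workers across trials.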