Parallel Computing Toolbox (parfor slower than for, GPU slower than CPU)
1. How can I measure the data transfer time to the workers?
More precisely, when running a parfor loop like:
parfor i=1:100
    N=sum(A(:,:,i));
end
how can I measure the time taken to transfer the array A to each worker? Here A is a 200x200x100 array.
Communication overhead: The specified variable appears inside a loop within different indexing expressions. Because the indices are inconsistent across the uses of the array created by the parfor loop, MATLAB sends the entire array to each worker, resulting in high data communication overhead. For example, the following code elicits this message for c, because there are two different indexing expressions for it.
2. GPU: Are there any functions like randn, randi, randsample for the GPU?
Example: I randomly select some indices
index = (randi([1 Nsources],1,Nsources_mut));
pob_out(index,1,2) = poblacion(index,1,2)
funpage performs this operation in parallel on the GPU:
CG=pagefun(@mtimes,AGpage,Fs);
(Here AGpage is a 200x200x100 gpuArray and Fs is 200x200.) This way the GPU performs the 100 multiplications in parallel.
How can I perform this operation in parallel on the GPU?
aG=sum(sum(AGpage.*Fs)); (I)
This operation is equivalent to
for i=1:100
    A=AGpage(:,:,i);
    aG(i)=trace(A'*Fs); (II)
end
Are the functions trace or sum(sum(.)) available in pagefun? Operations (I) and (II) are too slow.
3. If I do all these calculations in MATLAB 2021, will they run in 2017? Some of my colleagues only have that version.
3 Comments
Joss Knight
on 10 Apr 2021
Okay, I'll believe you're not a bot. But what is this snippet of commentary that doesn't seem relevant and refers to code that isn't provided?:
"Communication overhead: The specified variable appears inside a loop within different indexing expressions. Because the indices are inconsistent across the uses of the array created by the parfor loop, MATLAB sends the entire array to each worker, resulting in high data communication overhead. For example, the following code elicits this message for c, because there are two different indexing expressions for it."
It then refers to "funpage" instead of pagefun and goes on to make mistakes with markup.
Answers (2)
Joss Knight
on 10 Apr 2021
1) CPU/parfor: How can I measure the transfer time when using parfor (since parfor is slower than for when accessing part of an array)?
Your snippet of code indexes variable A like this: A(:,:,i). Because i is the loop variable, this should result in only the correct slices of A going to each worker. So the premise you state that the whole array is copied is (should be) incorrect.
There isn't a way that I know of to measure the data transfer overhead independently in a parfor, since data transfer and loop execution are interleaved. You can probably infer it from wall clock timings measured by tic and toc - perhaps someone else has some tricks up their sleeve.
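As a rough sketch of the tic/toc approach, assuming the array size from the question and random placeholder data: time the same loop body with for and with parfor, and compare. The difference bundles data transfer with scheduling and other parallel overhead, so treat it as an upper bound on transfer time rather than a direct measurement. The variable names (Nfor, Npar, tFor, tParfor) are illustrative only.

```matlab
% Placeholder data with the dimensions mentioned in the question.
A = rand(200,200,100);
numPages = size(A,3);

% Open the pool first so that pool startup is not included in the timing.
pool = gcp;

% Serial baseline.
Nfor = zeros(numPages, size(A,2));
tic
for i = 1:numPages
    Nfor(i,:) = sum(A(:,:,i));
end
tFor = toc;

% Same body in parfor. A is indexed only as A(:,:,i), so it is sliced.
Npar = zeros(numPages, size(A,2));
tic
parfor i = 1:numPages
    Npar(i,:) = sum(A(:,:,i));
end
tParfor = toc;

fprintf('for: %.4f s, parfor: %.4f s, difference: %.4f s\n', ...
    tFor, tParfor, tParfor - tFor);
```

With a loop body this cheap, the parfor version will often be slower, because each iteration does very little work relative to the cost of sending its slice to a worker.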
2) GPU: Are there any functions like randn, randi, randsample for the GPU? I need them to randomly select some coordinates of an array in each loop iteration.
Yes. Use 'gpuArray' as an optional argument to rand, randn or randi.
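For instance, a minimal sketch based on the index selection in the question. The sizes and the contents of poblacion and pob_out are made up for illustration, and a supported GPU is assumed:

```matlab
% Hypothetical sizes; the question does not give them.
Nsources     = 200;
Nsources_mut = 20;

% Random integer indices generated directly on the GPU.
index = randi([1 Nsources], 1, Nsources_mut, 'gpuArray');

% Placeholder arrays created on the GPU.
poblacion = rand(Nsources, 1, 2, 'gpuArray');
pob_out   = zeros(Nsources, 1, 2, 'gpuArray');

% Same assignment as in the question, now entirely on the GPU.
pob_out(index,1,2) = poblacion(index,1,2);

% Normally distributed values on the GPU.
noise = randn(Nsources, 1, 'gpuArray');
```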
3) pagefun: Includes matrix multiplication. But how can I perform sum(sum(A)) in parallel on the GPU?
sum(sum(A)) works on the GPU. Also sum(A,'all') or sum(A,[1 2]). It's a good idea to actually try things before asking a question!
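A small sketch of the per-page sum from the question, checked against the trace loop. The sizes are deliberately tiny and the data is random; it runs on the CPU as written, and wrapping the inputs in gpuArray(...) should run the same expressions on the GPU. The sum(...,[1 2]) form needs a newer release (R2018b or later, as far as I know).

```matlab
% Small placeholder data; use gpuArray(rand(...)) to try this on the GPU.
AGpage = rand(4,4,3);
Fs     = rand(4,4);
numPages = size(AGpage,3);

% Per-page sum of the elementwise product, vectorized over all pages.
aG  = sum(sum(AGpage .* Fs));      % 1x1xnumPages
aG2 = sum(AGpage .* Fs, [1 2]);    % same result in newer releases

% Reference: the page-by-page trace loop from the question.
ref = zeros(1,1,numPages);
for i = 1:numPages
    ref(i) = trace(AGpage(:,:,i)' * Fs);
end

fprintf('max difference vs trace loop: %g\n', max(abs(aG(:) - ref(:))));
```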
Oscar Martinez
on 12 Apr 2021
Edited: Oscar Martinez
on 12 Apr 2021
1 Comment
Joss Knight
on 13 Apr 2021
It looks like you want to index A randomly inside the loop which is why parfor can't successfully slice A. Maybe there's a way for you to index A in order, but write the results out in a randomized way? Otherwise I don't think I can help you here.
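One possible reading of that suggestion, as a sketch only: the original loop is not shown here, so the per-page computation (a sum of all elements) and the permutation perm are assumptions. Compute the results in page order so that A is sliced, then apply the random order after the loop.

```matlab
n = 100;
A = rand(200,200,n);      % placeholder data
perm = randperm(n);       % hypothetical random order of pages

% Index A in order: A(:,:,i) uses only the loop variable, so it is sliced.
res = zeros(1,n);
parfor i = 1:n
    res(i) = sum(sum(A(:,:,i)));
end

% Apply the random order afterwards: out(k) is the result for page perm(k),
% as if the loop had indexed A(:,:,perm(k)).
out = res(perm);
```

This only works when each iteration depends on a single page of A; if an iteration needs several randomly chosen pages, A cannot be sliced this way.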
Timing: You need to give the GPU enough work to do to see the benefits:
>> F = @(n,s)randi([1 n],1,n,s);
>> timeit(@()F(100,'double'))
ans =
9.8483e-06
>> timeit(@()F(100,'gpuArray'))
ans =
4.7101e-05
>> timeit(@()F(100000,'double'))
ans =
0.0021
>> gputimeit(@()F(100000,'gpuArray'))
ans =
1.1839e-04
SUM: You are right, you cannot specify a vector of dimensions as an argument to sum in R2017a. Instead, either use sum(sum(A)), or, for extra efficiency:
numPages = size(A,3);
sumA = sum(reshape(A,[],numPages));
sumA = reshape(sumA,1,1,numPages);
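As a quick check, here is a sketch comparing the reshape version with sum(sum(A)) on a small random array (sizes chosen arbitrarily). Both should give a 1x1xnumPages result that agrees up to floating-point rounding:

```matlab
A = rand(5,4,3);                      % small placeholder array
numPages = size(A,3);

% Flatten each page into one column, then sum down the columns.
sumA = sum(reshape(A,[],numPages));   % 1 x numPages
sumA = reshape(sumA,1,1,numPages);    % back to 1 x 1 x numPages

ref = sum(sum(A));                    % nested-sum version

fprintf('max difference: %g\n', max(abs(sumA(:) - ref(:))));
```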