Shared variable / memory usage / parfor

Julio Sosa on 25 Jul 2021
Commented: Edric Ellis on 27 Jul 2021
Hello,
I am wondering what the real memory requirements are for a computation like this:
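A = ones(10^4);          % 10^4-by-10^4 double matrix, about 800 MB
parfor j = 1:100
    v = zeros(1,10^4);   % row vector created on the worker
    r = v*A;             % A is a broadcast variable, sent to every worker
end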
I ran this code on R2018a under Linux Mint 19. Inspecting memory usage with 'top', it seems that variable A is not shared but copied to each worker. The memory usage of each worker is almost size(A), particularly at the beginning of the computation. Later, the memory usage of each worker drops to approximately size(A)/2. This is a real issue when A takes up almost all the available RAM, since in theory I should be able to work with such a large A.
Please note that the real code I am trying to develop is not the one presented here, but it is analogous.
Is there any way to enforce communication rather than copying matrix A to every worker?
Thanks.

Answers (1)

Edric Ellis on 26 Jul 2021
The workers in a "local" parallel pool (the default) are separate processes. Therefore each worker process must have a complete copy of A in memory for this code as written. One thing to be aware of: large matrix operations are typically intrinsically multithreaded by MATLAB itself, so there is often no benefit to running N worker processes instead of N threads if most of the time in your program is taken up with large matrix operations (and, in fact, sending the data to and from the worker processes can often dominate the time taken).
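A back-of-the-envelope estimate of what "a complete copy of A per worker" means for the example in the question. This is only a sketch: numWorkers is a hypothetical value chosen for illustration, and the estimate ignores temporary buffers used while A is being transferred.

```matlab
% Rough memory estimate for a process-based pool (illustrative only).
n              = 10^4;                    % A is n-by-n, as in the question
bytesPerDouble = 8;                       % double precision
bytesA         = n^2 * bytesPerDouble;    % 8e8 bytes, about 0.8 GB

numWorkers = 4;                           % hypothetical pool size (assumption)

% The client holds one copy of A and each worker process holds its own.
totalBytes = (numWorkers + 1) * bytesA;

fprintf('A alone:              %.2f GB\n', bytesA / 1e9);
fprintf('Client + %d workers:   %.2f GB\n', numWorkers, totalBytes / 1e9);
```

The point of the estimate is that the footprint grows linearly with the number of worker processes, so an A that nearly fills RAM cannot be broadcast to a process-based pool.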
  2 Comments
Julio Sosa on 26 Jul 2021
Edited: Julio Sosa on 26 Jul 2021
Thanks for the answer. Although I don't know any implementation details, I do not see why a local copy is required, since matrix A is never changed. For me, it is crucial to be able to share a variable across workers, and I am surprised there is no way to do this. I have only 256 GB of RAM and I need most of it to store generated values, which the workers then use. It's impossible to share them using a file.
You are right that in the example I posted there is no benefit to parallelizing, but as soon as each worker runs some for loops, parallelizing pays off. Waiting 1 week for a result is not the same as waiting 2 weeks.
Edric Ellis on 27 Jul 2021
The local copy is required because the workers are separate processes, and there is currently no way in Parallel Computing Toolbox (PCT) for MATLAB to share the contents of an array across multiple processes. You could try https://www.mathworks.com/matlabcentral/fileexchange/28572-sharedmatrix . In recent versions of MATLAB, arrays can be shared across the threads of a thread parpool - https://www.mathworks.com/help/parallel-computing/parallel.threadpool.html .
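A minimal sketch of the thread-pool route, assuming a release that supports parpool("threads") (R2020a or later, to my knowledge, so not the R2018a used in the question). The loop is the one from the question, changed only so that each iteration stores its result in a sliced output; the random v and the output array r are illustrative additions.

```matlab
% Sketch: run the loop from the question on a thread-based pool.
% Thread workers live inside the client MATLAB process, so they can
% read A without each holding a separate copy.

delete(gcp('nocreate'));          % close any existing pool first
pool = parpool("threads");        % thread-based pool instead of "local"

n = 10^4;
A = ones(n);                      % about 0.8 GB, as in the question
r = zeros(100, n);                % sliced output, one row per iteration

parfor j = 1:100
    v = rand(1, n);               % illustrative input for this iteration
    r(j, :) = v * A;              % A is only read, never modified
end

delete(pool);                     % release the thread workers
```

Not every MATLAB function is supported on thread workers, so the real code would need to be checked against the thread-pool documentation linked above before relying on this approach.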
