Why does MATLAB hang when using "parpool" with more than N workers on Linux?

I am using a large Linux machine and I tried to create a parallel pool via the following MATLAB command:

>> parpool(N)
However, for N around 15 or larger (where N is the pool size or number of workers), this command may hang or fail.

 Accepted Answer

This issue may occur because the Linux system limit on the number of processes and threads allowed per user is too low to support the number of MATLAB workers being requested. In such cases, MATLAB might hang, as you have observed.
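To judge whether this limit is a plausible culprit, you can compare the per-user limit with the number of threads you are already running. The following is a minimal sketch, assuming a bash shell and the procps version of `ps` found on most Linux distributions; the variable names are illustrative only.

```shell
# Soft and hard per-user limits on processes (on Linux, threads count too).
# Either value may be the word "unlimited".
soft_limit=$(ulimit -S -u)
hard_limit=$(ulimit -H -u)

# Threads currently owned by this user: ps -L prints one line per thread,
# and "-o pid=" suppresses the header line.
threads_in_use=$(ps -L -u "$(id -un)" -o pid= | wc -l)

echo "soft limit:     $soft_limit"
echo "hard limit:     $hard_limit"
echo "threads in use: $threads_in_use"
```

If the thread count is already close to the soft limit before the pool starts, a large pool is unlikely to launch.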

Local machine:

From R2021a onwards, the behaviour in these situations has improved: where possible, Parallel Computing Toolbox tries to increase the user's soft process limit, and it warns if it detects that the process limit is too low to support the number of workers available on the machine.
The following documentation page provides
which should work in most circumstances.
If a large constant is unsatisfactory, iteratively increasing the hard process limit until a parallel pool of the desired size can be launched may help. In some environments, e.g. Docker containers or certain virtual machines, process limits may be imposed by the virtualisation or container software; in that case, editing the ulimits within the environment will not be sufficient to launch large numbers of MATLAB workers.
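Within a single shell session, an unprivileged user can raise the soft limit only as far as the hard limit; raising the hard limit itself needs administrator action, and caps imposed by a container sit outside both. Here is a minimal sketch, assuming a bash shell. The optional container check assumes a cgroup v2 layout, and the path `/sys/fs/cgroup/pids.max` is an assumption that may not exist on your system.

```shell
# Raise this shell's soft process limit to the current hard limit.
# This affects only this shell and the programs started from it.
hard_limit=$(ulimit -H -u)
ulimit -S -u "$hard_limit"
echo "soft limit is now: $(ulimit -S -u)"

# Optional: look for a container/cgroup cap on tasks (cgroup v2 assumed).
# A value of "max" means no cap is set at this level.
pids_max_file=/sys/fs/cgroup/pids.max
if [ -r "$pids_max_file" ]; then
    echo "cgroup pids.max: $(cat "$pids_max_file")"
else
    echo "no readable $pids_max_file; cgroup cap not checked"
fi
```

MATLAB must be started from the same shell to inherit the raised soft limit.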

MATLAB Job Scheduler Clusters:

For MJS clusters, the suggested guidance is to set the process limit to 128*W, where W is the number of workers on each machine. This is a conservative estimate; you may be able to set the limit lower, depending on the code you expect the MATLAB workers to execute.
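As a worked example of the 128*W guidance, the sketch below computes the suggested limit for one machine. The worker count of 16 is a hypothetical value chosen for illustration, not a recommendation.

```shell
# Suggested per-user process limit for an MJS machine: 128 * W.
workers_per_machine=16   # hypothetical value of W, for illustration only
suggested_limit=$((128 * workers_per_machine))
echo "suggested process limit: $suggested_limit"
```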

Release

R2021a
