
The parallel pool shut down because the client lost connection to worker

Related to the question asked in https://it.mathworks.com/matlabcentral/answers/2058094-aws-matlab-parallel-server-what-is-the-best-strategy I followed that strategy as a PoC (28 simulations, 28 of 32 workers, 2 machines with 16 workers each), but I encountered the following error:
"The parallel pool shut down because the client lost connection to worker 21. Check the network connection or restart the parallel pool with 'parpool'"
I don't understand why a network connection is involved, since MATLAB Simulink has been running on AWS Cloud.
Also, how can I continue execution on the remaining parallel workers even if an error occurs?
Thank you very much
  8 Comments
Damian Pietrus on 3 Jan 2024
Before we try accessing those logs, I have one more thing for you to try, since the behavior seems similar to an issue I ran into with another user. After starting up your MATLAB client, and before running any jobs, run the following command:
setenv('MW_PCT_TRANSPORT_HEARTBEAT_INTERVAL', '600')
In this case, we are setting a communication timeout in the cluster to 600 seconds (10 minutes). Try running your code again and let me know how things go. If it's successful, I'll pass it along to our development team.
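As a rough sketch of the order of operations described in this comment, the snippet below sets the variable on the client, reads it back, and closes any pool that is already open, so that the next pool is created after the variable is set. The variable name and the value '600' come from the comment; the restart step and the profile name 'myAwsCluster' are assumptions for illustration, since the thread does not say whether a running pool picks up the change.

```matlab
% Sketch only: set the heartbeat interval on the client before any pool exists.
% Variable name and value are taken from the comment above; the rest is assumed.
heartbeatVar = 'MW_PCT_TRANSPORT_HEARTBEAT_INTERVAL';
setenv(heartbeatVar, '600');   % the value is passed as text

% Read the value back to confirm the client process sees it.
heartbeatValue = getenv(heartbeatVar);
fprintf('%s = %s\n', heartbeatVar, heartbeatValue);

% Assumption: close any pool started earlier, so the next pool is created
% after the variable was set. gcp('nocreate') returns empty if no pool is open.
existingPool = gcp('nocreate');
if ~isempty(existingPool)
    delete(existingPool);
end

% Then start a new pool, for example:
%   parpool('myAwsCluster', 28);   % 'myAwsCluster' is a hypothetical profile name
```

Because setenv only affects the current MATLAB session, the command would need to be repeated after each client restart, for instance from a startup.m file.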
Torben Ellegaard Lund on 28 Jun 2024
Edited: Torben Ellegaard Lund on 28 Jun 2024
I had a similar problem on my MacBook Air M2 running Sonoma 14.4.1 and MATLAB R2024a (24.1.0.2537033), where a parfor loop kept crashing. After using
setenv('MW_PCT_TRANSPORT_HEARTBEAT_INTERVAL', '600')
some improvement was seen, but it still kept crashing (although after a longer time). I then used the following:
setenv('MW_PCT_TRANSPORT_HEARTBEAT_INTERVAL', '6000')
instead, and I have not experienced any crashes since. The crash used to happen both when storing data on a local SSD hard drive and on an external SSH hard drive connected via USB-C.


Answers (0)
