How to stop all workers simultaneously when an error occurs in one of the workers?

Hi guys
I am working with a parpool of n workers. It is likely that one of the workers will throw an error at some point, so I would like to catch the error along these lines:
parfor i = 1:length(Data)
    try
        Simulation(i);
    catch ME
        % Pseudocode for what I would like to happen here:
        %   stop all workers (not the parpool: the workers should stop
        %   simulating, but I do not want them to be closed)
        %   change something in Simulation(i)
        %   start the workers on Simulation(i) again
        continue;
    end
end
After that I would make some changes and start the workers again.
Could you please let me know how to handle this?
Regards,
Vahid

Answers (2)

Edric Ellis on 27 Jul 2015
You can do this using parfeval to send off individual tasks for execution on the workers, and then you can call cancel() on those tasks if you spot an error. Something like this:
% Initiate the work on the workers:
for i = 1:length(Data)
    f(i) = parfeval(@Simulation, 1, i);
end
% Check the results, cancel all execution if an error is spotted
completedSuccessfully = true;
for i = 1:length(f)
    try
        [idx, result] = fetchNext(f);
    catch E
        % Get here if a simulation threw an error
        cancel(f);
        completedSuccessfully = false;
        break;
    end
end
if ~completedSuccessfully
    % do stuff...
end
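To cover the "change something and start again" part of the question, the same pattern can be wrapped in a retry loop. The following is only a sketch under stated assumptions: `runWithRetry`, `simulationStub`, and the per-task `tol` setting are hypothetical names invented for illustration (the stub stands in for `Simulation` and fails for one input until `tol` is relaxed). It assumes a pool is open or can be created automatically, and that the code is saved as a function file so the local function is available to the workers.

```matlab
function results = runWithRetry(nTasks)
% Sketch: cancel all outstanding work on the first failure, adjust a
% setting for the failed task, then resubmit whatever is not done yet.
    tol      = 1e-6 * ones(1, nTasks);  % hypothetical per-task setting
    results  = cell(1, nTasks);
    done     = false(1, nTasks);
    maxRounds = 5;                      % give up after a few attempts

    for attempt = 1:maxRounds
        todo = find(~done);
        if isempty(todo)
            break;
        end
        % Submit one future per unfinished task.
        f(1:numel(todo)) = parallel.FevalFuture;
        for k = 1:numel(todo)
            f(k) = parfeval(@simulationStub, 1, todo(k), tol(todo(k)));
        end
        for k = 1:numel(todo)
            try
                [idx, value] = fetchNext(f);
                results{todo(idx)} = value;
                done(todo(idx)) = true;
            catch
                % Note which futures failed BEFORE cancelling, because
                % cancelled futures also end up with a non-empty Error.
                failed = false(1, numel(f));
                for m = 1:numel(f)
                    failed(m) = ~isempty(f(m).Error);
                end
                cancel(f);  % stops the tasks; the pool stays open
                % "Change something": relax the setting for failed tasks.
                tol(todo(failed)) = tol(todo(failed)) * 1e4;
                break;
            end
        end
        clear f
    end
    if ~all(done)
        warning('runWithRetry:incomplete', 'Some tasks never completed.');
    end
end

function out = simulationStub(i, tol)
% Hypothetical stand-in for Simulation: task 3 fails while tol is tight.
    if i == 3 && tol < 1e-3
        error('Demo:failed', 'Simulation %d failed with tol %g.', i, tol);
    end
    out = i^2;
end
```

Called as `results = runWithRetry(4);`, this reruns only the tasks whose results were not collected before the failure. For simplicity, results that had finished but were not yet fetched when `cancel` was called are recomputed rather than harvested.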

Walter Roberson on 24 Jul 2015
You can cancel() task objects. I think at one point I saw a way to determine all of the task IDs, but that is not something I have researched.
  2 Comments
Vahid Ghorbanian on 25 Jul 2015
Walter
Thank you for the response. How can I create a task object? I do not know whether each worker needs its own object or not. Does the object have to be created inside the parfor or outside of it? Could you please send some sample code that does what I need?
Walter Roberson on 25 Jul 2015
For example,
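% j is a job object, e.g. j = createJob(parcluster); the 1 is the number of outputs per task
createTask(j, @Simulation, 1, arrayfun(@(k) {k}, 1:length(Data), 'UniformOutput', false))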
and once you have found an error and want to restart, perhaps use recreate(j)
At the moment I do not see a way to access the results of one task other than to know which state it is in. I have not used these facilities so I am likely overlooking something.
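For reference, here is a rough sketch of how the job/task route might look end to end. It is an illustration under assumptions, not a tested recipe for the original problem: `@sqrt` stands in for `Simulation`, the default cluster profile is assumed to have free workers (an independent job runs separately from an open parpool, so a pool that occupies all local workers would leave the job queued), and the one-second polling interval is arbitrary. As far as I know, each task object exposes `Error` and `OutputArguments` properties, which is what the loop relies on.

```matlab
c = parcluster();                  % default cluster profile
j = createJob(c);
inputs = {{4}, {9}, {16}};         % one inner cell of arguments per task
createTask(j, @sqrt, 1, inputs);   % @sqrt is a stand-in for Simulation
submit(j);

% Poll the job; stop early if any task reports an error.
finished = false;
failed   = false;
while ~finished && ~failed
    finished = wait(j, 'finished', 1);   % returns false on timeout
    for k = 1:numel(j.Tasks)
        failed = failed || ~isempty(j.Tasks(k).Error);
    end
end

if failed
    if ~finished
        cancel(j);                 % stop the remaining tasks
    end
    % Change whatever needs changing here, then something like
    % j2 = recreate(j); submit(j2); could rerun the same tasks.
    outputs = {};
else
    outputs = fetchOutputs(j);     % numTasks-by-1 cell array
end
delete(j);                         % remove the job data from the cluster
```

Individual results can also be read from `j.Tasks(k).OutputArguments` once a task has finished, which avoids `fetchOutputs` throwing when only some of the tasks failed.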
