Nested spmd statements

Hello,
I am trying to achieve a high degree of parallelism in my MATLAB implementation of a support vector machine. Conceptually, parallelizing a multiclass support vector machine (SVM) is not very hard: a multiclass SVM is composed of a set of binary SVMs that can be trained in parallel. Moreover, if I want to validate my model using an n-fold cross-validation, I have a second point in the code at which I can achieve parallelism.
I was thinking of parallelizing my code with nested spmd statements (an outer spmd at the level of the n-fold cross-validation, an inner spmd at the level of the training of the multiclass SVM). If the inner spmd statement is in a function that is called by the outer spmd, then the inner spmd always sees 0 workers available and the code crashes.
Is this normal or am I doing something wrong? Is there any alternative way to achieve parallelism on multiple levels? (parfor doesn't do that.)
Thanks for the answers!
Marco

Answers (3)

Walter Roberson on 25 Mar 2011

What you see is normal: nested SPMD blocks that are visible to the program only allocate workers at the outer level. One of the MathWorks people who works on the parallel programming facilities has posted that what you can do is have the outer SPMD block call a function, and inside that function have SPMD blocks: the inner ones would then get their own workers.
Marco on 26 Mar 2011

Thank you for the prompt reply Walter!
The way you described is the way I do it. Unfortunately, when I call the function
s = matlabpool('size');
in the function called by the outer SPMD block, s = 0 always. Therefore the inner SPMD doesn't have any workers to run on and everything crashes. This happens even if I open a pool of 8 workers at the beginning of the program and the first SPMD uses only 2, for example (there should be 6 free, right?).
Thanks!
Marco
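The situation described above can be sketched roughly as follows. This is only an illustration, assuming a pool of 8 workers as in the post and the matlabpool syntax of that release; the variable names nOuter and sInner are made up for the example.

```matlab
matlabpool open 8                  % pool size taken from the post; adjust to your license

spmd (2)                           % the outer block runs on 2 of the 8 workers
    nOuter = numlabs;              % number of workers executing this spmd block
    sInner = matlabpool('size');   % evaluated on a worker; reported as 0 in this thread
end

matlabpool close
```

Inside the block, numlabs and labindex describe the workers running that block; the pool itself belongs to the client session, not to the workers.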
Jiro Doke on 26 Mar 2011
Nesting spmd or parfor does not work. At least for parfor, you can put it in a function as you are doing, but the inner parfor will simply behave like a regular for loop since the outer loop will use up all the workers you open. EDIT: spmd also works in an inner function, but it doesn't use any additional workers.
I believe the only way you can achieve nested parallelism is to use either matlabpooljob or batch. These allow you to create multiple jobs that can each have their own matlabpool workers. So your "outer loop" (k-fold cross-validation) can be the multiple matlabpooljobs or batch jobs, each with its desired number of workers specified. Note that batch will ultimately require one extra worker in addition to the number of workers you request (because it requires a virtual client worker). Also, the units of the outer parallelism are independent jobs, so they don't talk to each other. If you need interactions between them, I think it will be extremely non-trivial. In your case, I would assume the k-fold cross-validation can be set up as k independent validation jobs, so that shouldn't be a problem.
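As a rough sketch of the batch route, assuming the function-handle form of batch and its 'matlabpool' option as available in releases of that time (newer releases name the option 'Pool' and retrieve results with fetchOutputs): trainingFcn and the inputs in1a, in2a, in1b, in2b are the same placeholders used in the matlabpooljob example in this answer, and foldInputs is a name introduced only for this illustration.

```matlab
% Placeholder inputs for two folds (in1a, in2a, in1b, in2b must already exist)
foldInputs = {{in1a, in2a}, {in1b, in2b}};
nJobs = numel(foldInputs);

for id = 1:nJobs
    % Request 3 pool workers per job; batch needs 1 extra worker as the
    % client, so each job occupies 4 workers (8 in total for 2 jobs)
    job(id) = batch(@trainingFcn, 1, foldInputs{id}, 'matlabpool', 3);
end

for id = 1:nJobs
    waitForState(job(id), 'finished');
    results{id} = getAllOutputArguments(job(id));
end

destroy(job);
```

The worker counts are chosen so that two jobs fit into a license for 8 workers; they are not a requirement of batch.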
Here's an example of how you would use matlabpooljob:
% Locate the scheduler / job manager
jm = findResource();

% First job: a MATLAB pool job that requests exactly 4 workers
job(1) = createMatlabPoolJob(jm);
job(1).MaximumNumberOfWorkers = 4;
job(1).MinimumNumberOfWorkers = 4;
createTask(job(1), @trainingFcn, 1, {in1a, in2a});

% Second job: same setup with a different set of inputs
job(2) = createMatlabPoolJob(jm);
job(2).MaximumNumberOfWorkers = 4;
job(2).MinimumNumberOfWorkers = 4;
createTask(job(2), @trainingFcn, 1, {in1b, in2b});

% Submit the 2 jobs
for id = 1:2
    submit(job(id))
end

% Wait for each job to finish and collect its output
for id = 1:2
    waitForState(job(id), 'finished');
    results{id} = getAllOutputArguments(job(id));
end

% Clean up the job objects
destroy(job);
The above example illustrates how you might create two jobs where each job requests 4 workers (for a total of 8 workers). Inside your function "trainingFcn" above, you can have spmd and parfor.
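To illustrate what the inside of "trainingFcn" might look like, here is a minimal sketch with a parfor over the classes of a one-vs-rest multiclass model. Everything in it is an assumption made for illustration: the argument names X and labels, the numeric label format, and in particular trainBinaryPlaceholder, which is a least-squares stand-in and not an SVM trainer. Replace it with your own binary SVM training routine.

```matlab
function model = trainingFcn(X, labels)
% Hypothetical body for trainingFcn: one binary one-vs-rest model per class.
% Assumes X is an n-by-d numeric matrix and labels an n-by-1 numeric vector.
classes  = unique(labels);
nClasses = numel(classes);
model    = cell(nClasses, 1);

% Each iteration is independent, so parfor can spread the binary problems
% over the workers of the job's pool when a pool is available.
parfor c = 1:nClasses
    y = 2 * double(labels == classes(c)) - 1;   % +1 for class c, -1 otherwise
    model{c} = trainBinaryPlaceholder(X, y);
end
end

function w = trainBinaryPlaceholder(X, y)
% Stand-in for a real binary SVM trainer (NOT an SVM): least-squares linear fit.
w = [X, ones(size(X, 1), 1)] \ y;
end
```

The function takes two inputs and returns one output, matching the createTask(job, @trainingFcn, 1, {in1a, in2a}) call above.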

5 Comments

Jiro, the cure for the problem of not having any workers left is to use a different pool in the called function. You are still restricted in the _total_ number of workers you have, according to the license you have purchased, so do not have your outer pool use all of the licensed workers.
Walter, I don't understand what you are saying. Are you agreeing with my answer, disagreeing, or something else? Yes, the total number of workers is limited by the license you have. In my example, I assume I have access to 8, and I'm using a total of 8 (two sets of 4).
You *cannot* start a pool of workers on a worker. So opening a pool (matlabpool open) inside a function being run on a worker would not work. That's why I'm using a low-level usage of creating jobs, which themselves can use a matlabpool.
The other thing is that you can only have one matlabpool open at a time. So opening a pool inside a function would not work if there is one already open.
This is confusing in view of
http://www.mathworks.com/help/toolbox/distcomp/brukbnp-9.html#brukbnp-12
"The body of an spmd statement cannot contain another spmd. However, it can call a function that contains another spmd statement. Be sure that your MATLAB pool has enough workers to accommodate such expansion."
So if I have this program, which uses spmd and drange over a distributed range (the size of array a):
spmd
    y = codistributed(rand(3,20), codistributor1d(2));
    a = codistributed(rand(1,20), codistributor1d(2));
    b = codistributed(rand(3,20), codistributor1d(2));
    for i = drange(1:20)
        y(1,i) = user_defined_function(a(i), b(1,i));
        y(2,i) = user_defined_function(a(i), b(1,i));
        y(3,i) = user_defined_function(a(i), b(1,i));
    end
end
The user_defined_function contains 3 nested for loops and takes 75 seconds, and I want to speed it up by using another spmd.
function y=user_defined_function(a,b)
[ep,u,lam] = ndgrid(1e-3:1e-2:1, 1e-3:1e-2:1, 1e-3:1e-2:1);
for i = 1:size(ep,3)
    for j = 1:size(ep,2)
        for p = 1:size(ep,1)
            l1(p,j,i) = ep(p,j,i) + u(p,j,i)*a + sum(lam(p,j,i)*exp(-b));
        end
    end
end
From the documentation, I know I can use spmd inside another function. My question is: if I call the function 3 times inside the for-drange loop, will it affect the execution time of the function (from 75 seconds to 4 seconds)?


Asked: on 25 Mar 2011

Commented: on 27 Sep 2020