Using the Parallel Computing Toolbox for simultaneous evaluation of multiple Datasets

Hello,
I am very new to the parallel toolbox and don't know how to run my code with it efficiently. I'll explain my situation.
I want to evaluate different results of image registration on different datasets, which are all stored in a common folder.
Right now I simply loop through the folder, and apply my function "Evaluation(x,y)" to every file in the folder serially, and save the result into a structure. In the end I add all these structures to one big structure since they have the same fields.
Inside my Evaluation function I call several functions that calculate different things, some of which are independent of one another.
See the pseudocode:
for folder in listing...
    %loop through folder and get the paths for the evaluation function
    load(folder)...
    path_fixed = [input_folder_dir_(8).folder,'\', input_folder_dir_(8).name];
    res = Evaluation(path_fixed, path_moving, path_registerd, path_displacement, LMs);
    res.name = listing(i).name;
    struc_name = listing(i).name;
    results.(struc_name) = res;
end
results_overview = [results.strucnameA, results.strucnameB,...]
function res = Evaluation(a,b,c,d)
    res.Parameter_a = evaluate_metric1(a,b);
    res.Parameter_b = evaluate_metric2(b,c,d);
    .
    .
    .
end
As far as I understand parallel computing, I have two options:
I can either use a parfor loop to loop through the folder, so that 2-4 datasets are evaluated at the same time. However, I don't know if that works, since the workers would then all write to the same variable 'res', right?
Otherwise I could use spmd blocks in the evaluation function itself to calculate multiple parameters at the same time.
Which would be the correct way?
Does doubling the number of workers also double the RAM needed?
Thanks and regards
Michael

Accepted Answer

Thomas Falch on 2 Jul 2021
Hi Michael,
Using a parfor loop would be a good solution to this problem. Local variables like "res" and "path_fixed" inside a parfor loop are not a problem: each worker has its own independent copy. The struct "results", on the other hand, is a problem. It would, in a way, be shared between the workers, and a parfor loop cannot write its results into the fields of a shared struct. It can, however, write to an array indexed by the loop variable, and in most cases it is easy to rewrite the code to do so. In your case it would be something like:
parfor i = 1:100
    res = i; % Actual computation here
    name = sprintf("name_%d", i); % Actual name here
    s = struct("Name", name, "Value", res);
    a(i) = s; % Each iteration writes only its own element of a
end
[a.Name]
ans = 1×100 string array
"name_1" "name_2" "name_3" "name_4" "name_5" "name_6" "name_7" "name_8" "name_9" "name_10" "name_11" "name_12" "name_13" "name_14" "name_15" "name_16" "name_17" "name_18" "name_19" "name_20" "name_21" "name_22" "name_23" "name_24" "name_25" "name_26" "name_27" "name_28" "name_29" "name_30" "name_31" "name_32" "name_33" "name_34" "name_35" "name_36" "name_37" "name_38" "name_39" "name_40" "name_41" "name_42" "name_43" "name_44" "name_45" "name_46" "name_47" "name_48" "name_49" "name_50" "name_51" "name_52" "name_53" "name_54" "name_55" "name_56" "name_57" "name_58" "name_59" "name_60" "name_61" "name_62" "name_63" "name_64" "name_65" "name_66" "name_67" "name_68" "name_69" "name_70" "name_71" "name_72" "name_73" "name_74" "name_75" "name_76" "name_77" "name_78" "name_79" "name_80" "name_81" "name_82" "name_83" "name_84" "name_85" "name_86" "name_87" "name_88" "name_89" "name_90" "name_91" "name_92" "name_93" "name_94" "name_95" "name_96" "name_97" "name_98" "name_99" "name_100"
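Applied to the folder loop from the question, the same pattern might look like the sketch below. The dataset names and the two metric expressions are hypothetical placeholders (the real code would take the names from the folder listing and call Evaluation with the paths for each dataset); only the indexing pattern is the point.

```matlab
% Hypothetical stand-in for the folder listing from the question.
names = ["datasetA", "datasetB", "datasetC"];
n = numel(names);

parfor i = 1:n
    % Placeholder computations; the real code would call Evaluation(...) here.
    metricA = 2 * i;
    metricB = i + 0.5;

    % res is created fresh in every iteration, so each worker has its own copy.
    res = struct("name", names(i), ...
                 "Parameter_a", metricA, ...
                 "Parameter_b", metricB);

    % Indexing by the loop variable lets parfor collect the results.
    results_overview(i) = res;
end

% All results end up in one struct array, one element per dataset.
disp([results_overview.name])
```

Because every element has the same fields, results_overview already plays the role of the "one big structure" built at the end of the serial version, so no concatenation step is needed afterwards.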
As for the memory, each worker is really just a regular MATLAB session running in the background, so doubling the number of workers will approximately double the total memory usage.
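If memory is the limiting factor, one option is to cap the number of workers a loop may use. The sketch below is only a rough illustration: the per-worker and available memory figures are made-up assumptions, not measurements, and the loop body is a placeholder.

```matlab
% Assumed figures for illustration only; measure these for the real workload.
memPerWorkerGB = 3;    % hypothetical peak memory of one dataset evaluation
availableGB    = 16;   % hypothetical memory that may be used in total

% Total usage is roughly (number of workers) x (memory per worker).
maxWorkers = max(1, floor(availableGB / memPerWorkerGB));

% The second argument limits this loop to at most maxWorkers workers.
parfor (i = 1:8, maxWorkers)
    out(i) = i^2;      % placeholder for the real computation
end
```

With these assumed numbers the loop would use at most five workers, even if the pool has more available.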
  1 Comment
Michael Werthmann on 5 Jul 2021
Hi Thomas,
thank you very much for your kind answer and your insights. In the meantime, I had the chance to try different possibilities and came to the conclusion that the easiest way to speed up my program was to parallelize a bottleneck function inside the "Evaluation" function. In addition, I split my data into parts and processed each part in its own batch job. I got a huge performance boost from that.
I tried using parfor as you suggested, but I got exactly the errors with shared structs that you mentioned.
My thesis is due in a few weeks, so I won't have the time to check your solution myself, but thanks anyway :)
Kind Regards
Michael
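For readers who want to run independent computations inside a function like Evaluation at the same time, one possible approach (not necessarily the one used in this thread) is parfeval, which starts a function asynchronously and returns a future. In the sketch below, the inputs and the two anonymous metric functions are hypothetical placeholders for evaluate_metric1 and evaluate_metric2.

```matlab
% Placeholder inputs; the real code would load the image data here.
a = magic(4);
b = a';

% Start both independent computations without waiting for either one.
f1 = parfeval(@(x, y) sum(abs(x(:) - y(:))), 1, a, b);  % stand-in for metric 1
f2 = parfeval(@(x) max(x(:)), 1, a);                    % stand-in for metric 2

% fetchOutputs blocks until the corresponding computation has finished.
res.Parameter_a = fetchOutputs(f1);
res.Parameter_b = fetchOutputs(f2);
```

This only pays off when the individual computations are expensive compared with the cost of sending their inputs to the workers.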

Release: R2018a