Parallelization of SVD on research clusters
Hello MATLAB Community,
Currently, I am trying to perform Singular Value Decomposition of large datasets in MATLAB using the svd() command. However, I run into memory problems when forming and storing the matrices, as the datasets are of significant size (full flow fields of CFD simulations).
Luckily, I do have access to a research cluster with multiple nodes (~200GB memory each, or ~500GB in the case of hi-mem nodes). I'd like to use 2 (or more in the future) nodes (i.e. 2x40 processors) to perform the SVD.
I am not certain how to parallelize the SVD operation so that the workload is distributed over 80 workers, using all the memory of both nodes if needed. It is also important that the solution scales, so that in the future I can increase the number of nodes for bigger problems.
Can anyone in the community help me achieve this goal? Are there any resources (I couldn't find any so far) from MathWorks on how to do this?
Many thanks in advance,
Raymond Norris on 2 Apr 2022
Building off of Kamil's suggestion to look at Edric's post: Edric suggests you build the distributed array directly on the workers. In his example, he shows
D = rand(1000, 'distributed');
If, however, you're not using rand (or another helper function such as ones, zeros, etc.), then you'll want to create the distributed array with the codistributed constructor, built from the local (variant) parts on each worker. The advantage of codistributed arrays is that you can devise your own scheme for how the distributed array is partitioned.
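As a minimal sketch of that idea (assuming a local Parallel Computing Toolbox pool; the data values here are made up purely for illustration), each worker can build its own column of a codistributed matrix rather than the client pushing data out:

```matlab
% Minimal codistributed.build example on a local 2-worker pool
parpool('local', 2);
spmd
    % Partition a 4-by-numlabs matrix by columns: one column per worker
    codistr = codistributor1d(2, codistributor1d.unsetPartition, [4, numlabs]);
    % Worker-specific local data (here just the worker index, for illustration)
    localCol = labindex * ones(4, 1);
    % Stitch the local columns into one codistributed matrix
    M = codistributed.build(localCol, codistr);
end
% Outside spmd, M is accessible on the client as a distributed array
disp(gather(M));
```

The key point is that each worker only ever holds its own column, so the full matrix never needs to fit in one node's memory.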
Sounds like you have 2 nodes (for now), each with 40 cores. Something to consider is the performance of 80 workers vs. just 2 workers, with one worker per node. In most schedulers, you can tailor the job submission. For example
% Create your pool of workers
cluster = parcluster('myScheduler');
cluster.AdditionalProperties.ProcPerNode = 1;
cluster.AdditionalProperties.ExclusiveNode = true;
pool = cluster.parpool(2);
The AdditionalProperties shown above are a bit of pseudo code and would need to be added and coded in your cluster object. For information on adding properties, contact Technical Support (firstname.lastname@example.org). For the sake of discussion, I'm also assuming you're not using MJS; otherwise, getting workers to run on two distinct nodes would be a bit different.
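As a hedged illustration of what the real thing might look like: property names are entirely cluster-specific, but with the MathWorks Slurm plugin scripts raw scheduler flags can often be passed through an AdditionalSubmitArgs-style property (check the plugin scripts for your cluster; both the profile name and the property name below are assumptions):

```matlab
% Illustrative only: 'myScheduler' and the property name depend on the
% plugin scripts installed for your cluster.
cluster = parcluster('myScheduler');
% Pass raw Slurm flags: one MATLAB worker per node, exclusive node access
cluster.AdditionalProperties.AdditionalSubmitArgs = ...
    '--ntasks-per-node=1 --exclusive';
pool = cluster.parpool(2);
```

With one worker per node, each worker can use a full node's memory (and multithreaded BLAS within the node), which is often the better trade-off for memory-bound problems like this one.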
Next, build up the local parts of the distributed array and then calculate the SVD. In this case, I am using rand to generate the data, but this might be data read from images, etc.
% Build the scheme of distributed array A
N = 1000;
spmd
    % Each worker contributes one column vector (overall N^2-by-numlabs,
    % i.e. N^2 x 2 with the two-worker pool above)
    workers = numlabs;
    globalSize = [N^2, workers];
    codistr = codistributor1d(2, codistributor1d.unsetPartition, globalSize);
    % Create the local part. Using rand, but this might be data read from
    % an image file.
    local_a = rand(N);
    % Reshape to be a column vector
    local_a = local_a(:);
    % Stitch local parts together to create distributed array A
    A = codistributed.build(local_a, codistr);
end
% Calculate the economy-size SVD ('econ' avoids forming the full
% N^2-by-N^2 U for this tall, skinny matrix)
[U, S, V] = svd(A, 'econ');
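Once the SVD has finished, the singular values themselves are tiny compared to the data, so they can be gathered back to the client cheaply. A sketch of a common post-processing step (assuming the pool and variables above; the energy diagnostic is a typical POD-style check for CFD snapshot data, not something from the original post):

```matlab
% Pull just the singular values back to the client
sigma = gather(diag(S));
% Fraction of "energy" captured by the leading modes -- a common
% diagnostic when the columns of A are CFD flow-field snapshots
energy = cumsum(sigma.^2) / sum(sigma.^2);
fprintf('Leading mode captures %.1f%% of the energy\n', 100 * energy(1));
```

Keep U distributed on the workers; gathering it to the client would defeat the purpose of distributing the problem in the first place.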