Is opening multiple Matlab engines slow?
I am running C++ code on a Linux server with 48 CPU's. In C++, I create up to 45 threads in a for loop using "#pragma omp parallel for". Each of the threads creates a pointer to a Matlab engine, passes in some variables, runs a Matlab function, frees up the memory, and closes the engine.
If I run the snippet of code with a single thread, it takes <20 seconds to run. With 45 threads (ideally running on 45 separate CPU's), I would expect it to also take ~20 seconds to run. Instead, it takes ~10 minutes to run.
As I watch the gnome-system-monitor, I see bursts of activity when all cores are running at full throttle. (I've set up print statements to show that it is actually getting work done in Matlab as well.) But in between these bursts of activity, there are long stretches of time (i.e. minutes) where all CPU's report 0% activity.
In the C++ code below, I've set up print statements to see where the threads are at when they report 0% activity. Sometimes it happens during engOpen(). More often, it happens near engClose(). When the CPU's appear to hang, the first one might complete engOpen() after a few seconds whereas the last one might take several minutes. For engClose(), the first one might complete after a few seconds to several minutes. The last one can take up to 10 minutes. Throughout these "down times", nearly all the CPU's report 0% activity.
Why is it taking so long? Is there something inherently bad with opening up multiple Matlab engines?
A few other relevant notes:
- I open up multiple engines because the variable names used in the function are the same for each thread. If each thread used the same Matlab engine, their variables would conflict.
- The Matlab function is simple. It checks to see if a file exists. If it does, it opens it and checks to see if the data dimensions are correct. Then it returns.
- I've watched the RAM during this process. It's low (around 10% full) so I don't suspect that is a problem.
- I've attempted to force Matlab to use a single thread (since I multithread in C++). I've tried doing this two different ways:
- I call engOpen("") with no input parameters and then put this line in the Matlab function: "maxNumCompThreads(1)"
- I call engOpen("matlab -nodesktop -nosplash -singleCompThread") and comment out the "maxNumCompThreads(1)" command in the Matlab function.
When I do this, I watch "top". I see the program spawn 45 Matlab programs...but interestingly, the "nTh" (numThreads) for each of these Matlab programs reports anywhere from a few to dozens (e.g. 5 - 60). It makes me wonder if those commands are working as intended.
Here's the C++ code snippet:
#pragma omp parallel for
for (int m = 0; m < numSimEdges; m++) {
    Engine *ep; // matlab engine pointer
    const char *startcmd = "matlab -nodesktop -nosplash -singleCompThread";
    if (!(ep = engOpen(startcmd))) {
        std::cout << "ERROR! Can't start MATLAB engine!" << std::endl;
        continue; // ep is NULL here, so skip the engine calls below
    }
    // create matlab variables
    mxArray *matlab_videoDir1 = mxCreateString(video_dir1.c_str());
    mxArray *matlab_videoDir2 = mxCreateString(video_dir2.c_str());
    mxArray *matlab_numSampledFrames = mxCreateDoubleScalar((double) numSampledFrames);
    mxArray *matlab_top_k = mxCreateDoubleScalar((double) top_k);
    mxArray *matlab_n_orientations = mxCreateDoubleScalar((double) n_orientations);
    // place the variables into the matlab workspace
    engPutVariable(ep, "videoDir1", matlab_videoDir1);
    engPutVariable(ep, "videoDir2", matlab_videoDir2);
    engPutVariable(ep, "numSampledFrames", matlab_numSampledFrames);
    engPutVariable(ep, "top_k", matlab_top_k);
    engPutVariable(ep, "n_orientations", matlab_n_orientations);
    // evaluate a function
    engEvalString(ep, "computed = check_similarity_between_videos(videoDir1, videoDir2, numSampledFrames, top_k, n_orientations)");
    // get the computed variable
    mxArray *matlab_computed = NULL;
    if ((matlab_computed = engGetVariable(ep, "computed")) == NULL) {
        std::cout << "Variable 'computed' doesn't exist in matlab session." << std::endl;
        alreadyComputed = 0;
    } else {
        alreadyComputed = mxGetScalar(matlab_computed);
    }
    // Free memory
    mxDestroyArray(matlab_videoDir1);
    mxDestroyArray(matlab_videoDir2);
    mxDestroyArray(matlab_numSampledFrames);
    mxDestroyArray(matlab_top_k);
    mxDestroyArray(matlab_n_orientations);
    mxDestroyArray(matlab_computed);
    // close MATLAB engine
    //engClose(ep);
}
Here's the Matlab function:
function computed = check_similarity_between_videos(videoDir1, videoDir2, numSampledFrames, top_k, n_orientations)
    % this sets the # of threads to 1 since we do the multithreading in c++
    % maxNumCompThreads(1);
    computed = 1;
    [~, video1] = fileparts(videoDir1);
    [root_dir, video2] = fileparts(videoDir2);
    simi_dir = [root_dir, '/similarity'];
    simi_file = [simi_dir, '/', video1, '__', video2, '.mat'];
    % check if file exists
    if ~exist(simi_file, 'file')
        computed = 0;
        return;
    end
    % load precomputed similarity
    load(simi_file, 'simi');
    % extract the dimensions
    [numOrientationsClip1, numProposalsClip2, numProposalsClip1, numFramesClip2, numFramesClip1] = size(simi);
    % check if dimensions are correct
    if numFramesClip1~=numSampledFrames || numFramesClip2~=numSampledFrames || ...
            numProposalsClip1 < top_k || numProposalsClip2 < top_k || ...
            numOrientationsClip1<n_orientations
        computed = 0;
    end
end
5 Comments
Saurabh Gupta
on 1 Aug 2017
Hi Jared, I don't have an answer for your question as such, but a counter-question. Is there a specific reason for performing this task using multiple single-threaded MATLAB instances instead of using a parfor loop in one multi-threaded MATLAB instance?
Jared Johansen
on 3 Aug 2017
Saurabh Gupta
on 4 Aug 2017
I see your point. It may be worth refactoring the code to make use of the parallelization implemented in MATLAB to run MATLAB functionality.
If you really want to control the execution from C/C++, one option you could explore is generating C/C++ code from MATLAB code using MATLAB Coder product, and then calling those functions directly. This will eliminate the need to invoke MATLAB in your use case.
Another alternative you could consider is running the MATLAB scripts/functions in "batch mode" instead of executing them after invoking individual MATLAB instances. The following posts may be helpful in this regard.
Mark Matusevich
on 7 Aug 2017
I don't have experience with the MATLAB engine, but I see similarities between your question and my own experience with the MATLAB Compiler of R2009b.
The MATLAB Compiler Runtime allows only 1 instance per process. Each call from C code to the MCR (i.e. a call to my MATLAB function, mxCreateString, mxGetScalar, etc.) locks this instance and executes the command. Any command from a different thread waits during this time to acquire the lock. This is true even if you have a "Parallel Toolbox" license. I also see bursts of 100% CPU usage, which are due to a few MATLAB functions that have an integrated default multi-threading implementation (e.g. matrix multiplication).
P.S.: You should check engOpenSingleUse...
Jared Johansen
on 7 Aug 2017
Answers (0)