Matlab 2013a GPU memory leak
I have been running some very long loops (millions of iterations) where, in each iteration, I call a few CUDA kernels via feval using pre-allocated arrays of fixed size. I noticed that host memory grows linearly with the number of iterations, and in the end MATLAB crashes. While trying to isolate the problem, I found the following:
- When using feval to call a CUDA kernel, all arguments of the function have to be cast as gpuArrays already, even scalar variables. This also applies to functions like gpuArray.rand or randn:
n = 1e4;
for i = 1:1e6
out = gpuArray.rand(n,1,'single');
end
The above code causes host memory to grow for the duration of the execution (about 100 MB per 250K iterations). If instead of n=1e4; you write n=gpuArray(1e4); the subsequent loop does not cause the memory to grow. I also found that the above loop executes much faster (about 3 times) when n is in host memory than when n is a gpuArray.
- Even more puzzling is the following example:
x = gpuArray.rand(1e4,1,'single');
for i = 1:1e6
out = sqrt(x);
end
The above loop does not cause MATLAB's memory footprint to grow. However, if we replace sqrt(x) with sqrt(1./x), we get the memory blowup again. I am using MATLAB R2013a 64-bit on Windows 7 Professional. My video card is a GTX 650 with 2 GB. Thanks in advance for any insights.
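To make the comparisons above repeatable, the host-memory growth of a loop body can be measured programmatically. The following is only a rough sketch: the function name measureHostGrowth and its arguments are hypothetical, it relies on the Windows-only memory function, and it has to be saved as measureHostGrowth.m because R2013a does not allow functions inside scripts.

```matlab
function growthMB = measureHostGrowth(body, nIter)
% measureHostGrowth  Rough estimate of host memory growth (hypothetical helper).
%   body  - function handle taking no arguments, e.g. @() sqrt(1./x)
%   nIter - number of loop iterations to run
% Assumes Windows, since MEMORY is not available on other platforms.
    gpu = gpuDevice();
    wait(gpu);                    % start from an idle GPU for a clean baseline
    before = memory;              % user-view struct with field MemUsedMATLAB
    for k = 1:nIter
        out = body(); %#ok<NASGU> % result is discarded, as in the loops above
    end
    after = memory;               % sampled right after the loop, without a sync
    growthMB = (after.MemUsedMATLAB - before.MemUsedMATLAB) / 2^20;
end
```

It could then be called once per variant, for example:

```matlab
x = gpuArray.rand(1e4,1,'single');
n = 1e4;
fprintf('rand, host n : %.1f MB\n', measureHostGrowth(@() gpuArray.rand(n,1,'single'), 2.5e5));
fprintf('sqrt(x)      : %.1f MB\n', measureHostGrowth(@() sqrt(x), 2.5e5));
fprintf('sqrt(1./x)   : %.1f MB\n', measureHostGrowth(@() sqrt(1./x), 2.5e5));
```

The numbers include anything else MATLAB allocates in the meantime, so they are only indicative.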
3 Comments
Ben Tordoff
on 4 Jun 2013
Thanks Michael, you are indeed right and this appears to be a bug introduced in R2013a. There is no realistic work-around I can provide right now, but I will post an update here once I have some more helpful suggestions.
The reason why the memory does not leak with certain calls is that they force a synchronisation event (in your first example, SQRT can error so has to wait to see if the error was hit; in the second the scalar parameter "n" has to be transferred back to host memory, which also causes a sync). You could achieve the same by inserting a "wait(gpu)" after every call:
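gpu = gpuDevice();
for ii=1:1e8
out = gpuArray.rand(1e3,1,'single');
wait(gpu); % block until the GPU has finished all queued work
disp(ii)   % show the loop counter (ii, not the imaginary unit i)
end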
but that will also slow things down a lot and is hardly a practical solution.
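One possible compromise is to synchronise only every so often instead of after every call. This is a sketch under an assumption that is not confirmed anywhere in this thread, namely that an occasional wait(gpu) is enough to keep the host memory bounded. The interval syncEvery is a hypothetical parameter that would need tuning.

```matlab
gpu = gpuDevice();
syncEvery = 1000;   % hypothetical interval, not taken from the thread
for ii = 1:1e6
    out = gpuArray.rand(1e3,1,'single');
    if mod(ii, syncEvery) == 0
        wait(gpu);  % block until all queued GPU work has finished
    end
end
```

A larger interval costs less time per iteration but lets more memory accumulate between synchronisations, if the assumption holds at all.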
Accepted Answer
Ben Tordoff
on 18 Jun 2013
Hi Michael, could you read the following bug-report and try the workaround it contains (being careful about the backing-up step!):
If this does not fix the problem, please let me know as soon as possible.
Ben