imgaussfilt3 on the GPU with very large data

I have a large number of 3D data sets, each approximately 1300x1300x2160 uint16 values, to which I want to apply a spatial Gaussian filter with sigma = 2 (a 9x9x9 kernel). My first implementation, which conserves RAM by splitting the data into blocks, is structured as follows:
for k = 1:blocks
    % Some code that generates correct indices here
    prefilterblock = data(:,:,indices);    % extract the current block of z-slices
    GaussBlock = imgaussfilt3(prefilterblock, sigma, 'FilterDomain', 'spatial');
    FiltStack(:,:,indices) = GaussBlock;   % write the filtered block back
end
This implementation takes about 2 minutes per block. I have a gut feeling that it could be sped up considerably with GPU computing, but I can't simply call gpuArray(prefilterblock), because a block of this size does not fit in GPU memory. Furthermore, because I want to filter in 3D, I haven't found good examples, even after a lot of googling and searching here.
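For a rough sense of scale, the memory arithmetic can be sketched as follows. The volume size is the approximate one given above and availBytes is the AvailableMemory value from the device info below. The single-precision working type and the factor of 3 for working copies (input, intermediate, output) are assumptions for illustration, not measured properties of imgaussfilt3.

```matlab
% Back-of-the-envelope GPU memory estimate.
% ASSUMPTIONS: single-precision working copies and workFactor = 3 are guesses.
volSize     = [1300 1300 2160];               % approximate volume dimensions
bytesUint16 = 2;                              % bytes per uint16 voxel
volBytes    = prod(volSize) * bytesUint16;    % whole volume stored as uint16

availBytes  = 3.5071e9;                       % AvailableMemory from the device info
bytesSingle = 4;                              % assumed working precision on the GPU
workFactor  = 3;                              % assumed: input + intermediate + output

% Bytes needed on the GPU for one full z-slice, including working copies.
sliceBytes  = volSize(1) * volSize(2) * bytesSingle * workFactor;
maxSlices   = floor(availBytes / sliceBytes); % upper bound on z-slices per slab

fprintf('Whole volume: %.2f GB, at most %d slices per slab\n', ...
        volBytes / 1e9, maxSlices);
```

Under these assumptions the full volume is about 7.3 GB as uint16 alone, roughly twice the available GPU memory, so only a slab of z-slices can be resident at any one time.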
So my question is: how do I split up my data so that the filtering can run on the GPU?
-------------------------------------------------------------
The device info, in case it helps, is as follows:
Name: 'GeForce GTX 960'
Index: 1
ComputeCapability: '5.2'
SupportsDouble: 1
DriverVersion: 9
ToolkitVersion: 8
MaxThreadsPerBlock: 1024
MaxShmemPerBlock: 49152
MaxThreadBlockSize: [1024 1024 64]
MaxGridSize: [2.1475e+09 65535 65535]
SIMDWidth: 32
TotalMemory: 4.2950e+09
AvailableMemory: 3.5071e+09
MultiprocessorCount: 8
ClockRateKHz: 1253000
ComputeMode: 'Default'
GPUOverlapsTransfers: 1
KernelExecutionTimeout: 1
CanMapHostMemory: 1
DeviceSupported: 1
DeviceSelected: 1
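
One possible way to split the data, as an untested sketch rather than a definitive solution, is to process slabs of z-slices that overlap by the kernel radius, so that slices near a slab boundary see the same neighbours as they would in the full volume. The example uses a small random volume as a stand-in for the real data. The names slabDepth, halo, zFirst, zLast and useGPU are hypothetical, and it assumes a release in which imgaussfilt3 accepts gpuArray input. It falls back to the CPU if no GPU is found.

```matlab
% Sketch: filter a volume slab by slab along z with overlapping halo slices.
% Requires Image Processing Toolbox; the GPU path also needs Parallel
% Computing Toolbox. slabDepth and the test volume are illustrative only.
sigma     = 2;
halo      = ceil(2 * sigma);   % kernel radius: 4 slices for a 9x9x9 kernel
slabDepth = 16;                % hypothetical; choose it from available GPU memory

data      = uint16(randi([0 65535], 64, 64, 50));  % stand-in for the real volume
FiltStack = zeros(size(data), 'like', data);       % preallocate the output
nz        = size(data, 3);

useGPU = false;
try
    useGPU = gpuDeviceCount > 0;   % errors if Parallel Computing Toolbox is absent
catch
    % no GPU support available; stay on the CPU
end

for z0 = 1:slabDepth:nz
    z1     = min(z0 + slabDepth - 1, nz);  % last slice this slab is responsible for
    zFirst = max(z0 - halo, 1);            % extend the slab by the halo,
    zLast  = min(z1 + halo, nz);           % clipped to the volume boundaries

    slab = data(:, :, zFirst:zLast);
    if useGPU
        slab = gpuArray(slab);             % host-to-device transfer
    end

    filt = imgaussfilt3(slab, sigma, 'FilterDomain', 'spatial');
    if useGPU
        filt = gather(filt);               % device-to-host transfer
    end

    keep = (z0:z1) - zFirst + 1;           % discard the halo slices again
    FiltStack(:, :, z0:z1) = filt(:, :, keep);
end
```

For the real data, slabDepth would have to be chosen so that one slab plus the filter's working memory fits within the AvailableMemory figure above. The halo costs 2*halo extra slices per slab, so deeper slabs waste proportionally less work.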

Answers (0)
