ptxas fatal : Unresolved extern function 'cudaGetParameterBufferV2' with MATLAB R2017a on GTX 1080

Hi,
I am getting errors when trying to use dynamic parallelism on my GTX 1080 card. My CUDA code is in .cu files, which I compile and run from MATLAB R2017a.
Call from MATLAB:
system('nvcc child_kernel.cu parent_kernel.cu -dc -gencode=arch=compute_61,code=compute_61 -m64 -rdc=true -lcudadevrt -L"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\lib\x64" -ptx -Wno-deprecated-gpu-targets -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin"');
kernel_test = parallel.gpu.CUDAKernel('parent_kernel.ptx', 'parent_kernel.cu', 'parent_kernel');
kernel_test.ThreadBlockSize = [1,0,0];
kernel_test.GridSize = [1,0,0];
I get the following error:
Error using parallel.gpu.CUDAKernel
An error occurred during PTX compilation of <image>.
The information log was:
The error log was:
ptxas fatal : Unresolved extern function 'cudaGetParameterBufferV2'
The CUDA error code was: CUDA_ERROR_NO_BINARY_FOR_GPU.
Error in test_cuda (line 36)
kernel_test = parallel.gpu.CUDAKernel('test_cuda_fncall_frm_cuda.ptx', 'test_cuda_fncall_frm_cuda.cu',
'test_cuda_fncall_frm_cuda');
Device info:
Name: 'GeForce GTX 1080'
Index: 1
ComputeCapability: '6.1'
SupportsDouble: 1
DriverVersion: 8
ToolkitVersion: 8
cmd:
nvcc child_kernel.cu parent_kernel.cu -dc -gencode=arch=compute_61,code=compute_61 -m64 -rdc=true -lcudadevrt -L"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\lib\x64" -ptx -Wno-deprecated-gpu-targets -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin"

Accepted Answer

Joss Knight on 15 Jul 2017
Edited: Joss Knight on 15 Jul 2017
Dynamic parallelism is not supported in MATLAB CUDAKernel objects. You need to use a MEX function instead. Sorry.
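
For anyone landing here, a minimal sketch of what the MEX route could look like, using the mxGPUArray API from Parallel Computing Toolbox. The file name dp_example.cu, the kernel bodies, and the error IDs are hypothetical and only illustrate the pattern; the input is assumed to be a real double gpuArray. As far as I know, mexcuda accepts a -dynamic option for code that launches kernels from kernels (mexcuda -dynamic dp_example.cu), but check the mexcuda documentation for your release before relying on it.

```cuda
// dp_example.cu (hypothetical file name): MEX entry point that launches a
// parent kernel, which in turn launches a child kernel on the device.
#include "mex.h"
#include "gpu/mxGPUArray.h"

// Child kernel (illustrative): doubles every element of the array.
__global__ void child_kernel(double *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= 2.0;
    }
}

// Parent kernel: a single thread launches the child grid (dynamic parallelism).
__global__ void parent_kernel(double *data, int n)
{
    if (blockIdx.x == 0 && threadIdx.x == 0) {
        int threads = 256;
        int blocks = (n + threads - 1) / threads;
        child_kernel<<<blocks, threads>>>(data, n);
    }
}

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    // Initialize the MATLAB GPU API before any other mxGPU call.
    mxInitGPU();

    if (nrhs != 1 || !mxIsGPUArray(prhs[0])) {
        mexErrMsgIdAndTxt("dp_example:input", "Expected one gpuArray input.");
    }

    const mxGPUArray *in = mxGPUCreateFromMxArray(prhs[0]);
    if (mxGPUGetClassID(in) != mxDOUBLE_CLASS ||
        mxGPUGetComplexity(in) != mxREAL) {
        mxGPUDestroyGPUArray(in);
        mexErrMsgIdAndTxt("dp_example:type", "Input must be real double.");
    }

    // Work on a copy so the caller's array is left untouched.
    mxGPUArray *out = mxGPUCopyGPUArray(in);
    double *d_out = (double *)mxGPUGetData(out);
    int n = (int)mxGPUGetNumberOfElements(out);

    if (n > 0) {
        parent_kernel<<<1, 1>>>(d_out, n);

        // Catch launch errors first, then errors raised while the kernels ran.
        cudaError_t err = cudaGetLastError();
        if (err == cudaSuccess) {
            err = cudaDeviceSynchronize();
        }
        if (err != cudaSuccess) {
            mxGPUDestroyGPUArray(in);
            mxGPUDestroyGPUArray(out);
            mexErrMsgIdAndTxt("dp_example:cuda", "%s", cudaGetErrorString(err));
        }
    }

    // Hand the result back to MATLAB as a gpuArray and release the handles.
    plhs[0] = mxGPUCreateMxArrayOnGPU(out);
    mxGPUDestroyGPUArray(in);
    mxGPUDestroyGPUArray(out);
}
```

Once built, it would be called like any other function, e.g. y = dp_example(gpuArray(rand(1000,1))); and y should come back as a gpuArray holding twice the input values.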

More Answers (2)

Pavel Sinha on 18 Jul 2017
Thanks!
Is there any additional advantage to writing a CUDA MEX wrapper over using a CUDAKernel object, given that I do not use dynamic parallelism?
Also, with CUDAKernel objects, once the kernel objects are compiled, is there any delay in MATLAB actually launching the CUDA kernels compared to the MEX counterpart?
I have all the data loaded on the GPU and want to launch a series of these CUDAKernels one after the other. Would it be any faster to write a single MEX function that calls one CUDA kernel and then internally launches multiple CUDA kernels one after the other? My primary objective is speed; even a 10% speedup would matter.

Pavel Sinha on 18 Jul 2017
Also, I am using R2017a. Does MATLAB's convn use cuDNN? If not, is there any convolution function in MATLAB that uses cuDNN, or any MATLAB wrapper function for calling cuDNN?
