N dimensional multiplication on gpu arrays

Is there a faster way to compute an N-dimensional element-wise multiplication on gpuArray data? I am writing a custom deep learning layer that operates on gpuArray data, but this calculation takes too long for the layer to be practical. Is there a better way to do an element-wise multiplication like the one below?
A=gpuArray(rand(20,20,16,1000));
B=gpuArray(rand(16,1000));
Z=zeros(size(A),'like',A);
tic
for i=1:16
for j=1:1000
Z(:,:,i,j)=A(:,:,i,j).*B(i,j);
end
end
toc
  1 Comment
ahmet keles on 7 Dec 2022
I have found a way to do it: expand the small matrix into an N-dimensional array the same size as A, then multiply the two element-wise.
A=gpuArray(rand(20,20,16,10));
B=gpuArray(rand(16,10));
C=Matrisize(B,A);
Z=A.*C;
function [C]=Matrisize(X1,X2)
S=size(X2);
C=zeros(size(X2),'like',X2);
for i=1:S(1)
for j=1:S(2)
C(i,j,:,:)=X1;
end
end
end
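The expansion performed by Matrisize can also be written without loops. The following is a minimal sketch, assuming B holds one scale factor per (3rd, 4th) index of A as in the code above; the variable name B4 is introduced here for illustration only. It uses reshape to move B into dimensions 3 and 4 and repmat to replicate it across dimensions 1 and 2.

```matlab
% Same inputs as in the comment above
A = gpuArray(rand(20,20,16,10));
B = gpuArray(rand(16,10));

% Move the 16-by-10 matrix into dimensions 3 and 4 (size 1x1x16x10)
B4 = reshape(B, [1 1 size(B)]);

% Replicate along dimensions 1 and 2 so that C has the same size as A
C = repmat(B4, size(A,1), size(A,2));

% Element-wise product, equivalent to Z = A.*Matrisize(B,A)
Z = A .* C;
```

Note that C still occupies as much GPU memory as A, just like the output of Matrisize.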


Accepted Answer

C B on 7 Dec 2022
You can improve the performance of your code by replacing the loop with a single call to MATLAB's built-in element-wise multiplication operator, .*. Applied to whole gpuArray inputs, the operator runs as one vectorized operation that takes advantage of the GPU's parallel processing, instead of 16,000 small indexed multiplications.
Here is an example of how you can use .* to perform element-wise multiplication of the A and B arrays on the GPU:
% Create gpuArray arrays
A = gpuArray(rand(20,20,16,1000));
B = gpuArray(rand(16,1000));
% Perform element-wise multiplication using .*
Z = A .* B;
With this approach, and provided the operand sizes are compatible, the multiplication is performed in parallel on the GPU, which should be significantly faster than the loop.
Alternatively, you can use the built-in times function, which is the function form of the .* operator and gives the same result. Here is the same multiplication written with times:
% Create gpuArray arrays
A = gpuArray(rand(20,20,16,1000));
B = gpuArray(rand(16,1000));
% Perform element-wise multiplication using times
Z = times(A, B);
Both of these approaches should provide significant performance improvements compared to using a loop to perform element-wise multiplication on the GPU.
  2 Comments
ahmet keles on 7 Dec 2022
The arrays have incompatible sizes for direct element-wise multiplication; that's the problem.
Joss Knight on 17 Dec 2022
B = gpuArray(rand(1,1,16,1000)) is what you need here.
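When B already exists as a 16-by-1000 matrix, the same effect can be obtained by reshaping it rather than regenerating it. The following is a minimal sketch, assuming a MATLAB release with implicit expansion (R2016b or later) and Parallel Computing Toolbox; the names B4, i, j, err and t are introduced here for illustration only. Because the singleton dimensions of B4 are expanded implicitly by .*, no full-size copy of B is created.

```matlab
% Inputs with the sizes used in the question
A = gpuArray(rand(20,20,16,1000));
B = gpuArray(rand(16,1000));

% Move B into dimensions 3 and 4 so its size is 1x1x16x1000
B4 = reshape(B, 1, 1, size(B,1), size(B,2));

% Implicit expansion stretches the two singleton dimensions to 20x20
Z = A .* B4;

% Spot-check one slice against the original loop body
i = 5; j = 123;
err = max(abs(Z(:,:,i,j) - A(:,:,i,j) .* B(i,j)), [], 'all');
fprintf('Max abs difference on slice (%d,%d): %g\n', i, j, gather(err));

% gputimeit waits for the GPU to finish, so it gives a fairer timing than tic/toc
t = gputimeit(@() A .* B4);
fprintf('Vectorized multiply: %g s\n', t);
```

The reshape only changes how the existing data is indexed, so the extra cost over a direct A.*B call should be negligible.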


Release

R2022b
