Can I speed up this matrix multiplication?
Sorry if this is a basic question. I know that matrix multiplication is already highly optimized in MATLAB, but I would still like to find out whether it can be sped up further for this special case.
This is the bottleneck in my code, and I have no idea how to improve it. A simplified version of the code is shown below.
% X and Q are given symmetric matrices
[P, ~] = eig(X);
Z = P*(Q.*(P'*Y*P))*P'; % Y changes in each loop iteration
X, Y, and Q are 1000x1000 matrices. In each loop iteration, I have to compute four matrix multiplications.
However, P and Q are constant across iterations, so I would like to know what I could do to further speed up this code.
I have tried gpuArray, but it is no faster than the normal MATLAB * operation (maybe my GPU is not good enough).
Any suggestions would be appreciated.
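For reference, the loop described above might be laid out roughly as follows. This is only a sketch: the random X, Q, and Y, the iteration count nIter, and the variable names Z and t are placeholders and not part of the original code. It hoists eig out of the loop and records the time of the four-product expression in each iteration, which gives a baseline to compare any speed-up against.

```matlab
% Sketch of the loop structure; all inputs below are made-up stand-ins.
rng(0);                                 % fixed seed, only for reproducibility
n = 1000;                               % matrix size stated in the question
nIter = 3;                              % hypothetical number of iterations

X = randn(n);  X = (X + X.')/2;         % stand-in for the given symmetric X
Q = rand(n);   Q = (Q + Q.')/2;         % stand-in for the given symmetric Q
[P, ~] = eig(X);                        % P is constant, so compute it once

t = zeros(nIter, 1);                    % per-iteration time in seconds
for k = 1:nIter
    Y = randn(n);  Y = (Y + Y.')/2;     % placeholder for the Y of this iteration
    tic;
    Z = P*(Q.*(P'*Y*P))*P';             % four matrix products, one elementwise product
    t(k) = toc;
end
fprintf('mean time per iteration: %.4f s\n', mean(t));
```

Since only Y changes, everything that depends on X and Q alone can be prepared before the loop; whether that helps depends on the structure of Q.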
6 Comments
Jan
on 7 May 2021
Please provide some typical inputs. What are the typical sizes of X and Y? It matters whether they are 10x10 or 1e4x1e4 matrices.
Zenan Li
on 7 May 2021
Matt J
on 8 May 2021
Is Q sparse?
Zenan Li
on 10 May 2021
Matt J
on 10 May 2021
Is Q low rank, by any chance?
Zenan Li
on 10 May 2021
Accepted Answer
More Answers (1)
I have tried gpuArray, but it is no faster than the normal MATLAB * operation (maybe my GPU is not good enough).
Depending on your GPU and your precision requirements, it may be worth pre-casting the matrices to single precision. Some GPUs are not well optimized for double-precision floating-point operations. For example, on my machine with a GeForce GTX 1050,
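dtype = {'double','single'};
for i = 1:2
    % CPU timing (P, Q, and Y hold the same random matrix; only the timing matters)
    [P,Q,Y] = deal(rand(1000,dtype{i}));
    timeit(@() P*(Q.*(P'*Y*P))*P')
    % GPU timing of the same expression
    [P,Q,Y] = deal(gpuArray.rand(1000,dtype{i}));
    gputimeit(@() P*(Q.*(P'*Y*P))*P')
end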
I obtain,
ans = %double CPU
0.0607
ans = %double GPU
0.1678
ans = %single CPU
0.0261
ans = %single GPU
0.0065
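Whether single precision is acceptable depends on the accuracy you need. One way to get a feel for it is to compare the single-precision result against a double-precision reference on the CPU. The following is a minimal sketch with made-up random inputs (X, Q, and Y here are stand-ins, and n is reduced to 500 to keep it quick); the error on real data may differ.

```matlab
% Sketch: relative error of the single-precision result vs. double precision.
rng(0);                                   % fixed seed, only for reproducibility
n = 500;                                  % smaller than 1000 to keep the check fast
X = randn(n);  X = (X + X.')/2;           % made-up symmetric X
Y = randn(n);  Y = (Y + Y.')/2;           % made-up symmetric Y
Q = rand(n);   Q = (Q + Q.')/2;           % made-up symmetric Q
[P, ~] = eig(X);

Zd = P*(Q.*(P'*Y*P))*P';                  % double-precision reference

Ps = single(P);  Qs = single(Q);  Ys = single(Y);   % cast once, before any loop
Zs = Ps*(Qs.*(Ps'*Ys*Ps))*Ps';            % same expression in single precision

relErr = norm(double(Zs) - Zd, 'fro') / norm(Zd, 'fro');
fprintf('relative error, single vs double: %.2e\n', relErr);
```

The same expression should also work on GPU data, for example with Ps = gpuArray(single(P)) and gather(...) to bring the result back to the CPU, provided the Parallel Computing Toolbox and a supported GPU are available.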
13 Comments
Zenan Li
on 8 May 2021
Get a different graphics card? Here's the same test when running on the GTX 1080 Ti. Obviously, this card is better optimized for double precision, and the speed-up for the double precision GPU case is significant:
ans = %double CPU
0.0872
ans = %double GPU
0.0321
ans = %single CPU
0.0451
ans = %single GPU
0.0012
Thus the single-precision computation with GPU acceleration can only be used in the first few loops.
Have you really tested that?
Zenan Li
on 8 May 2021
Matt J
on 8 May 2021
So double-precision computation on the GPU is slower than on the CPU.
You must have an exceptionally powerful CPU... What are the times you see when you run my test?
Zenan Li
on 10 May 2021
Zenan Li
on 10 May 2021
Matt J
on 10 May 2021
No. The problem is Q.
Zenan Li
on 10 May 2021
Well, for example, if Q=diag([1,0,0,...,0]), then the computation reduces to
p=P(:,1);
ans=(p.'*Y*p)*(p*p.')
which is much easier.
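The reduction above can be checked numerically, and the same idea extends to any rank-one Q. The sketch below uses made-up random inputs; the rank-one case Q = u*v.' is an assumption for illustration only, since nothing in this thread confirms that the actual Q has that structure. It relies on the identity (u*v.').*A = diag(u)*A*diag(v), so that A1 = P*diag(u)*P' and B1 = P*diag(v)*P' (names chosen here) can be precomputed once and each iteration needs two products instead of four.

```matlab
% Sketch: verify the single-entry shortcut, then a rank-one generalization.
rng(0);                                   % fixed seed, only for reproducibility
n = 200;                                  % small size, enough for a correctness check
X = randn(n);  X = (X + X.')/2;           % made-up symmetric X
Y = randn(n);  Y = (Y + Y.')/2;           % made-up symmetric Y
[P, ~] = eig(X);

% Case 1: Q = diag([1,0,...,0])
Q1 = zeros(n);  Q1(1,1) = 1;
full1 = P*(Q1.*(P'*Y*P))*P';              % original four-product expression
p = P(:,1);
fast1 = (p.'*Y*p)*(p*p.');                % scalar times an outer product
err1 = norm(full1 - fast1, 'fro') / norm(full1, 'fro');

% Case 2 (assumption): Q = u*v.' has rank one
u = randn(n,1);  v = randn(n,1);
Q2 = u*v.';
A1 = (P.*u.')*P';                         % P*diag(u)*P', computed once (needs R2016b+)
B1 = (P.*v.')*P';                         % P*diag(v)*P', computed once
full2 = P*(Q2.*(P'*Y*P))*P';
fast2 = A1*Y*B1;                          % two products per iteration
err2 = norm(full2 - fast2, 'fro') / norm(full2, 'fro');

fprintf('relative errors: %.2e (single entry), %.2e (rank one)\n', err1, err2);
```

For a Q of rank r, the same identity gives a sum of r such terms, that is 2r products per iteration, so this only pays off when r is 1.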
Zenan Li
on 10 May 2021
Zenan Li
on 10 May 2021
