gpuArray slower on newer graphics card in double precision

I have been running the following speed test in R2015a on two different computers with two different graphics cards,
>> A=gpuArray(rand(5e3));
>> T=gputimeit(@()A*A)
The first computer is an older model (Dell Precision T7500) running an older graphics card (GTX 580). The second, newer computer (Dell Precision Tower 7910) is running a newer graphics card (Titan X).
Oddly, I find that the older configuration outperforms the newer one: the GTX 580 gives T=1.1178 seconds, whereas the Titan X gives T=1.3097 seconds, which is about 17% slower. When I redo the test in single precision,
>> A=gpuArray(rand(5e3,'single'));
>> T=gputimeit(@()A*A)
the results are more in line with my expectations. The GTX 580 gives T=0.2121 seconds, whereas the Titan X gives T=0.0491 seconds.
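To put the four timings on a common scale, they can be converted to effective GFLOPS. The sketch below assumes that a dense n-by-n matrix product costs about 2*n^3 floating-point operations, which is the usual operation count but is not stated in the post. The variable names are illustrative only.

```matlab
% Convert the timings reported above into effective GFLOPS.
% Assumption: an n-by-n dense matrix product costs about 2*n^3 flops.
n = 5e3;
flops = 2*n^3;                           % 2.5e11 floating-point operations

% Timings in seconds, copied from the post: [GTX 580, Titan X]
tDouble = [1.1178 1.3097];
tSingle = [0.2121 0.0491];

gflopsDouble = flops ./ tDouble / 1e9;   % effective double-precision rate
gflopsSingle = flops ./ tSingle / 1e9;   % effective single-precision rate

fprintf('GTX 580: %.0f GFLOPS double, %.0f GFLOPS single\n', ...
    gflopsDouble(1), gflopsSingle(1));
fprintf('Titan X: %.0f GFLOPS double, %.0f GFLOPS single\n', ...
    gflopsDouble(2), gflopsSingle(2));
```

Under that assumption the two cards land at roughly 224 and 191 GFLOPS in double precision, but roughly 1179 and 5092 GFLOPS in single precision.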
I'm wondering what could account for this difference. One thing that might be worth mentioning is that the Titan X is not using a fully updated driver. At the time of this writing, there is some bug in its newest driver release, making it unusable, and I am instead using driver version 353.62. Could this be the reason? If not, any other ideas?
  7 Comments
Matt J on 3 Aug 2015
Edited: Matt J on 3 Aug 2015
Brendan's response does indeed look like an answer, and it is supported by this article. So, Brendan, if you resubmit it as an Answer, I will accept it.
Ultimately, though, my computationally intensive work will mainly be single precision. I was just curious about the behavior I was seeing, and whether it might be due to a bad driver. So, I don't know if "the Titan X is a terrible card" is applicable to me.


Accepted Answer

Brendan Hamm on 3 Aug 2015
The Titan X is a terrible card to use for double precision GPGPU, as it was designed as a cheaper alternative to the other Titans with a focus on single precision (gaming). On the Maxwell chips, the double precision GFLOPS are about 1/32 of the single precision GFLOPS. Compare that with the Fermi architecture used in the GTX 580, whose double precision GFLOPS are about 1/5 of its single precision GFLOPS. If you intend to use the card for double precision, I would highly recommend the Titan Z (or Titan Black), both of which use the Kepler architecture. So if you have a Titan Black, switching to it would not be rolling back at all, but rather using a card that was designed with double precision in mind.
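As a rough sanity check, the double-to-single ratios implied by the timings in the question can be set beside the ratios quoted above. This is only a sketch: the quoted figures refer to peak rates, while the timings are measured matrix multiplies, so close agreement should not be expected. The variable names are illustrative.

```matlab
% Double-to-single slowdown implied by the timings in the question,
% next to the ratios quoted in this answer.
tDouble = [1.1178 1.3097];            % seconds, [GTX 580, Titan X]
tSingle = [0.2121 0.0491];            % seconds, [GTX 580, Titan X]

measuredRatio = tSingle ./ tDouble;   % double-precision speed as a fraction of single
quotedRatio   = [1/5 1/32];           % figures quoted above

names = {'GTX 580', 'Titan X'};
for k = 1:2
    fprintf('%s: measured 1/%.1f, quoted 1/%.0f\n', ...
        names{k}, 1/measuredRatio(k), 1/quotedRatio(k));
end
```

The measured ratios come out near 1/5.3 for the GTX 580 and 1/26.7 for the Titan X, the same order as the quoted figures. The same comparison can be made on any device by timing both precisions with gputimeit, as in the question.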
  1 Comment
Brendan Hamm on 3 Aug 2015
Edited: Brendan Hamm on 3 Aug 2015
More info can be found here as well: NVIDIA Comparison Wiki.
For single precision work, the Titan X is the card to use, so it looks like you made a good choice. It does have fewer cores than the Titan Z, but a higher clock rate and a lower price point.
