huge differences in single vs double precision math
2 views (last 30 days)
Show older comments
I am calculating a sum of squares in 32-bit FP precision (for comparison with a GPU algorithm, which isn't relevant here).
Here is the code:
Y=single((0:499).^2);
sum(Y)
ans =
41541684
sum(double(Y))
ans =
41541750
The (correct) double answer is off by 66! The largest value, 499^2 = 249001, is nowhere near any FP limits.
This is R2013A on OS X 10.9.
0 Comments
Answers (1)
John D'Errico
on 7 Aug 2014
What you don't understand is that single precision has a 23 bit mantissa. While there are 32 total bits stored in a single, don't forget that one of those bits is a sign bit, which leaves 8 bits to store an exponent in a biased form. So you cannot store an INTEGER larger than 2^24-1 in a single, if you wish to do so without error.
The sum you formed was larger than that limit, so you should expect an error.
log2(41541750)
ans =
25.308
It is time for you to start reading about floating point arithmetic.
Computers are not all powerful, except for those in the movies/tv.
0 Comments
See Also
Categories
Find more on Logical in Help Center and File Exchange
Products
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!