Can the following function be optimized further for speed?
Seth Bashford
on 9 Aug 2019
Commented: Seth Bashford
on 12 Aug 2019
function coef = corr_(x, y)
% Copyright 1984-2018 The MathWorks, Inc.
% Modified by Seth Bashford, 2019.
n = size(x, 1);
x = x - sum(x, 1)/n;
y = y - sum(y, 1)/n;
coef = (x.' * y) / (vecnorm(x) * vecnorm(y));
The input arguments x and y are of class double. Each has size [n, 1].
The following image shows profile data on my machine: 3.5 GHz Intel Core i7, 16 GB 1600 MHz DDR3. I am using MATLAB R2019a. The calls were executed in the context of another function, and for each call n == 10.
The average time per call is about 1.63 us. What do you think - can the function be further optimized for speed?
3 Comments
Bruno Luong
on 9 Aug 2019
@Geoff: This kind of incoherence (or obscure discrepancy) appears often with the profiler.
dpb
on 9 Aug 2019
Edited: dpb
on 9 Aug 2019
Doubt it'll make any difference, but why did you write sum(x|y,1)/n instead of mean(x|y)? Would save the temporary n otherwise not needed...again a nit, but "curious minds ..." and all that.
Also, while again probably won't make a difference, could see if norm is any better than vecnorm which handles arrays differently...the front end overhead difference shouldn't be much of a fraction, but ya' never knows unless ya' tries...
And, I guess while trying to micro-optimize, can always see if sum(x.*y) can compete with the matrix multiply.
Geoff's Q? -- wonder if has something to do w/ profiler caching the function call initially? Is it reproducible difference for multiple runs and if so, if reverse the order of the two calls, is the discrepancy still higher time first?
ADDENDUM:
Even more micro...can/will array of two columns beat calling function twice't?
function coef = corr_(x, y)
xy=[x y]-mean([x y]);
...
If JIT optimizer can't figure out to only do the catenation once, probably not...one could build the temporary z=[x y]; but that also is probably more overhead lost than savings gained. But again, the "ya' never knows what ya' never asks!" Some of these might be size dependent, too...
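The variants floated above are easy to try side by side. The following is only a rough sketch, assuming made-up random data and illustrative handle names that are not from the thread. It uses timeit, which calls each handle many times and returns a typical time. For operations this small, the anonymous-function call overhead is a large part of each measurement, and timeit may warn that the function runs too fast to time accurately, so treat the numbers as relative hints only.

```matlab
% Sketch: time the suggested variants with timeit.
% Data and handle names are illustrative assumptions, not from the thread.
n = 10;
x = rand(n, 1);
y = rand(n, 1);

% Centering: explicit sum/n versus mean
center_sum  = @() x - sum(x, 1)/n;
center_mean = @() x - mean(x);

% Norm: vecnorm versus norm (same value for a single column vector)
xc = x - mean(x);
yc = y - mean(y);
norm_vec   = @() vecnorm(xc);
norm_plain = @() norm(xc);

% Inner product: matrix multiply versus elementwise product and sum
dot_mtimes = @() xc.' * yc;
dot_sum    = @() sum(xc .* yc);

t = [timeit(center_sum), timeit(center_mean), ...
     timeit(norm_vec),   timeit(norm_plain), ...
     timeit(dot_mtimes), timeit(dot_sum)];
names = {'sum/n', 'mean', 'vecnorm', 'norm', 'x.''*y', 'sum(x.*y)'};
for k = 1:numel(t)
    fprintf('%-10s %8.3f us\n', names{k}, 1e6 * t(k));
end
```

Results will likely depend on n, the MATLAB release, and the machine, so it is worth rerunning with the sizes that matter in the real caller.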
Accepted Answer
the cyclist
on 9 Aug 2019
Edited: the cyclist
on 9 Aug 2019
The vecnorm function probably does some error- and type-checking which you don't need, and you can combine two square root operations into one. This might be faster:
coef = (x.' * y) / sqrt(sum(x.^2) * sum(y.^2));
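The rewrite relies on the identity sqrt(a) * sqrt(b) == sqrt(a * b) for non-negative a and b, with a = sum(x.^2) and b = sum(y.^2). A minimal sketch to confirm that both forms agree on sample data (the seed and data here are assumptions for illustration):

```matlab
% Sketch: check that the single-sqrt form matches the vecnorm form.
rng(0);                                  % fixed seed so the check is repeatable
n = 10;
x = rand(n, 1);  x = x - sum(x, 1)/n;    % centered, as inside corr_
y = rand(n, 1);  y = y - sum(y, 1)/n;

c_vecnorm = (x.' * y) / (vecnorm(x) * vecnorm(y));
c_sqrt    = (x.' * y) / sqrt(sum(x.^2) * sum(y.^2));

% The two agree up to floating-point rounding
assert(abs(c_vecnorm - c_sqrt) < 1e-12);
```

One caveat: squaring the elements and multiplying the two sums can overflow or underflow for data of extreme magnitude, so the single-sqrt form is best suited to reasonably scaled inputs.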
Given that this code is no longer hinged to vecnorm, could you then vectorize your calls, so that one call handles many vector pairs at once? For example,
N = 20000000;
n = 10;
% Made-up data, now with all nx1 vectors combined into an array
x = rand(n,N);
y = rand(n,N);
c = corr_(x,y);
where the modified correlation function is now
function coef = corr_(x, y)
% Copyright 1984-2018 The MathWorks, Inc.
% Modified by Seth Bashford, 2019.
% Modified by thecyclist, 2019
n = size(x, 1);
x = x - sum(x, 1)/n;
y = y - sum(y, 1)/n;
coef = sum(x .* y) ./ sqrt(sum(x.^2) .* sum(y.^2));
end
That gives a 10x speedup for me.
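Before relying on the batched version, it may be worth checking it against a per-column reference. The sketch below assumes a deliberately small N and uses the built-in corrcoef as the reference; the variable names are illustrative. It repeats the arithmetic of the vectorized corr_ inline so that it runs as a plain script.

```matlab
% Sketch: compare the column-wise formula with corrcoef on a small batch.
rng(1);
n = 10;
N = 5;                       % small on purpose; the example above uses 20000000
x = rand(n, N);
y = rand(n, N);

% Same arithmetic as the vectorized corr_ above
xc = x - sum(x, 1)/n;
yc = y - sum(y, 1)/n;
c = sum(xc .* yc) ./ sqrt(sum(xc.^2) .* sum(yc.^2));   % 1-by-N row vector

% Reference: one corrcoef call per column pair
c_ref = zeros(1, N);
for k = 1:N
    R = corrcoef(x(:, k), y(:, k));
    c_ref(k) = R(1, 2);      % off-diagonal entry is the correlation
end

assert(max(abs(c - c_ref)) < 1e-12);
```

Two practical notes: with n = 10 and N = 20000000, each input array alone holds 2e8 doubles, which is about 1.6 GB before any temporaries, and writing sum(..., 1) explicitly throughout would keep the reduction along the first dimension even if n were ever 1.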