Can the following function be optimized further for speed?

I am computing the Pearson correlation coefficient for a sample using the following function.
function coef = corr_(x, y)
% Copyright 1984-2018 The MathWorks, Inc.
% Modified by Seth Bashford, 2019.
n = size(x, 1);
x = x - sum(x, 1)/n;
y = y - sum(y, 1)/n;
coef = (x.' * y) / (vecnorm(x) * vecnorm(y));
The input arguments x and y are of class double. Each has size [n, 1].
The following image shows profile data on my machine: 3.5 GHz Intel Core i7, 16 GB 1600 MHz DDR3. I am using MATLAB R2019a. The calls were executed in the context of another function, and for each call n == 10.
[Profiler screenshot showing the timing of the corr_ calls]
The average time per call is about 1.63 us. What do you think - can the function be further optimized for speed?
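For reference, the per-call number can be checked outside the profiler with timeit; something along these lines (the random inputs here are just stand-ins for my data):
n = 10;
x = rand(n, 1);   % made-up data with the same shape as the real inputs
y = rand(n, 1);
t = timeit(@() corr_(x, y));              % median time for one call, in seconds
fprintf('%.2f microseconds per call\n', 1e6*t);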
  3 Comments
Bruno Luong on 9 Aug 2019
@Geoff: This kind of incoherence (or obscure discrepancy) appears often with the profiler.
dpb on 9 Aug 2019
Edited: dpb on 9 Aug 2019
Doubt it'll make any difference, but why did you write sum(x|y,1)/n instead of mean(x|y)? Would save the temporary n otherwise not needed...again a nit, but "curious minds ..." and all that.
Also, while again probably won't make a difference, could see if norm is any better than vecnorm which handles arrays differently...the front end overhead difference shouldn't be much of a fraction, but ya' never knows unless ya' tries...
And, I guess while trying to micro-optimize, can always see if sum(x.*y) can compete with the matrix multiply.
Geoff's Q? -- wonder if it has something to do w/ the profiler caching the function call initially? Is the difference reproducible over multiple runs, and if so, if you reverse the order of the two calls, is the first call timed still the slower one?
ADDENDUM:
Even more micro...can/will an array of two columns beat calling the function twice't?
function coef = corr_(x, y)
xy=[x y]-mean([x y]);
...
If JIT optimizer can't figure out to only do the catenation once, probably not...one could build the temporary z=[x y]; but that also is probably more overhead lost than savings gained. But again, the "ya' never knows what ya' never asks!" Some of these might be size dependent, too...
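If anybody actually wants to race those micro-variants, a rough timeit script along these lines would do it (corr_mean, corr_norm and corr_dot are names made up just for this comparison):
n = 10;
x = rand(n, 1);
y = rand(n, 1);

t_orig = timeit(@() corr_(x, y));        % original: sum/n centering, vecnorm, matrix multiply
t_mean = timeit(@() corr_mean(x, y));    % mean() instead of sum(...)/n
t_norm = timeit(@() corr_norm(x, y));    % norm() instead of vecnorm()
t_dot  = timeit(@() corr_dot(x, y));     % sum(x.*y) instead of x.'*y
fprintf('orig %.2f, mean %.2f, norm %.2f, dot %.2f (microseconds per call)\n', ...
    1e6*[t_orig, t_mean, t_norm, t_dot]);

function coef = corr_mean(x, y)
x = x - mean(x);
y = y - mean(y);
coef = (x.' * y) / (vecnorm(x) * vecnorm(y));
end

function coef = corr_norm(x, y)
n = size(x, 1);
x = x - sum(x, 1)/n;
y = y - sum(y, 1)/n;
coef = (x.' * y) / (norm(x) * norm(y));
end

function coef = corr_dot(x, y)
n = size(x, 1);
x = x - sum(x, 1)/n;
y = y - sum(y, 1)/n;
coef = sum(x .* y) / (vecnorm(x) * vecnorm(y));
end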


Accepted Answer

the cyclist on 9 Aug 2019
Edited: the cyclist on 9 Aug 2019
The vecnorm function probably does some error- and type-checking which you don't need, and you can combine two square root operations into one. This might be faster:
coef = (x.' * y) / sqrt(sum(x.^2) * sum(y.^2));
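A drop-in version of the original one-pair function with just that change would be (a sketch, keeping the rest of the original code as-is):
function coef = corr_(x, y)
% One-pair version with vecnorm replaced by a single sqrt
n = size(x, 1);
x = x - sum(x, 1)/n;
y = y - sum(y, 1)/n;
coef = (x.' * y) / sqrt(sum(x.^2) * sum(y.^2));
end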
Given that this code no longer hinges on vecnorm, could you then vectorize your calls? For example,
N = 20000000;
n = 10;
% Made-up data, now with all nx1 vectors combined into an array
x = rand(n,N);
y = rand(n,N);
c = corr_(x,y);
where the modified correlation function is now
function coef = corr_(x, y)
% Copyright 1984-2018 The MathWorks, Inc.
% Modified by Seth Bashford, 2019.
% Modified by thecyclist, 2019
n = size(x, 1);
x = x - sum(x, 1)/n;
y = y - sum(y, 1)/n;
coef = sum(x .* y) ./ sqrt(sum(x.^2) .* sum(y.^2));
end
That gives a 10x speedup for me.
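If you want to verify that on your own machine, a comparison along these lines should do it (a sketch using timeit; loop_corr is a helper made up here to stand in for calling the one-pair version N times):
N = 20000;                      % smaller N than above so the loop finishes quickly
n = 10;
x = rand(n, N);
y = rand(n, N);

t_loop = timeit(@() loop_corr(x, y));   % one-pair formula applied column by column
t_vec  = timeit(@() corr_(x, y));       % vectorized function from this answer
fprintf('loop: %.4f s, vectorized: %.4f s, speedup: %.1fx\n', t_loop, t_vec, t_loop/t_vec);

function c = loop_corr(x, y)
% Made-up helper: apply the one-pair formula to each column in a loop
N = size(x, 2);
c = zeros(1, N);
for k = 1:N
    xk = x(:, k) - mean(x(:, k));
    yk = y(:, k) - mean(y(:, k));
    c(k) = (xk.' * yk) / sqrt(sum(xk.^2) * sum(yk.^2));
end
end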

More Answers (0)
