
Calculate the mean in a single operation with matrices

Given two matrices M and P, I want to calculate the weighted geometric mean of the columns of M, using each column of P as a weight vector. For a single weight vector p, I would do this:
M = rand(3,4);
P = (perms(1:3))';
p_single = P(:,1); % And iterate with a for loop through P matrix columns
result = prod(M.^p_single).^(1/sum(p_single)); % This gives a single row, one value per column of M
I want to get a Result matrix in which each row is the mean vector for the corresponding column of P.
How can I do this without iteration, in one or a few lines of code?
EDIT (rewrote the complete loop):
M = rand(3,4);
P = (perms(1:3))';
[~,sz_C] = size(P);
result = [];
for k = 1:sz_C
    p_single = P(:,k);
    mean_single = prod(M.^p_single).^(1/sum(p_single));
    result = [result; mean_single];
end
This code appends each calculated weighted geometric mean, one per column of P, to the 'result' matrix.
Is it possible to make it a one-liner, or to eliminate the loop for speed (in case of a big data set)?
  4 Comments
dpb on 28 Apr 2018
Edited: dpb on 28 Apr 2018
Preallocate and populate instead of augmenting. The sizes of M and P aren't directly commensurate, in that size(result) is
size(P,2) x size(M,2)
and size(P,2) is factorial(size(M,1)). This means that if M gets very big you're going to need a whole bunch of memory very quickly. Just one column of factorial(N) doubles for N = 10 and N = 11 takes, in MB:
>> factorial(10)*8/1024/1024
ans =
27.685546875
>> factorial(11)*8/1024/1024
ans =
304.541015625
>>
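The figures above count factorial(N) doubles at 8 bytes each. As a rough sketch of how the estimate extends to the full matrices, the following assumes double storage and uses illustrative names (n, nColsM, permsMB, resultMB) that are not from the thread:

```matlab
n      = 10;                 % hypothetical number of rows in M
nColsM = 4;                  % hypothetical number of columns in M
nPerms = factorial(n);       % number of weight vectors produced by perms(1:n)

% perms(1:n) holds nPerms x n doubles, 8 bytes each
permsMB  = nPerms * n * 8 / 2^20;

% result holds nPerms x nColsM doubles
resultMB = nPerms * nColsM * 8 / 2^20;

fprintf('perms output: %.1f MB, result: %.1f MB\n', permsMB, resultMB);
```

Both quantities grow with factorial(n), so each additional row in M multiplies the memory needed by roughly n+1.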
Marco Salvo Cottone on 28 Apr 2018
Edited: Marco Salvo Cottone on 28 Apr 2018
M is static data; it will stay small, or at least not that big. What will really grow (when I refer to a big data set) is the weights matrix W, here named P, and the Result matrix will be, as you wrote, size(P,2) x size(M,2).
I agree I can preallocate:
result = zeros(size(P,2), size(M,2));
But the for-loop problem still remains.


Answers (1)

dpb on 28 Apr 2018
mngeo=geomean(M.*W);
where M and W are the data matrix and the associated weights matrix.
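For the exact computation in the question, one loop-free formulation works in the log domain: for positive M, prod(M.^p) equals exp(p' * log(M)), so all weight columns can be handled with a single matrix product. This is a minimal sketch, assuming every entry of M is strictly positive (true for rand) and MATLAB R2016b or later for implicit expansion. The names logProd, result_vec, check1 and maxDiff are illustrative and not from the thread.

```matlab
M = rand(3,4);            % data, entries in the open interval (0,1)
P = (perms(1:3))';        % weights: one weight vector per column

% Row k, column j holds sum_i P(i,k)*log(M(i,j)),
% i.e. the log of prod(M(:,j).^P(:,k))
logProd = P.' * log(M);                    % size(P,2) x size(M,2)

% Divide each row by its weight sum, then leave the log domain
result_vec = exp(logProd ./ sum(P,1).');   % assumes all(M(:) > 0)

% Sanity check against the single-column formula from the question
check1  = prod(M.^P(:,1)).^(1/sum(P(:,1)));
maxDiff = max(abs(result_vec(1,:) - check1));   % should be at round-off level
```

Because the work reduces to one matrix multiplication, the cost of iterating over the columns of P disappears, although result_vec still needs size(P,2) x size(M,2) doubles of memory.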
  7 Comments
dpb on 28 Apr 2018
"Currently the calculation is too slow because of the loop."
I don't think the loop itself is the bottleneck nearly as much as the fact that you're growing the output array on every pass instead of preallocating and filling it. If it really is calculating what you want, and you're absolutely sure of that, fix that first and then see whether it's adequate. Loops are much better optimized in MATLAB now than in years gone by, but reallocation within the loop is a real killer.
I'll have to come back and read up a little more on the actual computation; sorry I don't have the time at the moment to do more than just make a quick comment.
Marco Salvo Cottone on 28 Apr 2018
Edited: Marco Salvo Cottone on 28 Apr 2018
No, I tested it; it's pretty slow, even after preallocation.
Preallocation:
result = zeros(size(P,2), size(M,2));
Copying the data:
result(k,:) = mean_single;
The problem is the huge number of loop iterations.
Thanks for your efforts; they are much appreciated.
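Put together, the preallocated version of the loop described in this comment looks roughly like the following. It is a sketch that reuses the variable names from the thread; nCols is an illustrative name.

```matlab
M = rand(3,4);
P = (perms(1:3))';

nCols  = size(P,2);
result = zeros(nCols, size(M,2));   % preallocate: one row per column of P

for k = 1:nCols
    p_single = P(:,k);
    % write straight into row k instead of growing result
    result(k,:) = prod(M.^p_single).^(1/sum(p_single));
end
```

This removes the repeated reallocation, but the loop body still runs once per column of P.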

