Sum elements in a column where there are matching values in an adjacent column?

I have a numeric matrix A = [1 1 100; 1 2 200; 1 3 50; 2 1 100; 2 2 200; 2 3 50; 3 1 100; 3 2 200; 3 3 50; 3 4 20]
I am trying to obtain a vector of values of equal dimension e.g. ans = [0; 0; 0; 100; 200; 50; 200; 400; 100; 0]
My aim is to sum (and average) the elements in the 3rd column from prior rows whose 2nd column value matches that of the current row, excluding the current row itself. I hope I have made this question clear, but please don't hesitate to ask for clarification if not. I have lots of rows to process, so a time-efficient solution would be greatly appreciated.
Note: I have tried my best to find a solution from scanning through the previous Q & As but haven't had any luck so far.

 Accepted Answer

Here's one way. There are probably better ones, but this works:
NEW
A = [1 1 100; 1 2 200; 1 3 50; 2 1 100; 2 2 200; 2 3 50; 3 1 100; 3 2 200; 3 3 50; 3 4 20];
[~,~,idx] = unique(A(:,2)); % Unique second column indices
A3 = A(:,3); % third column values
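% For each group, pair the sorted row indices with the running total of the third column
idxdv = accumarray(idx,(1:numel(idx))',[],@(x){[sort(x) cumsum(A3(sort(x)))]});
% Shift each running total down one row so that a row's own value is excluded
idxdv = cell2mat(cellfun(@(x)[x(:,1), [0;x(1:end-1,2)]],idxdv,'UniformOutput',false));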
% Sort it by index and extract second column
dv = sortrows(idxdv);
results = dv(:,2)
OLD, for reference in comments
A = [1 1 100; 1 2 200; 1 3 50; 2 1 100; 2 2 200; 2 3 50; 3 1 100; 3 2 200; 3 3 50; 3 4 20];
[~,~,idx] = unique(A(:,2)); % Unique second column indices
A3 = A(:,3); % third column values
% Build array with index and cumulative value
idxdv = cell2mat(accumarray(idx,(1:numel(idx))',[],@(x){[sort(x) cumsum([0;A3(sort(x(1:end-1)))])]}));
% Sort it by index and extract second column
dv = sortrows(idxdv);
results = dv(:,2)

5 Comments

Thanks Sean, really appreciate you getting back to me so promptly. Your answer has inspired me to learn more about vectorisation, thanks again.
Hi Sean, having tried your answer on some different data, I have realised that it isn't working as expected, and I am trying to work out how the code that creates idxdv works. For example, the new data B = [1 1 20; 1 2 40; 1 3 60; 1 4 80; 2 1 30; 2 2 50; 2 3 50; 2 4 90; 3 1 40; 3 2 60; 3 3 80] gives [0; 0; 0; 0; 30; 40; 60; 80; 70; 100; 140] when it should be [0; 0; 0; 0; 20; 40; 60; 80; 50; 90; 110]. Any further insights you are able to share are greatly appreciated, especially how @(x){[sort(x) cumsum([0;A3(sort(x(1:end-1)))])]} works.
The problem seems to be that accumarray, at least in R2015b, does not preserve the order in which the elements appear for each value of subs. For example, if you run:
accumarray(B(:, 2), B(:, 3), [], @(x) {x});
The first cell, for subs == 1, is {30; 40; 20}. For some reason, the first element ends up at the tail, and this breaks Sean's method. I'm not sure if it's a bug or simply undocumented behaviour that should never have been relied on. Sean, being a Mathworker, should be able to clarify.
When all else fails, there's always a loop. See my answer.
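One possible way to sidestep the ordering issue described above is to sort the rows by the grouping column before calling accumarray. This is a minimal sketch, not part of either answer. It assumes, as the accumarray documentation appears to state, that the order of the values is preserved when subs is already sorted. The names subsSorted, order and groups are illustrative only.

```matlab
B = [1 1 20; 1 2 40; 1 3 60; 1 4 80; 2 1 30; 2 2 50; 2 3 50; 2 4 90; 3 1 40; 3 2 60; 3 3 80];

% sort is stable, so rows with equal second-column values keep their original order
[subsSorted, order] = sort(B(:, 2));
vals = B(order, 3);

% Assumption: with sorted subs, each cell receives its values in row order
groups = accumarray(subsSorted, vals, [], @(x) {x});
```

If the assumption holds, groups{1} lists the values for second-column value 1 in the order of their rows in B.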
It's a bug in my logic: I'm dropping the last element of x while x is still unordered and then sorting the remaining n-1 elements, when I should be sorting x first and then keeping the first n-1 elements.
This should be fixed by breaking the anonymous function into two steps. See NEW above.
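To make the difference concrete, here is a small sketch of the two calculations for a single group. It assumes x arrives in the order reported in the comment above (values {30; 40; 20} correspond to rows 5, 9 and 1 of B). The variable names oldSums, newSums, xs and cs are illustrative only.

```matlab
A3 = [20; 40; 60; 80; 30; 50; 50; 90; 40; 60; 80];  % third column of B
x  = [5; 9; 1];                                     % rows with second-column value 1, unordered

% OLD: drop the last element of the unsorted x, then sort what is left
oldSums = cumsum([0; A3(sort(x(1:end-1)))]);        % sums rows 5 and 9: [0; 30; 70]

% NEW: sort first, take the running total, then shift it down by one row
xs = sort(x);
cs = cumsum(A3(xs));
newSums = [0; cs(1:end-1)];                         % sums rows 1 and 5: [0; 20; 50]
```

The OLD result matches the wrong values reported for rows 1, 5 and 9 (0, 30, 70), and the NEW result matches the expected ones (0, 20, 50).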
Thank you so much for sharing your wisdom on this challenge and helping me get to the bottom of it, tis deeply appreciated and I have learnt a lot.


More Answers (1)

Since accumarray appears to be doing some weird reordering, I'm not sure it can be used to do what you want. No matter, a loop always works. It's probably also easier to understand:
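B = [1 1 20; 1 2 40; 1 3 60; 1 4 80; 2 1 30; 2 2 50; 2 3 50; 2 4 90; 3 1 40; 3 2 60; 3 3 80];
out = zeros(size(B, 1), 1);
for v = unique(B(:, 2))'              % one pass per distinct second-column value
    rows = B(:, 2) == v;              % logical mask of the rows in this group
    sums = cumsum([0; B(rows, 3)]);   % running total, shifted so each row sees only prior rows
    out(rows) = sums(1:end-1);        % drop the final total, which includes every row
end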
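If the number of distinct second-column values is large, the loop above can be replaced by a sort-based version. The following is a minimal sketch, not taken from either answer. It relies on MATLAB's sort being stable, and all variable names are illustrative.

```matlab
B = [1 1 20; 1 2 40; 1 3 60; 1 4 80; 2 1 30; 2 2 50; 2 3 50; 2 4 90; 3 1 40; 3 2 60; 3 3 80];

[keys, order] = sort(B(:, 2));             % stable sort: ties keep their original row order
vals = B(order, 3);                        % third-column values, grouped by second-column value
prior = cumsum(vals) - vals;               % running total excluding the current row, over all groups
isStart = [true; diff(keys) ~= 0];         % marks the first row of each group
startIdx = find(isStart);                  % position at which each group begins
groupId = cumsum(isStart);                 % group number of every sorted row
prior = prior - prior(startIdx(groupId));  % remove what had accumulated before the group began

out = zeros(size(B, 1), 1);
out(order) = prior;                        % scatter back to the original row order
% out is [0; 0; 0; 0; 20; 40; 60; 80; 50; 90; 110]

% Number of prior matching rows, for the average asked about in the question
count = zeros(size(B, 1), 1);
count(order) = (1:numel(vals))' - startIdx(groupId);
```

Dividing out by count gives the average of the prior matching values. Rows with no prior match produce 0/0, which is NaN, so they need separate handling if a 0 is wanted there.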

Asked: 31 Dec 2015

Edited: 6 Jan 2016
