Sum rows that have duplicate values in 2 other rows

I have a 345,505 x 13 matrix of doubles. I need to sum the columns 2 and 4-13 of any rows that have the same value in column 1, and the same value in column 3, so that they combine into one row. I've already sorted them by columns 1 and 3.
Example with a 4x7 matrix:
oldMatrix
1 3 1 0 4 3
1 9 1 2 2 3
2 3 1 3 2 1
3 4 5 0 1 2
Since the first two rows match, they need to combine to one row:
newMatrix
1 12 1 2 6 9
2 3 1 3 2 1
3 4 5 0 1 2
Thanks for your help!

2 Comments

How do you get 9 in the last column of row #1? Why is it not 3+3=6? Is it possible that there may be any number of such rows, or is it always just 1 row or 2 rows?
I’m sorry, that should be 6. My brain’s a little fried. Yes it can be any number of rows that need to be combined, not just 1 or 2.

Sign in to comment.

 Accepted Answer

Possibly a simpler method, using the newish splitapply:
m = [1 3 1 0 4 3
1 9 1 2 2 3
2 3 1 3 2 1
3 4 5 0 1 2];
[values, ~, ids] = unique(m(:, [1 3]), 'rows'); %get unique id for rows with identical column 1 and 3
out = splitapply(@(rows) sum(rows, 1), m, ids); %sum rows with identical id
out(:, [1 3]) = values %replace summed column 1 and 3 by original value
edit: fixed sum to sum rows together, not columns!
Note that m does not need to be sorted, since unique will do it anyway (unless you use the 'stable' option)

3 Comments

Doing that, I got this error:
Error using splitapply (line 132) The function '@(rows)sum(rows,2)' returned a non-scalar value when applied to the 1st group of data.
To compute nonscalar values for each group, create an anonymous function to return each value in a scalar cell:
@(rows){sum(rows,2)}
I tried changing it to {sum(rows,2)}, but then I get this error:
Conversion to cell from double is not possible.
My brain must be going mushy. I tested my answer with the correct line
out = splitapply(@(rows) sum(rows, 1), m, ids);
But somehow changed the 1 to 2 when I actually typed it above.
Fixed now.
Thanks, I did figure that out. This worked!

Sign in to comment.

More Answers (1)

a = [1 3 1 0 4 3 ; 1 9 1 2 2 3 ; 2 3 1 3 2 1 ; 3 4 5 0 1 2]
[ua,~,j] = unique(a(:,[1 3]),'rows','stable')
c = arrayfun(@(k) sum(a(k==j,:),1),1:max(j),'un',0)
b = cat(1,c{:})
b(:,[1 3]) = ua

2 Comments

Thanks for the response. I will try it out, but I’m not sure I understand how the code works. Also, I realized my example should have been more clear. The 1st and 3rd columns would not have the same value as each other. Would this still work for that case?
Look at the documentation for each function to gain some understanding. The arrayfun code is a little bit tricky: it sums up each column for those rows that have identical rows when you look only at column 1 and 3 (so, yes!).

Sign in to comment.

Categories

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!