How to delete duplicate values in a column

Matrix A is as follows:
A = [10023 10024 10025 10026 10027
1 1 1 1 1
1 1 1 1 1
1 1 2 1 1
2 3 2 5 1
2 3 2 5 2
4 3 4 5 1
4 3 4 6 1
1 3 4 6 1
1 1 1 1 1];
I want to remove the consecutive duplicate values in each column and produce a new matrix B like the following:
B = [10023 10024 10025 10026 10027
1 1 1 1 1
2 3 2 5 2
4 1 4 6 1
1 0 1 1 0];
% Zeros are appended under ID numbers 10024 & 10027 to keep the dimensions of matrix B consistent.

Answers (1)

This works:
A = [10023 10024 10025 10026 10027
1 1 1 1 1
1 1 1 1 1
1 1 2 1 1
2 3 2 5 1
2 3 2 5 2
4 3 4 5 1
4 3 4 6 1
1 3 4 6 1
1 1 1 1 1]
B = zeros(size(A));
for col = 1 : size(A, 2)
    thisCol = A(:, col);
    thisCol(diff(thisCol) == 0) = []; % Remove consecutive repeats.
    B(1:length(thisCol), col) = thisCol;
end
% Trim off the trailing all-zero rows (assumes A itself contains no zeros).
lastRow = find(any(B ~= 0, 2), 1, 'last');
B = B(1:lastRow, :)
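
To see why the diff() line removes the repeats, it may help to run it on a single column. The sketch below uses the values under ID 10027 from A; the variable names `col` and `mask` are only illustrative and are not part of the answer's code.

```matlab
% Values under ID 10027 in A (header row omitted).
col = [1 1 1 1 2 1 1 1 1]';
% diff(col) is zero wherever an element equals the one after it.
mask = diff(col) == 0;   % one element shorter than col
% Deleting those positions keeps only the last element of each run.
col(mask) = [];
disp(col')               % displays 1 2 1
```

Because only neighbouring values are compared, a value that reappears later in the column (the trailing 1 here) is kept, which matches the B requested in the question.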

8 Comments

Great!
Can you now categorize the IDs based on the output matrix B? For example:
T1 = [10023 10025
1 1
2 2
4 4
1 1
];
T2 = [10024
1
3
1
0];
T3 = [10026
1
5
6
1];
T4 = [10027
1
2
1
0];
% All IDs with the sequence 1, 2, 4, 1 are added to the matrix T1, and so on...
Transpose B and pass it into unique() with the 'rows' option. Then use ismember() to find out where those rows occur in B and extract them. See if you can do it yourself.
Thanks for guiding me. I don't know how to set up the ismember() function in order to extract them. Can you please help?
Try
[inA, inB] = ismember(A, B);
and see what inA and inB are. Try swapping A and B. I'm sure you can figure it out.
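
As a hedged illustration of what the two outputs of ismember() look like with the 'rows' option, here is a tiny sketch on made-up data. The names `rowsToSearch`, `target`, `tf`, and `loc` are assumptions for this example only.

```matlab
% Three candidate rows; the first and third are identical.
rowsToSearch = [1 2 4 1
                1 3 1 0
                1 2 4 1];
target = [1 2 4 1];
% With 'rows', whole rows are compared instead of single elements.
[tf, loc] = ismember(rowsToSearch, target, 'rows');
disp(tf')    % logical mask: 1 0 1
disp(loc')   % row index into target, 0 where there is no match: 1 0 1
```

The logical mask `tf` can be used directly to index the rows that match.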
I couldn't make it work. Can you help with the code?
OK, then why not simply do this
T1 = B(:, [1,3]);
T2 = B(:, 2);
T3 = B(:, 4);
T4 = B(:, 5);
But what if matrix A has over 100 columns? Can you please help me with the second part of my question?
Well what did you try? Did you get anything like this:
A = [10023 10024 10025 10026 10027
1 1 1 1 1
1 1 1 1 1
1 1 2 1 1
2 3 2 5 1
2 3 2 5 2
4 3 4 5 1
4 3 4 6 1
1 3 4 6 1
1 1 1 1 1]
B = zeros(size(A));
for col = 1 : size(A, 2)
    thisCol = A(:, col);
    thisCol(diff(thisCol) == 0) = []; % Remove consecutive repeats.
    B(1:length(thisCol), col) = thisCol;
end
% Trim off the trailing all-zero rows, then transpose so each ID is one row.
lastRow = find(any(B ~= 0, 2), 1, 'last');
B = B(1:lastRow, :)'
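bRight = B(:, 2 : end) % Sequences only, without the ID column.
B2 = unique(bRight, 'rows')
% For each unique sequence, find every row of B that has that sequence.
T = cell(1, size(B2, 1));
for row = 1 : size(B2, 1)
    thisRow = B2(row, :);
    ia = ismember(bRight, thisRow, 'rows'); % Logical mask of matching rows.
    extractedRows = B(ia, :)'; % Transpose back so the ID is on top.
    T{row} = extractedRows;
end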
celldisp(T)
And you'll see:
T =
1×4 cell array
[5×1 double] [5×2 double] [5×1 double] [5×1 double]
T{1} =
10027
1
2
1
0
T{2} =
10023 10025
1 1
2 2
4 4
1 1
T{3} =
10024
1
3
1
0
T{4} =
10026
1
5
6
1
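
As a possible alternative to the ismember() loop, the third output of unique() already labels each row with the group it belongs to, so the grouping can be done without a second search. This is only a sketch: it assumes B is the trimmed, untransposed matrix from the answer, and `Bt`, `uniqueSeqs`, and `groupIdx` are illustrative names.

```matlab
% Trimmed matrix from the answer, with the IDs in the first row.
B = [10023 10024 10025 10026 10027
     1 1 1 1 1
     2 3 2 5 2
     4 1 4 6 1
     1 0 1 1 0];
Bt = B';   % one row per ID
% groupIdx(k) is the index of the unique sequence that row k matches.
[uniqueSeqs, ~, groupIdx] = unique(Bt(:, 2:end), 'rows');
T = cell(1, size(uniqueSeqs, 1));
for g = 1 : numel(T)
    T{g} = Bt(groupIdx == g, :)';   % columns of B sharing sequence g
end
celldisp(T)
```

The groups come out in the same sorted order as above, and the approach does not depend on the number of columns in A.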

Asked: 10 Dec 2016
Commented: 11 Dec 2016