A (possibly compact) way to find (1) unique rows, when rows have switched elements, (2) rows with switched elements, and (3) indices of rows with switched elements

I have a 2-column matrix and I need to find the following quantities, ideally in a compact way:
  1. unique rows, even when the two elements of a row are switched ("u"). By "switched" I mean, for example, that the row "1 2" is equivalent to the row "2 1".
  2. rows with switched elements ("s").
  3. indices of rows with switched elements ("is").
Example:
% input: a 2-column matrix
a = [1 2
4 5
5 1
2 1
1 2
5 2
5 1
1 5
2 9
5 1]
% desired output (1): unique rows
u =
1 2
1 5
2 9
4 5
5 2
% desired output (2): rows with switched elements
s =
1 2 % equivalent to "2 1"
1 5 % equivalent to "5 1"
% desired output (3a): indices in matrix "a" indicating the row "1 2" (equivalent to "2 1")
is(1) =
1
4
5
% desired output (3b): indices in matrix "a" indicating the row "1 5" (equivalent to "5 1")
is(2) =
3
7
8
10
Any suggestions?
Obviously, the unique function treats rows with switched elements as different rows, so it does not provide these features on its own (right?):
u = unique(a,'rows')
1 2
1 5
2 1
2 9
4 5
5 1
5 2
Sim on 13 Jun 2022
Thanks for your comment @Jan!
Let me try to explain better, and sorry for my unclear explanations. :-)
First comment, related to "It is still unclear to me, why you pick [1 5] and not [5, 1] as result, when the input contains [5, 1; 1, 5]":
In my specific case I have a directed network. Here, the rows [1 5] and [5 1] refer to the same edge of the network, but in two opposite directions. If I have only the row [1 5] in my network, and not [5 1], it means that my "information" flows only in that direction, i.e. from node 1 to node 5, and not from node 5 to node 1.
Therefore, if I only have the row [5 2] in my matrix, and not the row [2 5], as in my previous example/comment:
a = [...
...
...
...
...
5 2
...
...
...
...
];
I would need to keep only the row [5 2] in my list of unique rows, which I called "u".
Instead, if I have rows with switched / flipped elements, such as [1 5] and [5 1]:
a = [...
...
5 1
...
...
...
5 1
1 5
...
5 1];
then the order of the elements does not matter in the list of unique rows "u": I can keep either [1 5] or [5 1], as long as the same row (either [1 5] or [5 1]) is also included in the array of rows with switched elements, "s", and in the array of indices of rows with switched elements, "is".
Second comment related to "why [2, 9] appears before [4, 5] in the output."
If we observe the array of unique rows "u"
u =
1 2
1 5
2 9
4 5
5 2
the first column is in ascending order (1, 1, 2, 4, 5); therefore the row [2 9] appears before the row [4 5], since "2" comes before "4".
I hope this clarifies the thread a bit. Is it clearer now?
:-)
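
The rule described in this comment (a row counts as "switched" only when its reversed counterpart also occurs somewhere in the matrix) could be checked with a short sketch like the one below. It is only an illustration under two assumptions: the matrix has exactly two columns, and there are no self-loops such as [3 3], which would be flagged as their own reverse. The name hasReverse is introduced here and is not part of the thread.

```matlab
% Example matrix from the question
a = [1 2; 4 5; 5 1; 2 1; 1 2; 5 2; 5 1; 1 5; 2 9; 5 1];

% For each row of "a", check whether the flipped row also occurs in "a".
% fliplr swaps the two columns; ismember(...,'rows') compares whole rows.
hasReverse = ismember(fliplr(a), a, 'rows');

% Rows that have no reversed counterpart keep their direction,
% e.g. [4 5], [5 2] and [2 9] in this example.
oneWayRows = a(~hasReverse, :);
```

With the example matrix, hasReverse is true for rows 1, 3, 4, 5, 7, 8 and 10, which are exactly the rows listed in the desired outputs (3a) and (3b).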


Accepted Answer

Sim on 9 Jun 2022
Edited: Sim on 9 Jun 2022
This is my solution. I have also created a corresponding File Exchange submission, in case you are interested!
a = [1 2
4 5
5 1
2 1
1 2
5 2
5 1
1 5
2 9
5 1]
% (1) get the unique rows in "a" (obviously, here, the rows with switched elements are detected as different rows)
u = unique(a,'rows');
% (2) get the rows with switched elements
j = 1;
s = [];
for i = 1 : size(u,1)
    idx = [];
    [~,idx] = ismember(u,u(i,[2 1]),'rows');
    if ~all(idx==0) % if the row of "u" with switched elements is present inside "u"
        if isempty(s) % if the "s" array is empty
            s(j,:) = u(idx~=0,:);
            j = j + 1;
        else % if the "s" array is not empty
            if ~ismember(u(idx~=0,:),s,'rows') && ~ismember(fliplr(u(idx~=0,:)),s,'rows') % if the row of "u" is not already present in "s" (also with switched elements)
                s(j,:) = u(idx~=0,:);
                j = j + 1;
            end
        end
    end
end
% (3) remove the rows in "u" with switched elements, and get the unique
% elements of "a", even with switched elements
idx = [];
[~,idx] = ismember(s,u,'rows');
u(idx,:) = [];
% (4) find rows with switched elements inside "a"
for i = 1 : size(s,1)
    idx1 = [];
    idx2 = [];
    [~,idx1] = ismember(a,s(i,[1 2]),'rows');
    [~,idx2] = ismember(a,s(i,[2 1]),'rows');
    is{i} = find(idx1 + idx2);
end
And these are the results:
% In the command window, display the following:
% (1) the unique array, "u",
% (2) the rows with switched elements, "s", and
% (3) the indices of those rows with switched elements, "is".
>> u
u =
1 2
1 5
2 9
4 5
5 2
>> s
s =
2 1
5 1
>> is{1}
ans =
1
4
5
>> is{2}
ans =
3
7
8
10
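
For comparison, the same three outputs could probably be obtained without explicit loops. The following is a sketch rather than a tested replacement for the code above: it assumes a 2-column matrix without self-loops (rows like [3 3]), it reports each switched pair in sorted form ([1 2] rather than [2 1]), and the helper names key, g, hasRev, sw and first are introduced here for illustration only.

```matlab
% Example matrix from the question
a = [1 2; 4 5; 5 1; 2 1; 1 2; 5 2; 5 1; 1 5; 2 9; 5 1];

% Group the rows of "a" while ignoring the order of the two elements:
% key holds one sorted representative per group, g the group of each row.
[key, ~, g] = unique(sort(a, 2), 'rows');

% A row is "switched" if its flipped version also occurs in "a".
hasRev = ismember(fliplr(a), a, 'rows');

% A group contains switched rows if any of its members has a reverse.
sw = accumarray(g, double(hasRev)) > 0;

% (2) rows with switched elements, in sorted form
s = key(sw, :);

% (3) indices in "a" of the rows belonging to each switched group
is = arrayfun(@(k) find(g == k), find(sw), 'UniformOutput', false);

% (1) unique rows: keep the original direction for one-way rows and
% use the sorted representative for switched groups
[~, first] = unique(g, 'first');
u = a(first, :);
u(sw, :) = key(sw, :);
u = sortrows(u);
```

On the example matrix this gives u = [1 2; 1 5; 2 9; 4 5; 5 2], s = [1 2; 1 5], is{1} = [1; 4; 5] and is{2} = [3; 7; 8; 10], i.e. the desired outputs from the question. Sorting the switched rows is a design choice: the thread states that either orientation is acceptable for those rows.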

More Answers (2)

Mitch Lautigar on 9 Jun 2022
There isn't a compact way to do this. You will need to build a for loop for each of the conditions you are trying to check. I've put some code below to help you get started.
% unique rows
a1 = a; % copy of the input, built in for you to compare after the coding is done
a_temp = []; % this is going to be the "stack" array
for ii = 1:size(a,1) % size(a,1) counts rows; length(a) would return the largest dimension
    a_temp = [a_temp; sort(a(ii,:))]; % sort all values for each individual row
end
unique_a = unique(a_temp,'rows'); % gives you the unique rows
% duplicate rows & indices
[x,y] = size(unique_a); % I use this here because this is safer than using length here
for ii = 1:x % use this for loop to index through the unique array
    for jj = 1:size(a_temp,1) % use this for loop to index through the a_temp array
        % the goal for you here is to index through the a_temp array, finding where the arrays are equivalent.
        % the 2 hints I'll give you: you can convert each array to string and use strcmpi(), or take each row from the unique array
        % and compare it against all the rows of the a_temp array
    end
end
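
One way to follow the second hint (comparing each unique row against all rows of a_temp) is sketched below. This is an illustration, not the answerer's code: it replaces the row-by-row sorting loop with sort(a,2), uses ismember with the 'rows' option in place of the inner loop, and introduces the cell array idx as a hypothetical name for the result.

```matlab
% Example matrix from the question
a = [1 2; 4 5; 5 1; 2 1; 1 2; 5 2; 5 1; 1 5; 2 9; 5 1];

% Sort the elements within each row, so [2 1] and [1 2] become identical
a_temp = sort(a, 2);
unique_a = unique(a_temp, 'rows');

% For each unique sorted row, collect the indices of the matching rows
idx = cell(size(unique_a, 1), 1);
for ii = 1:size(unique_a, 1)
    idx{ii} = find(ismember(a_temp, unique_a(ii, :), 'rows'));
end
```

Note that this groups rows regardless of direction, so it does not by itself tell apart a row that is merely repeated from a row that also occurs with switched elements; that still needs a separate check against the original matrix a.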


Jan on 10 Jun 2022
Edited: Jan on 13 Jun 2022
Although I'm still not sure about the order of outputs, this is how I would solve the problem:
n = size(a, 1);
ua = zeros(size(a));
k = 0;
found = false(n, 1);
match = false(n, 1);
is1 = cell(n, 1);
is2 = cell(n, 1);
s = zeros(size(a));
is = 0;
for ia = 1:n
    if ~found(ia)
        a1 = a(ia, 1);
        a2 = a(ia, 2);
        match(:) = false;
        switched = false;
        for ib = ia + 1:n
            if a(ib, 1) == a1 && a(ib, 2) == a2
                match(ib) = true;
            elseif a(ib, 2) == a1 && a(ib, 1) == a2
                match(ib) = true;
                if ~switched % [EDITED] collect s: First switched row
                    is = is + 1;
                    s(is, 1) = a2;
                    s(is, 2) = a1;
                end
                switched = true;
            end
        end
        k = k + 1;
        if any(match) % Row exists twice
            match(ia) = true; % EDITED: ia added
            if switched
                ua(k, 1) = min(a1, a2);
                ua(k, 2) = max(a1, a2);
                is2{k} = find(match);
            else
                ua(k, :) = a(ia, :);
                is1{k} = find(match);
            end
            found = found | match; % Already checked, to be ignored later
        else % Unique row:
            ua(k, 1) = a1;
            ua(k, 2) = a2;
        end
    end
end
ua = ua(1:k, :); % Crop unused output
s = s(1:is, :); % Crop unused output
% Maybe: is1 = is1(~cellfun('isempty', is1));
% Maybe: is2 = is2(~cellfun('isempty', is2));
I assume this is not a perfect solution for you, but maybe it is a starting point for an alternative approach, which is possibly faster.
Sim on 14 Jun 2022
Edited: Sim on 14 Jun 2022
Thanks a lot @Jan, very kind!
I get this output from your edited code ("is1" is empty, and all the output related to the indices is contained in "is2", but that is fine with me, thanks! :-) ):
% display results
ua
s
is1
is2{1}
is2{2}
ua =
1 2
4 5
1 5
5 2
2 9
s =
2 1
1 5
is1 =
0×1 empty cell array
ans =
1
4
5
ans =
3
7
8
10
