Efficient way to identify duplicate edges.

Hello,
I am having trouble coming up with an efficient method to identify duplicate edges in a given edge list.
Assume we have the following edge list:
SampleEdgeList=[9 8;1 3;4 6;7 3;2 4;3 1;]
The first column holds the starting nodes and the second column holds the corresponding ending nodes. I am working with undirected edges, so the second row [1 3] and the sixth row [3 1] mean the same thing. As a result, I have duplicate edges connecting node 1 and node 3.
My EdgeList contains millions of edges, so I would like to avoid for-loops and find the most efficient way to identify those duplicate edges.
Ultimately, I would like to regenerate one end node of each duplicate edge, so that no duplicate edges remain.
Thanks for the help in advance!
Louis

Accepted Answer

the cyclist on 8 Aug 2013
Edited: the cyclist on 8 Aug 2013
uniqueEdgeList = unique(sort(SampleEdgeList,2),'rows')
You can also use the second and third outputs of the unique() function to know which of the original rows map to the unique rows, and vice versa. See
>> doc unique
for details.
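As a minimal sketch of how those extra outputs could be used to locate the duplicates (rather than only remove them), the snippet below flags the repeated rows of the sample edge list. The variable names `sortedEdges`, `edgeCounts`, `isRepeated`, `isExtraCopy`, and `duplicateRows` are illustrative, and the sketch assumes a release (R2013a or later) in which unique() returns the index of the first occurrence by default.

```matlab
% Sample edge list from the question (undirected edges).
SampleEdgeList = [9 8; 1 3; 4 6; 7 3; 2 4; 3 1];

% Put each edge in canonical order (smaller node first) so [3 1] matches [1 3].
sortedEdges = sort(SampleEdgeList, 2);

% ia: row of sortedEdges picked for each unique edge (first occurrence assumed).
% ic: for every original row, the index of its unique edge.
[uniqueEdgeList, ia, ic] = unique(sortedEdges, 'rows');

% Count how often each unique edge occurs, then flag rows of repeated edges.
edgeCounts = accumarray(ic, 1);
isRepeated = edgeCounts(ic) > 1;      % true for every copy of a repeated edge

% Flag only the extra copies (2nd, 3rd, ...) of each repeated edge.
isExtraCopy = true(size(ic));
isExtraCopy(ia) = false;
duplicateRows = find(isExtraCopy);    % rows to remove or regenerate
```

For the sample list, find(isRepeated) gives rows 2 and 6, and duplicateRows is 6, so only the sixth row would need a new end node. Everything here is vectorized, which should suit an edge list with millions of rows.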
  4 Comments
TR RAO on 18 Jan 2018
fileID = fopen('C:\Users\TR RAO\Desktop\rao1.txt','r');
C = textscan(fileID, '%s %s');
fclose(fileID);
d1 = cellstr(C{1,1});
d2 = cellstr(C{1,2});
G = graph(d1,d2);
A = adjacency(G)
full(A);
But this is not working for duplicate edges.
Steven Lord on 18 Jan 2018
TR RAO, see Christine Tobler's comment on Anurag's answer.


More Answers (1)

Anurag Passi on 20 Apr 2016
I have the same question. However, my edges are text stored in a cell array. For example:
'A0A023PXA5' 'O23144'
'A0A023PXP4' 'O23171'
'A0A023PYF7' 'O23171'
and sort (DIM and MODE) do not work on cell arrays. I have also tried to build source and target as separate cell arrays to use
G= graph(s,t)
But I am still getting the same error:
Error using matlab.internal.graph.MLGraph
Duplicate edges not supported.
Error in matlab.internal.graph.constructFromEdgeList (line 125)
G = underlyingCtor(double(s), double(t), totalNodes);
Error in matlab.internal.graph.constructFromTable (line 40)
What am I doing wrong?
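One possible workaround, sketched here under assumptions rather than taken from the thread: since sort along rows is not available for cell arrays of text, the node names can first be mapped to integer IDs with unique(), and the numeric approach from the accepted answer can then be applied to the IDs. The lists `s` and `t` below are hypothetical, modeled on the example above with one reversed duplicate added for illustration; `nodeNames`, `nodeIdx`, and `edgeIdx` are illustrative names.

```matlab
% Hypothetical source/target lists modeled on the example above,
% with one reversed duplicate (O23171 - A0A023PXP4) added for illustration.
s = {'A0A023PXA5'; 'A0A023PXP4'; 'A0A023PYF7'; 'O23171'};
t = {'O23144'; 'O23171'; 'O23171'; 'A0A023PXP4'};

% Map every node name to an integer ID. unique() works on cellstr
% as long as the 'rows' option is not used.
[nodeNames, ~, nodeIdx] = unique([s(:); t(:)]);
edgeIdx = reshape(nodeIdx, [], 2);    % column 1 = sources, column 2 = targets

% Deduplicate numerically: sort within rows, then keep unique rows.
edgeIdx = unique(sort(edgeIdx, 2), 'rows');

% Build the graph from the deduplicated edges, using the names again.
G = graph(nodeNames(edgeIdx(:, 1)), nodeNames(edgeIdx(:, 2)));
```

With these inputs the four listed edges reduce to three, and graph() no longer sees a repeated pair.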
  2 Comments
Christine Tobler on 2 Nov 2016
I realize this is late, but I'll add it in case it's still helpful: sort and unique along the rows are not supported for cell arrays of character vectors (cellstr), but they are supported on the new string class. So you can do the following:
edgesString = string([s(:), t(:)]);
edgesUnique = cellstr(unique(edgesString, 'rows'))
g = graph(edgesUnique(:, 1), edgesUnique(:, 2));
It's not very nice-looking, but it should work.
TR RAO on 19 Jan 2018
My data is (1,2) (1,3) (1,4) (3,1) (2,1).
edgesUnique = cellstr(unique(edgesString, 'rows'))
is working. But
g = graph(edgesUnique(:, 1), edgesUnique(:, 2));
is not working for this data.
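A likely explanation, offered as a sketch rather than a confirmed diagnosis: unique(..., 'rows') only removes rows that are identical, so (1,3) and (3,1) both survive and graph() still sees a duplicate undirected edge. Sorting within each row first, as in the accepted answer, gives reversed pairs the same form. The variable name `edges` is illustrative, and the data is assumed to be available as a numeric two-column array.

```matlab
% Edge data from the comment above, as a numeric two-column list.
edges = [1 2; 1 3; 1 4; 3 1; 2 1];

% unique(..., 'rows') alone treats [1 3] and [3 1] as different rows,
% so sort within each row first to give reversed pairs the same form.
edgesUnique = unique(sort(edges, 2), 'rows');

% The remaining list has no repeated undirected edges.
g = graph(edgesUnique(:, 1), edgesUnique(:, 2));
```

Here edgesUnique comes out as [1 2; 1 3; 1 4]. For text edges held in a string array, the same idea would be to apply sort(edgesString, 2) before unique. As far as I know, R2018a and later also allow graph() to hold repeated edges and provide simplify() to collapse them, but that depends on the release in use.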
