How to choose 60 unique coordinates ('A') from 600 coordinates('B'), nearest to 60 other coordinates('C')
3 views (last 30 days)
Show older comments
Hello,
I have a question. Imagine we have an array 'C' containing 60*2 elements and another vector 'B' containing 600*2 elements. Each of these 2 arrays represent 'x' and 'y' coordinates of our points. I wish to choose 60 nearest coordinates from 'B' in a way that these coordinates( array 'A') be unique and nearest to coordinates in 'C'.
I want to do this with highest speed because I have to implement it multiple times. Definitely using 'for loop' would increase complexity so I want to avoid it.
Any help would be appreciated.
Thank you
0 Comments
Answers (3)
John D'Errico
on 21 Oct 2024
Edited: John D'Errico
on 21 Oct 2024
As I pointed out to @Voss and @Star Strider, if the points in B mst be unique, then there is an issue. If you simply choose the closest points (where it might be best to use knnsearch anyway) then one of the points in B may be simultaneously closest to different points in C.
The simple answer is to test to see if there are any replicate points chosen, where point k from B is simultaneously closest to two (or more!) points in C. But then you need to decide which other point to add to the set. I would probably do this by a resampling scheme, where all of the points already chosen are effectively removed from B. Now find a new point, which is the next closest to the point from C which was impacted. The code would not be too complex, and it would be reasonably efficient.
I'll give an example. Here, I'll use a smaller set of points to make thigns easier to visualize, and also to make it more likely there will be a conflict, I have made the size of B smaller. Larger sets will of course reduce the probability of a failure.
C = rand(10,2);
B = rand(15,2);
[idxB,D] = knnsearch(B,C,K=1)
In this set, you can see that point # 1 from B was simultaneously closest to 4 different points in C. And point 13 was also duplicated in that list. We could test to see if there were any replicates simply enough. For example,
if numel(unique(idxB)) < size(C,1)
% stuff
end
But now there is an ambiguity. Point #1 from B was closest to point 7 in C. Do we now choose a second best point from B for points [2 3 8] in C? What if there is again an issue? Is there some optimal set which is "best" in some unspecified way? And this is where your problem specification fails, because it does not suggest how these issues must be addressed.
Will this be a frequent problem? SURE AS HELL YES! Consider this simulation.
failcount = 0;
for i = 1:100
C = rand(60,2);
B = rand(600,2);
[idxB,D] = knnsearch(B,C,k=1);
if numel(unique(idxB)) ~= size(C,1)
failcount = failcount + 1;
end
end
failcount
failcount =
99
So out of 100 random sets generated, 99 of them had a conflict. If uniqueness is an issue, then you WILL need to write code that worries about these conflicts.
Anyway, COULD you resolve all conflicts using a loop? Well, yes.
C = rand(60,2);
B = rand(600,2);
idxB = NaN(size(C,1),1);
Btemp = B;
for i = 1:size(C,1)
idxB(i) = knnsearch(Btemp,C(i,:),k=1);
Btemp(idxB(i),:) = inf;
end
numel(unique(idxB))
ans =
60
And clearly this will result in a unique set, that has no conflicts.
0 Comments
Voss
on 21 Oct 2024
Edited: Voss
on 21 Oct 2024
C = rand(60,2);
B = rand(600,2);
[~,idx] = min(sum((permute(C,[1 3 2])-permute(B,[3 1 2])).^2,3),[],2);
A = B(idx,:);
figure
hold on
plot(B(:,1),B(:,2),'.b')
plot(C(:,1),C(:,2),'xr')
plot(A(:,1),A(:,2),'og')
A contains the nearest point in B to each point in C. This doesn't guarantee unique rows of A.
1 Comment
John D'Errico
on 21 Oct 2024
And because it does not insure uniqueness, it fails to solve the problem posed.
Star Strider
on 21 Oct 2024
C = rand(60,2);
B = rand(600,2);
[A,I] = pdist2(B,C,'euclidean','smallest',1)
figure
scatter(C(:,1), C(:,2), '.b', 'DisplayName','C')
hold on
scatter(B(:,1), B(:,2), '.g', 'DisplayName','B')
scatter(B(I,1), B(I,2), 'or', 'DisplayName','Closest ‘B’ To ‘C’')
hold off
axis('equal')
legend('Location','northoutside', 'Orientation','horizontal')
.
3 Comments
Star Strider
on 21 Oct 2024
To ‘guarantee uniqueness’ implies to me looping through all the points in both matrices in order to determine if any points in ‘C’ are equidistant from more than one point in ‘B’. It would certainly be possible to do that, however it would require a loop and a significant amount of time.
Another option would be to return two distances for each ‘C’ point to determine if any distances are the same. If none of them are, that is likeley as close as you can get to ‘guarantee uniqueness’.
Asking the find function to return the indices of any columns where the distances are the same, in this instance returns an empty vector. So the top row of ‘I’ returns the indices of the unique rows of ‘C’ that are closest to elements of ‘B’, and the top row of ‘D’ returns their distances.
C = rand(60,2);
B = rand(600,2);
[D,I] = pdist2(B,C,'euclidean','smallest',2)
A = B(I(1,:),:);
% disp(A) % Optional
notunique = find(any(diff(D)==0,1))
Check = D(:,notunique)
figure
scatter(C(:,1), C(:,2), '.b', 'DisplayName','C')
hold on
scatter(B(:,1), B(:,2), '.g', 'DisplayName','B')
scatter(A(:,1), A(:,2), 'or', 'DisplayName','‘A’ Closest ‘B’ To ‘C’')
hold off
axis('equal')
legend('Location','northoutside', 'Orientation','horizontal')
figure
stem3(A(:,1), A(:,2), D(1,:), Marker='.')
grid on
xlabel('A(:,1)')
ylabel('A(:,2)')
zlabel('D')
Quod erat demonstrandum!
.
John D'Errico
on 21 Oct 2024
Edited: John D'Errico
on 21 Oct 2024
Seems simple enough.
C = rand(60,2);
B = rand(600,2);
idxB = NaN(size(C,1),1);
Btemp = B;
for i = 1:size(C,1)
idxB(i) = knnsearch(Btemp,C(i,:),k=1);
Btemp(idxB(i),:) = inf;
end
numel(unique(idxB))
Just use a tool like pdist2 (or knnsearch) to find the closest element of B to each element of C, one at a time. Then zap out the element of B that you just found, effectively removing it from the search.
But since uniqueness was stated to be important, AND since almost always, a call to pdist2 or knnsearch will almost always experience at least a few conflicts, you need to do something.
See Also
Categories
Find more on Multidimensional Arrays in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!