Fast calculation of distances between two large arrays

Dear MATLAB-Community,
I would like to calculate the distance between each row of M (1 113 486 x 2) and each row of N (1 960 000 x 2) and store the indices of the pairs whose distance is within a tolerance tol. Can someone help me do that efficiently? The code below would take about 90 weeks. I have also tried [~, ind] = ismembertol(M, N, tol), which gives me logical 1 for every pair, which does not make sense.
tol=0.5;
indM(size(M,1),1)=NaN;
indN(size(N,1),1)=NaN;
progressbar
for m=1:size(M,1)
    for n=1:size(N,1)
        if pdist2(M(m,1:2), N(n,1:2)) <= tol
            indM(m)=m;
            indN(n)=n;
        else
            indM(m)=NaN;
            indN(n)=NaN;
        end
    end
    progressbar(m/size(M,1))
end
Kind regards
Philipp
  2 Comments
Stephen23 on 24 Apr 2023
Edited: Stephen23 on 24 Apr 2023
"I have also tried [~, ind] = ismembertol(M, N, tol) which gives me logical 1 for every pair which does not make sense."
If you want to compare rows then you need to specify the ByRows option.
Also note that by default the input tol is scaled by the data magnitude: set DataScale to 1 if you want tol to act as an absolute tolerance.
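As a minimal sketch of the two options mentioned above, the call might look like the following. The small M and N here are made-up stand-in values, not the original arrays. Note that with ByRows, ismembertol applies the tolerance to each column separately, so this is not the same test as a Euclidean distance <= tol.

```matlab
% Made-up stand-in data, small enough to check by hand
M = [0 0; 1 1; 5 5];
N = [0.2 0.1; 5.4 5.3; 9 9];
tol = 0.5;

% 'ByRows' compares whole rows; 'DataScale',1 makes tol an absolute tolerance
[tf, loc] = ismembertol(M, N, tol, 'ByRows', true, 'DataScale', 1);

% tf(i)  is true if row i of M is within tol of some row of N (per column)
% loc(i) is the index of a matching row in N, or 0 if there is none
disp([tf loc])
```

If all matching rows are needed rather than one per row of M, the 'OutputAllIndices' option returns loc as a cell array.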
PA on 24 Apr 2023
Thanks for the answer. Yes, I should have considered this. However, even with these options it still does not do what I expect. Is there another way?


Accepted Answer

Chris on 24 Apr 2023
Edited: Chris on 24 Apr 2023
This should be a little bit quicker (my computer indicates ten hours).
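tol = 0.5;
M = rand(1113486,2);        % stand-in data
N = rand(1960000,2);
inds = cell(size(N,1),1);   % inds{idx}: rows of M within tol of N(idx,:)
for idx = 1:size(N,1)
    % One vectorized call: distances from every row of M to one row of N
    isClose = pdist2(M,N(idx,:)) <= tol;   % renamed to avoid shadowing the built-in close
    inds{idx} = find(isClose);
end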
This would be a good candidate for GPU operations, if you have one.
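if canUseGPU
    tol = 0.5;
    M = gpuArray(M);            % move both point sets to the GPU
    N = gpuArray(N);
    inds = cell(size(N,1),1);
    for idx = 1:size(N,1)
        isClose = pdist2(M,N(idx,:)) <= tol;   % renamed to avoid shadowing the built-in close
        inds{idx} = find(isClose);             % indices remain gpuArray values
    end
end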
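For comparison, here is a sketch of a different technique, not the one used in this answer: rangesearch (from the same Statistics and Machine Learning Toolbox as pdist2) returns the same kind of per-row index lists in a single call. The toy data is made up, and whether this is faster on the full-size arrays has not been tested here.

```matlab
% Toy stand-in data (made-up values), small enough to inspect by hand
M = [0 0; 1 0; 3 3];
N = [0.3 0; 3 3.4; 10 10];
tol = 0.5;

% rangesearch(X, Y, r): for each row of Y, the rows of X within distance r
indsRS = rangesearch(M, N, tol);

% The per-row pdist2 loop from above, for comparison
indsLoop = cell(size(N,1),1);
for idx = 1:size(N,1)
    indsLoop{idx} = find(pdist2(M, N(idx,:)) <= tol);
end

% rangesearch returns row vectors ordered by distance, so sort and reshape
% before comparing with the column vectors produced by find
for idx = 1:size(N,1)
    assert(isequal(sort(indsRS{idx}(:)), indsLoop{idx}(:)));
end
```

Both versions store one index list per row of N, so the memory needed for the result is the same either way.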
If your tolerance is loose relative to the density of your points, that is, if you have a lot of distances <= tol, you may run into memory issues. In that case, inds should be a tall array.
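One way to gauge the memory issue before committing to the full run is to count matches for a small random sample of rows of N and extrapolate. This is only a rough sketch: the sample size nSample is an arbitrary assumption, and the rand data is the same stand-in as above, not the real point sets.

```matlab
% Stand-in data with the same shapes as above
rng(0);                       % fixed seed so the estimate is repeatable
tol = 0.5;
M = rand(1113486,2);
N = rand(1960000,2);

% Count matches for a small random sample of rows of N
nSample = 100;                % arbitrary sample size (assumption)
sampleRows = randperm(size(N,1), nSample);
counts = zeros(nSample,1);
for k = 1:nSample
    counts(k) = nnz(pdist2(M, N(sampleRows(k),:)) <= tol);
end

% Extrapolate to all rows of N; find returns doubles, 8 bytes per index
estPairs = mean(counts) * size(N,1);
estGB = estPairs * 8 / 2^30;
fprintf('Estimated matches: %.3g (about %.3g GB of indices)\n', estPairs, estGB);
```

If the estimate is far beyond the available RAM, storing every index in inds is not practical, and tightening tol or keeping only a count per row may be a better fit.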

More Answers (0)
