How to find elements in an array faster / without using for loop?

Hi,
I have the following working code with a for loop but I want to make the process faster. For the sizes of arrays I use this process now takes up to 30 seconds.
The code:
neighbour is a X by 2 array with integers only (for example 65000 x 2)
squares is a Y by 4 array with integers only (for example 35000 x 4)
B = zeros(squares,1); %the preallocation I tried - not much helpful, minimal time saving
for i = 1:length(neighbour) % for loop going though values from 1 to length of 'neighbour' array ~ for example 1:65000
B = any(squares == (neighbour(i,1)),2) & any(squares == (neighbour(i,2)),2);
% this finds indicies of lines in 'squares' where there are both values from 'i'th row of 'neighbour' array
end
If not clear from the code what I want to do is:
I want to go though the 'neighbour' array row by row and obtain the indicies of lines in 'square' array which contain the values as in that row in neighbour array.
Example:
if the neighbour array had only 1 row with
[1 2]
in it, and the square array looked like this:
[ 4 58 6 7;
1 2 47 48;
84 12 8 9],
then the output should be the index of the line in square array which contains both numbers i.e.
2
I have tried preallocation but the time it saves is marginal. Do you have any ideas on how to make this faster, ideally without a for loop?
Many thanks,
Jan

 Accepted Answer

In the example you provided you aren't actually storing any of the indices. It is also important to consider that you will get results where more than one neighbour matches a square, or none match a square at all - you're going to need to store the indices in a cell array. To go through some different approaches, lets first generate some similar data:
neighbour = randi(1000,65000,2);
squares = randi(1000,35000,4);
B = cell(height(squares),1);
I've done three approaches to the problem the first one being based on the example you provided.
Approach 1 based on the example you provided:
tic
for i = 1:length(neighbour)
B{i} = find( ...
any(squares == (neighbour(i,1)),2) &...
any(squares == (neighbour(i,2)),2)...
);
end
toc
% Elapsed time is 33.780513 seconds. (Mathworks Server)
% Elapsed time is 21.165682 seconds. (My PC)
It's taking about 21 seconds on my computer, we can get some improvent with approach 2.
Approach 2 Instead of using &, it's faster to index into squares with the first logical expression. In this way, you're only scanning a portion of squares for neighbour(i,2) instead of the whole array, that is a significant improvement. In a sense, this is the vector equivalent of logical short-circuiting.
B = cell(height(squares),1);
tic
for i = 1:length(neighbour)
idx = find(any(squares == (neighbour(i,1)),2));
B{i} = idx(any(squares(idx,:) == (neighbour(i,2)),2));
end
toc
% Elapsed time is 21.825993 seconds. (Mathworks Server)
% Elapsed time is 12.312727 seconds. (My PC)
Approach 3 It turns out that any(someArray,1), is faster than any(someArray,2), which isn't surpising as MATLAB is column major-order. With some modification we can get another improvement.
B = cell(height(squares),1);
tic
squares = squares.';
for i = 1:length(neighbour)
idx = find(any(squares == neighbour(i,1),1));
B{i} = idx(any(squares(:, idx) == (neighbour(i,2)),1)).';
end
toc
%Elapsed time is 11.630071 seconds. (Mathworks Server)
%Elapsed time is 6.896441 seconds. (My PC)
So for me that was about a 3x improvement, and it does about a 3x improvement on MathWorks servers as well.
Edit: find needed to be used on the first logical expression in approaches 2 and 3.

4 Comments

Hi Turlough,
Firstly, thank you for your answer - it is definitely very helpful. Although I think I understand what approaches 2 & 3 do, I am not 100% sure I understand their output.
Approach 1 outpus the indicies of the rows in square array where there are the common values with neighbour array. Exactly as I have shown in the example section of my question above.
Approach 2 (&3) does however only output combinations of values 1,2,3 and 4.
Could you please help me on what are those value? What do they represent in terms of the 'sqaure' array?
Many Thanks,
Jan
I believe I know what the numbers mentioned above mean. They are the indicies of the already 'filtered' rows (those in which the first value from 'neighbour' appears) in which the second value from 'neighbour' appears.
I don't know however how to obtain the index of rows in the original 'un-filtered' square array
Jan, that was a mistake, I should have used find on the first logical expression. Approaches 2 and 3 now match up with approach 1. I've retimed everything in the edit above.
Thank you for that! I really appreciate your help.
Jan

Sign in to comment.

More Answers (1)

Hi Jan,
The 'ismember' function should be able to do this!
Christopher

5 Comments

Hi, thank you for your response. I was looking at 'ismember' function earlier.. do you think it will speed up the process? can it somehow be implemented without using a for loop? Jan
Hi Jan,
I’m on mobile here so please excuse the formatting. I suspect that ismember should be significantly quicker as it will us matrix based operations rather than element by element.
Try; [Lia,Locb] = ismember(Neighbour,Square,rows)
This should return which elements are present and their location. Adding in ‘rows’ will ensure that the entirety of Neighbour is used and not just one element.
Also for speed comparison you can use
tic
%code you want to time here
toc
And matlab will return a runtime in the command window. Hope this helps, Christopher
Hi Christopher,
no worries about the formatting. 'ismember' looks very promissing, but it seems that 'rows' option requires both arrays to have the same number of columns to work. Any other idea on how to both check that the complete row from array 'neighbour' is present (and where) in 'square' array?
Thanks, Jan
Hi Jan,
The fact that your data isn't the same size is a little problematic! I would suggest zero padding but I think this will cause more problems as neighbour would probably never be found in the larger matrix (square).
One option might be to use reshape() to reshape square to the same size as neighbour. I think re-arrainging neigbhour and using [Lia,Locb] = ismember(Neighbour,Square,’rows’) will probably be your best bet.
Let me know how you get on,
Christopher
Hi Christopher,
I know, the size difference is not making this easier at all. The issue is that each row of the 'square' array represents 4 verticies of a square. That's why I need to keep it in this format without re-arranging.
The aim of these lines of the code is basically to find which 'walls' (rows of the neighbour array) correspond to which 'sqaure' (rows of the square matrix) by matching the values.
Thanks,
Jan

Sign in to comment.

Categories

Products

Release

R2021b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!