Indexing a range of a 3d matrix incredibly slow?

13 views (last 30 days)
I have a function that does several different operations between a source and destination matrix (each 3d), generally inserting a small source into a target area of a large destination. I need to run this function thousands-tens of thousands of times, so speed is valuable. However, when profiling I have found that virtuall all of the runtime is on the line that performs the final operation, where there's a lot of range indexing in 3d. example:
dest(dx,dy,dz) = source(sx,sy,sz) + dest(dx,dy,dz);
where all of di and si are ranges. That line alone takes ~99% of the runtime per call (taking ~6 seconds for 100 test runs). Is this sort of indexing operation inherently super slow? I find the slow speed odd, especially when binarizing each of those same 3d regions is ~30 times faster.
Is there a huge inefficiency i'm not seeing, or some trick to make this process faster?
EDIT: the rest of the code, where the ranges are generated
[d1, d2, d3]=size(dest);
[s1, s2, s3]=size(source);
%dest
dx=max(1,coord(1)):min(d1,coord(1)+s1-1);
dy=max(1,coord(2)):min(d2,coord(2)+s2-1);
dz=max(1,coord(3)):min(d3,coord(3)+s3-1);
%source
sx=max(-coord(1)+2,1):min(d1-coord(1)+1,s1);
sy=max(-coord(2)+2,1):min(d2-coord(2)+1,s2);
sz=max(-coord(3)+2,1):min(d3-coord(3)+1,s3);
note that the ranges are generated in order to find the overlap between the arrays, given some location coord relative to the destination array, that is still fully inside both arrays to avoid out-of-bounds problems.
  3 Comments
KSSV
KSSV on 13 Jun 2022
Show us your code snippet, how you are indexing.
Carson Purnell
Carson Purnell on 13 Jun 2022
binarizing those volumes, as in
dbin = imbinarize(dest(dx,dy,dz)); sbin = imbinarize(source(sx,sy,sz));
are the same ranges and that line runs dramatically faster than the slow sum operation.

Sign in to comment.

Answers (1)

Jan
Jan on 13 Jun 2022
If you post the relevant part of the code and some matching input data, it is very likely, that this forum can provide some improvements. With the currently givendetails, the best answer I know is:
If the memory access is the bottleneck of your code, it is the bottleneck of you code.
dest(dx,dy,dz) = source(sx,sy,sz) + dest(dx,dy,dz)
This line creates a copy of source(sx,sy,sz) ate first, than dest(dx,dy,dz), adds them and inserts them in dest. Indexing with vectors requires a range-check for each element.
Are the ranges sx, sy, ... contiguous? Then this would be much faster: source(sx(1):sx(end), sy(1):sy(end), sz(1):sz(end)) etc.
  3 Comments
Jan
Jan on 13 Jun 2022
Edited: Jan on 13 Jun 2022
Try this:
[d1, d2, d3]=size(dest);
[s1, s2, s3]=size(source);
dxi = max(1, coord(1))
dxf = min(d1, coord(1)+s1-1);
dyi = max(1, coord(2));
dyf = min(d2, coord(2)+s2-1);
dzi = max(1, coord(3));
dzf = min(d3, coord(3)+s3-1);
%source
... the same here...
dest(dxi:dxf, dyi:dyf, dzi:dzf) = ... same for source and dest
This saves the times for producing the index vectors. I do not expect that this causes a dramatic speedup.
It would be useful to have a minimal working example, which includes the bottleneck of your code. It is less smart, if I invent such a piece of code, because my guesses of the details might be misleading. But let me start:
dest = rand(1000, 1000, 281);
source = rand(4000, 2000, 312);
cord = [17, 81, 90];
Something like that?!?
Carson Purnell
Carson Purnell on 13 Jun 2022
I have been using these for testing, on a for loop of 100.
src = rand(20,20,10);
dest = zeros(500,500,100);
coord = [1 2 3];

Sign in to comment.

Categories

Find more on Graphics Performance in Help Center and File Exchange

Products

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!