Should table Indexing be Faster?

11 views (last 30 days)
Paul
Paul on 15 Sep 2024
Commented: Paul on 18 Sep 2024
Example code.
t = combinations(0:10,0:10,0:10,0:10);
tic
for ii = 1:10
for jj = 1:height(t)
u = t{jj,:};
end
end
toc
Elapsed time is 18.111169 seconds.
tic
tcell = table2cell(t);
for ii = 1:10
for jj = 1:height(tcell)
u = [tcell{jj,:}];
end
end
toc
Elapsed time is 0.179943 seconds.
tic
tarr = table2array(t);
for ii = 1:10
for jj = 1:height(tarr)
u = tarr(jj,:);
end
end
toc
Elapsed time is 0.017899 seconds.
Any ideas why indexing into a table to extract data is so slow?
  1 Comment
Walter Roberson
Walter Roberson on 15 Sep 2024
By the way: we held a discussion of table indexing across rows, about two-ish years ago. I know that I contributed, and I seem to remember that Steven Lord contributed. The thread revived briefly a couple of months ago. Unfortunately I do not seem to be able to locate it at the moment.

Sign in to comment.

Accepted Answer

Matt J
Matt J on 16 Sep 2024
Edited: Matt J on 16 Sep 2024
I haven't profiled it, but I would bet that the following line, from @tabular/braceReference
b = t.extractData(varIndices);
is bottlenecking the row extraction operations. Effectively, this runs table2array(t) on the entire table t every time a braceReferencing operation is done.
That probably should be done differently, since as a result, the time for even the smallest row-extraction operation is a very strong function of the size of the table, see example below:
T = combinations(1:100,1:100,1:40,1:40);
t=T(1,:);
timeit(@() t{1,:})
ans = 1.4134e-04
timeit(@() T{1,:})
ans = 0.2923
  6 Comments
Matt J
Matt J on 18 Sep 2024
Edited: Matt J on 18 Sep 2024
From Tech Support:
Thank you for identifying a performance slowdown when extracting rows from a table. My colleagues are aware of the issue and are working on a fix. In the meantime, a workaround is to use parenthesis subscripting.
For example, currently you are extracting rows using the following syntax. Instead, extract rows using parentheses.
T = combinations(1:100,1:100,1:40,1:40);
t = T(1,:);
When I compared the elapsed time for both these syntaxes, the suggested workaround of parentheses was significantly faster than the original.
%Original example
tic, for i = 1:100, t1 = t{1,:}; end, toc
Elapsed time is 0.005140 seconds.
tic, for i = 1:100, t1 = T{1,:}; end; toc
Elapsed time is 4.458811 seconds.
% Workaround
tic, for i = 1:100, t1 = t(1,:); t1 = t1.Variables; end, toc
Elapsed time is 0.006460 seconds.
tic, for i = 1:100, t1 = T(1,:); t1 = t1.Variables; end; toc
Elapsed time is 0.005319 seconds.
Paul
Paul on 18 Sep 2024
Just want to point out that creating the temporary variable (t1 in this case) isn't necessary if the result is to be used directly, like as an input to a function
T = combinations(1:100,1:100,1:40,1:40);
T(1,:).Variables
ans = 1×4
1 1 1 1
<mw-icon class=""></mw-icon>
<mw-icon class=""></mw-icon>
Also, perhaps MathWorks should consider a near term update to use this exact workaround in rowfun when it's called with SeparateInputs=false

Sign in to comment.

More Answers (0)

Categories

Find more on Loops and Conditional Statements in Help Center and File Exchange

Products


Release

R2024b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!