writetable takes forever - what is faster?

64 views (last 30 days)
cmo
cmo on 13 Aug 2015
Commented: Walter Roberson on 2 Oct 2023
I have a table with ~500,000 lines and ~20 columns. The table is mixed text and numbers. It is about 90 Mb as a text file.
It takes matlab FOREVER to write the table to a text file via the "writetable" function. I'm talking ~30 minutes.
Clearly, this is totally unacceptable.
How can I speed this up?
Please note - there are many (> 20) columns, and the number of them may change according to my data. So please do not suggest any manual solutions like "fprintf('%s\t%f\t%f')" etc.
  4 Comments
Jiri Hajek
Jiri Hajek on 16 Feb 2021
Just an update for the developers. This thread is visited ttwice as much as three years ago, btw...
I'm, using R2020b and still I've seen the same poor performance of writetable / writetimetable. If the table I'm trying to write to an Excel sheet has non-negligible size (say 10^4 rows or more), these functions become totally unuseable. It is still much faster to write the data to CSV files, open Excel manually and put the sheets together manually.
carlos edurdo condori ochoa
You can create a variable format so it can be adapted before you write the data. From your first row, convert to string array, then analyze it to now how many columns you will have to write, then create the format using the string values into the format that you want. Then you only need to use fprintf(fid, fmt, TheString)

Sign in to comment.

Answers (2)

per isakson
per isakson on 14 Aug 2015
Edited: per isakson on 14 Aug 2015
I've made a simple test with R2013b, 64bit, Win7, local SSD, and a spinning HD.
Some results
  • elapse time for writing increases linearly with size of the table variable
  • writing speed is approx. 0.15 MB/sec. EDIT: "speed" refers to the size of the table variable.
  • writing speed is practically the same with the HD
  • elapse time for 90MB would be approx. 10 minutes. EDIT: "90MB" refers to the size of the table variable.
>> [et,mb] = cssm(1e2)
et =
0.1496
mb =
0.0325
>> [et,mb] = cssm(1e3)
et =
1.4222
mb =
0.2287
>> [et,mb] = cssm(1e4)
et =
14.2710
mb =
2.1907
where
function [et,mb] = cssm( N )
str( N, 9 ) = 'z';
for jj = 1 : N
str(jj,:) = sprintf( 'Row%06d', jj );
end
sas.Name = str;
%
for jj = 1 : 20
sas.(sprintf('F%02d',jj)) = rand(N,1);
sas.(sprintf('S%02d',jj)) = char( randi( double('AZ'), [N,1] ) );
end
T = struct2table( sas );
sz = whos('T');
mb = sz.bytes/1e6;
tic, writetable( T, 'c:\m\cssm\T1.txt' ), et(1)=toc;
end
AFAIK: There is no faster, still user-friendly, alternative to writetable.
  3 Comments
per isakson
per isakson on 14 Aug 2015
Edited: per isakson on 14 Aug 2015
Thanks! Yes, my fault. I edited my answer.
Walter Roberson
Walter Roberson on 2 Oct 2023
For whatever it is worth:
On R2023b on my 2020 intel-based iMac running MacOS Sonoma operating system, @per isakson cssm for function for 500000 entries takes less then 24 seconds writing to text file, and less then 300 seconds writing to xlsx
Note: writetable() on MacOS never talks to Excel to write xlsx files; writetable() on Windows has the option of talking to Excel to write xlsx files (I seem to recall in the 2015 time frame that the question was originally asked, that the default was still to talk to Excel)

Sign in to comment.


Jan
Jan on 14 Aug 2015
You want me not to suggest fprintf('%s\t%f\t%f'), but of course this is the most direct and fastest solution. You can create the format string automatically based on the type of the data. So why do you hesitate to call fprintf?
  2 Comments
cmo
cmo on 17 Aug 2015
The format is subject to change, as the table is liable to have varying number of columns (depending on input data).
Walter Roberson
Walter Roberson on 17 Aug 2015
fmt = ['%s', repmat('\t%f', 1, NumNumericColumns)];
fprintf(fid, fmt, TheString, TheNumericVector);

Sign in to comment.

Tags

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!