How can I store large amounts of data from a text file into a mat file efficiently?

I've been struggling to find the best way to read data stored in various text file formats and save it in the MAT file format.
For example, I have a text file containing data like that shown below. I'm working with a few different formats, so this is just a generic, simplified example.
foo1 = 1
foo2.x = 1, 2, 3, 4
foo2.y = 4, 3, 2, 1
foo3 = "dog"
I want to create a MAT file with the variables foo1, foo2, and foo3 stored in it. The text files have 30,000 or more variables in them.
I need to have the variables named the same as they are in the text files. I work with a lot of Simulink models and other scripts that expect to see the variable names as listed in the text file.
Currently, I:
  • read the whole text file using textread
  • parse the file and make strings with the variable name and values
  • use 'eval', 'evalin', or 'assignin' to create the variable in the workspace
  • build a cell array of strings listing the variable names (named varlist)
  • save the variables listed in varlist to a mat file
Unfortunately, my method relies on 'eval', 'evalin', or 'assignin'. I dislike using all three of these functions, but I can't think of another simple way of doing it.
Also, 'eval' and 'evalin' get very slow when creating variables in a function workspace (rather than the base workspace). For example, my function takes 30 seconds when using 'evalin('base',...)' to create the variables in the base workspace, but 900 seconds when using 'eval' to create the same variables in the function workspace. It pains me to be haphazardly creating variables in the base workspace, but waiting 30x longer for the script to complete just to avoid the base workspace isn't acceptable.
Another slow point in my function is saving the data to a MAT file. Using 'save(filename)' is pretty quick, but when I specify the list of variables I want to save, it slows dramatically. For example, 'save(filename, varlist{:})', where varlist is a cell array of strings listing the variables I want to save, is very slow.
I'm nearly at the point where I'm going to start writing the MAT files directly and skip creating the variables in the workspace altogether. Unfortunately, this seems like real overkill, especially when I have something that sort of works.
Does anyone have any thoughts on how I can do this better?

 Accepted Answer

Doc on save says:
save(filename, '-struct', structName, fieldNames) stores the fields of the specified scalar structure as individual variables in the file. If you include the optional fieldNames, the save function stores only the specified fields of the structure. You cannot specify variables and the '-struct' keyword in the same call to save.
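As a small illustration of the optional fieldNames argument quoted above, the following sketch saves only selected fields of a structure. The struct s, its field names, and the temporary file name are made up for the example and are not from the question.

```matlab
% Sketch: save only some fields of a scalar struct as individual variables.
% The struct "s" and its fields are hypothetical illustration data.
s.a = 1;
s.b = [1 2 3];
s.c = 'text';

matFile = [tempname '.mat'];          % temporary output file (assumption)
wanted  = {'a', 'c'};                 % plays the role of "varlist" in the question

% Field names are passed as separate arguments after the struct NAME (a string).
save(matFile, '-struct', 's', wanted{:});

% Inspect the file without loading it into the workspace.
info  = whos('-file', matFile);
names = {info.name};
disp(names)
```

Only the fields listed in wanted end up in the file; s.b is left out.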
I would try
MyStruct.( variable_name ).( field_name ) = value;
etc.
and
save( filespec, '-struct', 'MyStruct' )
Pro:
  • no eval, assignin, etc.
  • faster save - I assume
Con:
  • not tested
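The idea above could be sketched roughly as follows for the sample data in the question. This is only a minimal sketch under several assumptions: every line has the form "name = value", names have at most one level of nesting (such as foo2.x), values are either a double-quoted string or a comma-separated list of numbers, and the temporary file names are invented for the example. Real files with other formats would need their own parsing rules.

```matlab
% Write the sample data from the question to a temporary file (illustration only).
txtFile = [tempname '.txt'];
fid = fopen(txtFile, 'w');
fprintf(fid, 'foo1 = 1\nfoo2.x = 1, 2, 3, 4\nfoo2.y = 4, 3, 2, 1\nfoo3 = "dog"\n');
fclose(fid);

% Read the whole file and split it into lines.
rows = regexp(fileread(txtFile), '\r?\n', 'split');

S = struct();
for k = 1:numel(rows)
    row = rows{k};
    eqPos = find(row == '=', 1);          % position of the first '='
    if isempty(eqPos)
        continue;                         % skip blank or malformed lines
    end
    name      = strtrim(row(1:eqPos-1));
    valueText = strtrim(row(eqPos+1:end));

    % Assumed value rules: "..." is a string, anything else is numeric.
    if numel(valueText) >= 2 && valueText(1) == '"' && valueText(end) == '"'
        value = valueText(2:end-1);
    else
        value = sscanf(valueText, '%f,').';   % row vector of doubles
    end

    % Dynamic field names replace eval/evalin/assignin.
    parts = regexp(name, '\.', 'split');
    if numel(parts) == 1
        S.(parts{1}) = value;
    else
        S.(parts{1}).(parts{2}) = value;      % assumes at most two levels
    end
end

% Each top-level field of S becomes one variable in the MAT file.
matFile = [tempname '.mat'];
save(matFile, '-struct', 'S');
```

With this layout the variables never exist in any workspace; only the struct S does, and the MAT file still contains foo1, foo2, and foo3 under their original names.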
Example (from the online help)
clc, clear all
s1.a = 12.7;
s1.b = {'abc', [4 5; 6 7]};
s1.c = 'Hello!';
save('newstruct.mat', '-struct', 's1');
disp('Contents of newstruct.mat:')
whos('-file', 'newstruct.mat')
clear('s1')
load('newstruct.mat')
prints
Contents of newstruct.mat:
  Name      Size            Bytes  Class     Attributes

  a         1x1                 8  double
  b         1x2               262  cell
  c         1x6                12  char
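If the goal is also to avoid creating loose variables when reading the file back, load can return the contents as a struct instead. The sketch below reuses the s1 data from the example above; the temporary file name is an assumption made for illustration.

```matlab
% Sketch: round-trip a struct through a MAT file without touching the workspace.
s1.a = 12.7;
s1.b = {'abc', [4 5; 6 7]};
s1.c = 'Hello!';

matFile = [tempname '.mat'];      % temporary file name (assumption)
save(matFile, '-struct', 's1');   % fields a, b, c become separate variables

% With an output argument, load returns a struct with one field per variable.
S = load(matFile);
disp(fieldnames(S))
disp(isequal(S, s1))
```

Calling load without an output argument, as in the help example, creates a, b, and c directly in the current workspace instead.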

Asked on 27 Nov 2012
