Import function; not able to index in for loop. How to save variables as filenames?
Hello,
I have this script:
folder_name = uigetdir();
filenameExtension = '.txt';
for i = 1:13
filename = [folder_name, '\OD ', int2str(i), filenameExtension];
tmpData(i) = readtable(filename)
end
I want to import raw data from within a folder. All file names are OD 1.txt, OD 2.txt,... up to OD 13.txt.
The file content looks like this (905 rows):
OD 1,Time,Celsius(°C),Humidity(%rh),Dew Point(°C),Serial Number
1,2018-01-23 13:00:00,37.5,44.0,23.2,010201120
2,2018-01-23 13:05:00,36.0,48.5,23.4
3,2018-01-23 13:10:00,34.5,51.5,23.0
4,2018-01-23 13:15:00,35.0,51.0,23.3
5,2018-01-23 13:20:00,35.0,51.5,23.5
6,2018-01-23 13:25:00,33.5,55.0,23.2
7,2018-01-23 13:30:00,35.0,53.0,24.0
8,2018-01-23 13:35:00,36.5,47.5,23.5
9,2018-01-23 13:40:00,37.0,47.5,24.0
10,2018-01-23 13:45:00,36.5,47.0,23.4
11,2018-01-23 13:50:00,36.5,46.5,23.2
12,2018-01-23 13:55:00,36.0,49.5,23.8
13,2018-01-23 14:00:00,37.0,47.5,24.0
So this means it's actually CSV data inside a .txt file.
I want to import each file as a table, and I want the table variables in my workspace to be named after the original file names: OD1, OD2, etc.
My problem is in the for loop.
I need to use this format specifier (I guess) (& I can skip the serial number column):
formatSpec = '%f%{yyyy-MM-dd HH:mm:ss}D%f%f%f';
Can somebody help me on this one?
I get an error because I cannot index with (i). My loop reads all the data files as tables, but doesn't save them as individual variables in the workspace. How can I do this?
Thank you in advance!
Answers (1)
Stephen23
on 28 Jan 2018
Edited: Stephen23
on 28 Jan 2018
"My problem is in the for loop."
Yes, it is.
"... my loop reads all the datafiles as tables, but doesnt save them as individual variables in the workspace. How to do it?"
That is your problem.
The simplest, neatest, and most efficient answer to this question is "don't".
Magically creating or accessing variable names in a loop is exactly how beginners force themselves into writing slow, buggy, pointlessly complex code that is hard to debug. Read this to know more about why:
Actually the fastest, neatest, and most efficient way for you to import that data into the MATLAB workspace would be to use indexing and one array: either an ND numeric array if every file has the same number of rows and columns, or a cell array if they differ. Using indexing is trivially simple to understand, is extremely efficient (unlike what you are trying to do), and is simple to debug. I would recommend that you do that because the filenames that you gave indicate a sequence 1, 2, ..., and that is exactly what indexing efficiently encodes:
C = cell(1,numel(files));
for k = 1:numel(files)
C{k} = load(...);
end
And then access the data simply using indexing:
C{1} % OD1
C{2} % OD2
So simple.
So efficient!
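The loop above can be filled in as a minimal sketch using readtable. It also sidesteps the likely cause of the original error: tmpData(i) = readtable(...) uses a single subscript in parentheses, which tables do not support, whereas a cell array holds one whole table per cell. The temporary folder, the demo file contents, and the column names Idx, Time, and Celsius are assumptions made only so that the example is self-contained. The real files have a longer first data row and may need extra import options.

```matlab
% Hypothetical demo data: three small, regular CSV-style files in a temp folder.
D = tempname;                          % unique temporary folder name
mkdir(D);
N = 3;
for k = 1:N
    fid = fopen(fullfile(D, sprintf('OD %d.txt', k)), 'wt');
    fprintf(fid, 'Idx,Time,Celsius\n');
    fprintf(fid, '1,2018-01-23 13:00:00,%.1f\n', 30 + k);
    fprintf(fid, '2,2018-01-23 13:05:00,%.1f\n', 31 + k);
    fclose(fid);
end

% One cell array, one table per cell: C{k} holds the data of 'OD k.txt'.
C = cell(1, N);                        % preallocate
for k = 1:N
    filename = fullfile(D, sprintf('OD %d.txt', k));
    C{k} = readtable(filename, 'Delimiter', ',');
end

T2 = C{2};                             % the table read from 'OD 2.txt'
disp(T2.Celsius)
rmdir(D, 's');                         % remove the demo folder again
```

Building the file name from the loop index (rather than relying on the order returned by dir) keeps C{k} matched to 'OD k.txt' even when there are ten or more files.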
An alternative, which is less efficient than simple indexing (but still much more efficient than magically creating variable names), is to use the fields of a structure, e.g.:
S = struct();
for k = 1:numel(files)
fieldname = ... % generate fieldname from filename
S.(fieldname) = load(...)
end
And you can then access your data using
S.OD1
S.OD2
...etc
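As a minimal sketch of the structure approach, the field name can be derived from the file name with fileparts and strrep. The file names and the placeholder data below are assumptions for illustration only; in practice the right-hand side would be whatever the import function returns.

```matlab
names = {'OD 1.txt', 'OD 2.txt', 'OD 13.txt'};   % hypothetical file names
S = struct();
for k = 1:numel(names)
    [~, base] = fileparts(names{k});             % e.g. 'OD 1'
    fieldname = strrep(base, ' ', '');           % e.g. 'OD1' (a valid field name)
    S.(fieldname) = k * [1 2 3];                 % placeholder for the imported data
end
disp(fieldnames(S))
```

Dynamic field names must be valid MATLAB identifiers, which is why the space is removed before the name is used.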
I would recommend that you use indexing.
3 Comments
Stephen23
on 29 Jan 2018
Edited: Stephen23
on 29 Jan 2018
opt = {'Delimiter',','};
fmt = '%*d%s%f%*f%*f'; % keep 2nd and 3rd columns only.
D = uigetdir();
%D = '.'; % current directory.
S = dir(fullfile(D,'OD*.txt')); % get the file names.
N = numel(S); % count how many files there are.
C = cell(N,1); % preallocate the output cell array.
for k = 1:N % for each file...
    [fid,msg] = fopen(fullfile(D,S(k).name),'rt');
    assert(fid>=3,msg) % check that the file opened.
    hdr = fgetl(fid); % read header line and discard.
    one = textscan(fid,[fmt,'%*f'],1,opt{:}); % read first data line (it has an extra serial number field).
    two = textscan(fid,fmt,opt{:}); % read remaining data lines.
    C{k} = cellfun(@vertcat,one,two,'uni',0); % join data together.
    fclose(fid);
end
C = vertcat(C{:}); % create Nx2 cell (or however many columns you read).
R = cellfun('size',C(:,1),1); % rows of data for each file.
Y = cell2mat(arrayfun(@(n,r)n*ones(r,1),(1:N)',R,'uni',0));
X = vertcat(C{:,1});
Z = vertcat(C{:,2});
% createTEMPgraph(X,Y,Z)
This reads the timestamp and Celsius columns into one cell array C (exactly as you were advised). The contents of C are then simply concatenated together using vertcat. Notice how this is much simpler than that ugly script that you have been given, with all of those hard-coded variable names (ugh!). Note also that this code automatically scales to any number of input files (not a fixed number, like that badly written script): scaling easily to any number of files is one of the many advantages of using arrays.
Reading the file is a little bit fiddlier than usual because the first line of data is longer than the others. I got around this by simply reading that line separately (into variable one) and then joining the required data together with the rest of the lines (variable two).
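The file-index vector Y in the code above is built by repeating each file number once per row read from that file. A small sketch of that step follows, using assumed row counts of 5, 4, and 4. It also shows repelem, which should give the same result in releases that provide it; that function is offered here as an alternative and is not part of the original code.

```matlab
R = [5; 4; 4];                         % assumed number of rows read from each file
N = numel(R);

% Same construction as in the code above: one block of n's per file.
Y1 = cell2mat(arrayfun(@(n,r) n*ones(r,1), (1:N)', R, 'UniformOutput', false));

% Alternative: repeat element n of (1:N)' exactly R(n) times.
Y2 = repelem((1:N)', R);

disp(isequal(Y1, Y2))
```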
When run on the test files (attached to this comment) this code produces the following variables (which match exactly the values that are in the test files):
>> X
X =
'2018-01-23 13:00:00'
'2018-01-23 13:05:00'
'2018-01-23 13:10:00'
'2018-01-23 13:15:00'
'2018-01-23 13:20:00'
'2018-01-24 13:00:00'
'2018-01-24 13:05:00'
'2018-01-24 13:10:00'
'2018-01-24 13:15:00'
'2018-01-25 13:00:00'
'2018-01-25 13:05:00'
'2018-01-25 13:10:00'
'2018-01-25 13:15:00'
>> Y
Y =
1
1
1
1
1
2
2
2
2
3
3
3
3
>> Z
Z =
17.500
16.000
14.500
15.000
15.000
27.500
26.000
24.500
25.000
38.500
38.000
38.500
38.000
It would be easy to adapt this to using readtable and datetime objects, as you require.
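One possible adaptation, as a sketch only: convert the timestamp strings to datetime values and collect everything in a single table. The short X, Y, and Z vectors below are stand-ins for the variables produced by the code above, and the table variable names Time, File, and Celsius are assumptions. Reading the files directly with readtable is not shown, because the longer first data row may need additional import options.

```matlab
% Stand-ins for the X, Y, Z variables created by the import code.
X = {'2018-01-23 13:00:00'; '2018-01-23 13:05:00'; '2018-01-24 13:00:00'};
Y = [1; 1; 2];
Z = [17.5; 16.0; 27.5];

% Convert the timestamp text to datetime values.
T = datetime(X, 'InputFormat', 'yyyy-MM-dd HH:mm:ss');

% Collect everything in one table; the file index is data, not a variable name.
tbl = table(T, Y, Z, 'VariableNames', {'Time', 'File', 'Celsius'});

% Select the rows that came from the first file using logical indexing.
onlyFile1 = tbl(tbl.File == 1, :);
disp(onlyFile1)
```

Keeping the file number as a column means that any one file's data can be selected by indexing, without a separate workspace variable per file.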